1 Introduction

This paper shows that the legacy of history is particularly pervasive in Spain. We provide evidence that a historical process that ended more than five centuries ago, the Reconquest, is very important for explaining Spanish regional economic development. The so-called Reconquista is a milestone in Spanish history. For a period of almost eight hundred years that started in 711 with the invasion of the Iberian Peninsula by the Muslims, what is now mainland Spain experienced a process fairly akin to colonialism. Throughout this long period, and after an initial phase of mere resistance, the Christians located in the North gradually reconquered the Muslim lands and implemented measures to colonize the reclaimed territory. We argue that the rate or speed of the Reconquest, that is, whether the Christian frontier advanced rapidly or not, was a crucial factor affecting the type of colonization conducted in each territory and its corresponding initial political equilibrium. A fast rate of Reconquest is associated with imperfect colonization, characterized by an oligarchic political equilibrium, thus creating the conditions for an inegalitarian society with negative consequences for long-term economic development.

This paper is framed within a new stream of literature dealing with the long-term effects of frontier expansions. In a recent contribution, García-Jimeno and Robinson (2011) have proposed the “conditional frontier hypothesis” to explain the starkly contrasting outcomes derived from the frontier experiences in North America (Turner 1920) and Latin America (Hennessy 1978). According to this hypothesis, the consequences of the frontier depend on the initial political equilibrium existing in society at the time of the territorial expansion. In North America, where the prevailing social climate was relatively democratic and egalitarian, the frontier brought about individualism, self-government and aversion to social stratification, whereas in the more oligarchic societies of South America, the presence of a frontier reinforced economic and political inequality. Focusing on the historical border between Castile and the Nasrid Kingdom of Granada in southern Spain, Oto-Peralías and Romero-Ávila (2016) suggest that military insecurity is a factor that favors a political equilibrium biased toward the military elite in frontier regions, generating highly persistent differences in inequality.

This article introduces and tests the hypothesis that the political equilibrium among the colonizing agents may be endogenous to the scale of frontier expansion. This is because large territorial expansion allows the elite to play a dominant role in the process of colonizing the conquered lands. Applied to our case study, this became evident after the collapse of the Almohad Caliphate in 1212 following the Battle of Las Navas de Tolosa, which enabled the Christian armies to conquer vast swathes of territory in a short period of time. The outcome involved large frontier regions dominated by military orders and the nobility, with negative consequences for long-term development. In contrast, a slow frontier expansion was associated with a more balanced occupation of the territory and a more egalitarian social structure. This was so because smaller frontier regions favored the participation of individual settlers and the Crown in the repopulation, which would lead to better political institutions and a more equitable distribution of the land—as happened in the colonization of the Duero Valley, where settlers occupied land and obtained its ownership. As argued below, these initial differences in the patterns of distribution of economic and political power persisted over time, and led to divergent development paths across what are now the Spanish provinces.

In the empirical part of the paper, we create an indicator measuring the “rate of Reconquest”, which captures whether the Christian military conquests progressed rapidly or slowly when each province was reclaimed. We show that there is a robustly negative relationship between the rate of Reconquest and current per capita income across today’s Spanish provinces. This relationship does not simply reflect the fact that regions in the South are poorer, since the results survive the inclusion of latitude and many other geographic, topographic and climatic controls. The effect remains statistically significant when the regression analysis is extended to the level of municipality, even after controlling for province fixed effects. The results are not driven by a selection problem arising, for instance, from the possibility that the Christian kingdoms conquered economically less attractive territories more quickly. A number of falsification tests show that there is no link between the rate of Reconquest and several indicators of pre-Reconquest economic development.

We also analyze the channels through which the rate of Reconquest has affected current income. The results suggest that structural inequality, caused by a high concentration of land and political power in the hands of the nobility, played a central role as an intervening variable. This is consistent with the hypothesis formulated by Engerman and Sokoloff (2000, 2002) and Acemoglu et al. (2002), whereby a high concentration of economic and political power in a few hands impaired modern economic growth because it precluded large segments of the population from participating in economic activity when the opportunity to industrialize arrived. The timing of the effect of the Reconquest is consistent with this hypothesis, since its negative effect became most apparent during the industrialization period. This interpretation is also congruent with the fact that at the onset of industrialization in Spain (around 1860) the negative impact of the rate of Reconquest was also present in some of the foundations of modern economic growth, such as human capital. A general conclusion of our analysis is that accelerated (and imperfect) colonization may create the conditions for an inegalitarian society, with negative consequences for long-term economic development.

Several other papers are to some extent related to ours. Chaney (2008) and Chaney and Hornbeck (2015) investigate the expulsion of about 120,000 Moriscos in 1609 from the Kingdom of Valencia. Chaney (2008) finds that persistent extractive institutional arrangements in former Morisco areas inhibited the development of the non-agricultural sector long after the adverse population shock. Chaney and Hornbeck (2015) provide evidence of Malthusian dynamics in early modern Spain by documenting persistent rises in output per capita as a result of the population decline caused by the expulsion. Tur-Prats (2015) finds that a historically determined, persistent geographical distribution of traditional family types (stem vs. nuclear) affects intimate-partner violence (IPV). Based on a historical account, she uses the stages of the Reconquest and a freedom of testation indicator as instruments for the different family types. Droller (2013) investigates the effect of migration and population composition on long-run economic development in the settlement of Argentina’s frontier regions known as the Pampas. The channels through which historically higher shares of European population affect current output are associated with industrialization and the level of human capital measured through literacy rates.

This paper also contributes in several ways to a growing body of research that considers economic development as a long-term process with deep historical roots (Spolaore and Wacziarg 2013; Nunn 2014).Footnote 1 First, our case study is appealing in the sense that the historical process studied in this article is very remote in time. The Reconquest ended in 1492 with the fall of Granada yet, significantly, its effects remain visible today. Explaining the reasons for the effect of the Reconquest being so persistent, along with the channels through which it took place, are questions of general interest. Second, our work is also interesting because unlike most previous studies focusing on former colonies, it analyzes the experience of a developed economy that became a leading colonial power in the Mercantilist era of colonialism. Third, a particularity of the Spanish case is that over a long period of time its territory experienced a process very similar to colonialism. Thus, an analysis of the Spanish Reconquest is useful because it gives clues about the subsequent colonization of the New World. When Spain colonized Central and South America in the sixteenth century, it had all the experience gathered in the Reconquest and through the policies implemented in the occupation of Muslim lands. Therefore, while the recent literature has emphasized that Spanish colonial policies were significantly influenced by the preexisting indigenous organization in conquered areas (Engerman and Sokoloff 2002; Frankema 2010), it should not be ignored that the granting of large tracts of land to the nobility, for example, had a clear precedent in the homeland.Footnote 2

The remainder of the paper is organized as follows. Section 2 provides a brief historical overview. Section 3 describes the indicator for the rate of Reconquest and the other variables used in the paper. Section 4 presents the analysis of the effect the Reconquest has had on current economic development, while Sect. 5 provides several sensitivity exercises that include a municipality-level analysis. Section 6 analyzes the timing of the effect of the Reconquest, and Sect. 7 investigates the possible channels through which this effect occurs. Finally, Sect. 8 puts forward some implications, and concludes.

2 Historical background

An interesting feature of Spanish history is that for a period of almost eight hundred years the Iberian Peninsula experienced a process somewhat akin to colonialism.Footnote 3 In 711, what is now the Spanish mainland was invaded by the Muslims, who in a very short period of time occupied almost the whole of the Iberian Peninsula and created a Muslim domain that was known as al-Andalus. This western European Muslim territory achieved great economic and cultural development, and for most of the period under Moorish rule it was the most advanced country on the continent (Chejne 1999). With the passage of time, the Christian outposts located in northern Spain gradually conquered the Muslim territory in a process that lasted until 1492, with the fall of the Nasrid Kingdom of Granada. This long period of Christian conquest is known as the Reconquista. Military campaigns were followed by a process of colonization or repopulation of the new lands. The way in which the colonization was conducted had fundamental consequences for each region’s ensuing development.Footnote 4

The crucial outcomes of the repopulation process were how land was distributed and who held political power. Other potential aspects of relevance were the resulting level of population density, the degree of integration of the Muslim population, and the extent to which preexisting technologies were preserved. An important factor that decisively affected the outcome of the repopulation was the speed of the Christian conquests; that is, whether the Christian frontier advanced rapidly or slowly (Sobrequés 1972; Malefakis 1970). We call this factor “rate of Reconquest”. A slow process in this case is generally associated with a more complete and balanced repopulation. This is because a smaller area to be colonized favored the participation of individual settlers and the Crown in the repopulation, which led to better political institutions and a more egalitarian distribution of land. By contrast, a rapid process is associated with imperfect colonization (González Jiménez 2006). In this case, a larger area to be repopulated implied fewer resources were available relative to the magnitude of the task; that is, an insufficient number of settlers, as well as administrative and military difficulties to govern and defend the territory. This favored the participation of the nobility and military orders in the organization and defense of the new lands.

Figure 1 shows how the rate of Reconquest differs markedly across the different stages of this historical process. During the first three and a half centuries of the Reconquest (from 711 to 1062) the Christian kingdoms conquered about 155,000 \(\hbox { km}^{2}\), while over the next two centuries (until 1266) the area reconquered was almost twice as large (about 287,000 \(\hbox { km}^{2}\)). Thus, the rate of Reconquest (i.e., the area reconquered divided by the duration in years of that period) was much lower in the first period (approx. \(441 \hbox { km}^{2}/\hbox {year}\)) than in the second period (approx. \(1407 \hbox { km}^{2}/\hbox {year}\)). These differences had profound consequences for the type of colonization conducted in each case.
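The arithmetic behind these two figures can be reproduced in a few lines. The following Python sketch is purely illustrative: the function name `rate_of_reconquest` is an assumption introduced here, and the inputs are the rounded areas and dates quoted in the paragraph above.

```python
def rate_of_reconquest(area_km2: float, start_year: int, end_year: int) -> float:
    """Area reconquered per year (km^2/year) over a given period."""
    years = end_year - start_year
    if years <= 0:
        raise ValueError("end_year must be later than start_year")
    return area_km2 / years


# Rounded figures quoted in the text (not the underlying map data).
first_period = rate_of_reconquest(155_000, 711, 1062)    # 351 years
second_period = rate_of_reconquest(287_000, 1062, 1266)  # 204 years

print(f"{first_period:.1f} km^2/year")   # about 441.6
print(f"{second_period:.1f} km^2/year")  # about 1406.9
print(f"ratio: {second_period / first_period:.2f}")
```

On these rounded inputs, the frontier advanced roughly three times faster in the second period than in the first.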

Fig. 1 The Spanish Reconquest (711–1492)

A slow rate of Reconquest implied that individual settlers with few economic resources could colonize the territory by themselves. This was the case in the repopulation of the Duero Valley, whose distinctive feature was the predominance of private initiative; that is, a type of repopulation conducted by individuals who occupied land and acquired ownership of it through the institution of presura or aprisio (i.e., apprehension of land). In general, this repopulation implied a more balanced occupation of the land, as reflected in the presence of a large number of small settlements that appear evenly distributed across the repopulated territory. It also led to the creation of a society with a democratic structure of free peasants with access to land (Vicens Vives 1969).Footnote 5 The Crown also found it easier to organize the repopulation when the area to be occupied was not large. Thus, in the lands between the Duero and Tagus rivers the repopulation was to a large extent officially organized and conducted by the King through the creation of municipalities or councils (repoblación concejil), which delimited and distributed smallholdings among settlers (Ruiz-Maya 1979). When the repopulation was conducted by the Crown, the result was still beneficial to the peasantry, since land was relatively well distributed and cities remained under royal jurisdiction.Footnote 6

In addition, a smaller area to be repopulated (a consequence of a slow rate of Reconquest) favored the preservation of Muslim agricultural technologies and the integration of the Muslim population. Indeed, the repopulation in Aragon was different from that in Castile, largely due to the smaller area this kingdom reconquered. In this case, the King was able to carefully organize the colonization, and the nobility played a smaller role (Sobrequés 1972). In contrast to Castile, the repopulation of Aragon had such particularities as a greater concern for maintaining irrigation structures, greater respect for the Muslim population, and less reward for the aristocracy for their participation in the conquest and defense of new territories (Casado Alonso 2002; Vicens Vives 1969).

The above contrasts with the situation in the stages of the Reconquest between 1062 and 1266, particularly in Castile, where the Christian conquests progressed much more rapidly. The larger frontier areas to be repopulated rendered it unfeasible to colonize through individual settlers. Likewise, it was difficult for the King to organize the repopulation on such a large scale. In this context, the Crown found in the military orders and the nobility the most “effective means of [occupation and] defense in the border region” (Forey 1984, p. 214),Footnote 7 with the latter groups being granted large estates and jurisdictional rights. This situation was intensified after the Muslim defeat at the Battle of Las Navas de Tolosa in 1212. In a short period of time (between 1225 and 1250), most of the southern third of the peninsula suddenly fell into Christian hands (Malefakis 1970). By the mid-thirteenth century, the Reconquest was almost complete, the only exception being the Nasrid Kingdom of Granada.

The magnitude of the frontier expansion profoundly affected the subsequent social reorganization (Sobrequés 1972; Malefakis 1970). “[G]iven the weak resources of the period, the Castilians had to deploy enormous effort in order to cater for the administration, defense, and economic development of these southern lands [...] Inevitably, the disparity between the magnitude of the task and the precarious resources available produced problems. One of these was the birth of the great landed estates” (Cabrera Muñoz 1989, p. 465); another was the concentration of political power in the hands of the nobility. It is thus no surprise that the concentration of landownership and the proportion of territory under the jurisdiction of nobles or military orders were the highest in the regions of Castile-La Mancha, Extremadura and Andalusia.Footnote 8 In addition, a rapid rate of Reconquest made it difficult to govern the Muslim population and preserve their agricultural technologies. Thus, the previously intensive agriculture of the Guadalquivir Valley dramatically changed after the expulsion of the Moors from Andalusia following the 1264 revolt, being replaced by an extensive agrarian sector dominated by olive groves and sheep (Vicens Vives 1969; Malefakis 1970).

The existence of a link between the rate of Reconquest and the type of colonization is clearly reflected in the pattern of settlements in Spain. A rapid rate of Reconquest means a scarcity of settlers and economic resources, which gives rise to an unbalanced occupation of the territory, with a few dispersed settlements, each covering a large jurisdictional area. In this sense, López-González et al. (1989) have argued that the size of municipal areas tends to increase as the Reconquest progresses, with the largest being on the Castilian side of Andalusia. There is indeed a strongly positive relationship between the rate of Reconquest and municipal surface area (measured both in 1787 and 2011). Remarkably, the rate of Reconquest alone explains 61% of the variation in municipal area in 1787.Footnote 9 This provides additional support for the view that the scale of the frontier expansion affected the pattern of colonization of the conquered lands in a manner that is consistent with our line of argumentation.

To sum up, the rate of Reconquest conditioned the type of colonization conducted in each region. A rapid rate favored a political equilibrium biased toward the nobility, creating societies with high levels of economic and political inequality—with other potential consequences being a low integration of the Muslim population and scant preservation of their technologies. In contrast, a slow rate of Reconquest led to a more balanced occupation of the territory and a more egalitarian social structure. We argue that initial differences in the type of repopulation created different development paths across today’s Spanish provinces, with implications for their current level of prosperity. Thus, we expect a negative relationship between the rate of Reconquest and current per capita income. After presenting the data used in the paper, the following sections test this prediction and provide evidence on the timing of the effect and the mechanisms at work.

3 Rate of Reconquest and other data

We construct a database for the 50 Spanish provinces that contains variables concerning the rate of Reconquest, current economic development, and many historical and geographic controls. Our main indicator for measuring the conditions and pace at which the Reconquest was carried out is labeled “rate of Reconquest”. It measures the total area of the stage of the Reconquest in which the province was conquered by Christians, divided by the duration in years of that stage of the Reconquest. Therefore, the rate of Reconquest is the ratio of reconquered area to an interval of years. Intuitively, it reflects the speed at which the Christian frontier advanced and, consequently, the level of colonization effort required for the effective occupation of the province.

We construct this variable as follows. First, using geospatial software we calculate the surface area of each stage of the Reconquest from detailed maps provided by Mestre-Campi and Sabaté (1998). In this first step, we differentiate between the areas conquered by the Kingdom of Castile and the Crown of Aragon. In what follows, for the sake of simplicity, we refer to these 16 Reconquest areas (9 for Castile and 7 for Aragon) as Reconquest stages. Regarding the initial area of resistance in northern Spain, since it was not effectively conquered by the Muslims and, therefore, not reconquered, we exclude it from the baseline analysis.Footnote 10 Second, we calculate the duration in years of each stage of the Reconquest as the difference between the dates associated with successive frontier lines depicted in the map of the Reconquest in Fig. 1. Third, we divide the surface area of each stage of the Reconquest by its duration in years. This provides a measure of the rate of Reconquest expressed in \(\hbox {km}^{2}/\hbox {year}\).Footnote 11 A high value of this indicator implies that the Reconquest progressed quickly in that stage. Finally, we impute the estimated value of the rate of Reconquest to the provinces located in the respective stages. Since the area of a province can partially cover more than one stage of the Reconquest, we calculate the proportion of the provincial area within each one of the respective stages. We then compute the weighted average of the rate of Reconquest for each province, where the weights are given by the percentage of the provincial area conquered in each stage. This yields a different rate of Reconquest for each of the 45 provinces, as shown in Fig. 1. Note, for instance, that if 50% of a province is reconquered rapidly, and the remainder slowly, our measure would reflect an average rate of Reconquest, rather than differentiate between both rates.Footnote 12 However, in the municipality-level analysis we will explicitly allow for within-province variation across municipalities, thus allowing for the possibility that different municipalities within the same province exhibit different rates of Reconquest. This more disaggregated analysis will enable us to better account for and understand the persistence side of our theory, since jurisdictional rights were granted at the local level and the evolution of land inequality is also inherent to the dynamics of each municipality. Note, in this regard, that provinces had limited competencies and were indeed regional branches of the central government.
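The final imputation step, an area-weighted average of stage-level rates, can be sketched as follows. This is a minimal illustration rather than the authors' actual procedure: the function name `provincial_rate`, the stage labels and all numbers are hypothetical and are not taken from the paper's data.

```python
def provincial_rate(stage_rates: dict, area_shares: dict) -> float:
    """Area-weighted average of stage-level rates for a single province.

    stage_rates: rate of Reconquest of each stage (km^2/year).
    area_shares: fraction of the province's area lying in each stage.
    """
    total = sum(area_shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("area shares must sum to 1")
    return sum(share * stage_rates[stage] for stage, share in area_shares.items())


# Hypothetical stage-level rates (km^2/year), chosen only for illustration.
stage_rates = {"slow_stage": 450.0, "fast_stage": 1400.0}

# Hypothetical province split evenly between a slow and a fast stage.
half_and_half = provincial_rate(stage_rates, {"slow_stage": 0.5, "fast_stage": 0.5})
print(half_and_half)  # 925.0

# Hypothetical province lying mostly in the fast stage.
mostly_fast = provincial_rate(stage_rates, {"slow_stage": 0.25, "fast_stage": 0.75})
print(mostly_fast)  # 1162.5
```

The first case mirrors the 50% example in the text: the provincial value sits midway between the two stage rates and does not distinguish the fast half from the slow half.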

The variable used to measure current economic development is the figure for GDP per capita in 2005 provided by the Spanish National Statistics Institute. This study also employs a number of variables that may act as potential channels for explaining the effect of the Reconquest, as well as measures of pre-Reconquest economic development and a wide array of climatic, geographic, topographic and historical controls. We present all these variables in the sections in which they are used. Their definitions and sources are provided in Table 9 at the end of the main text, while the descriptive statistics are reported in Supplementary Appendix B (Table A2).

4 The effect of the Reconquest on current development

4.1 Initial results

Table 1 contains the results concerning the effect of the Reconquest on current levels of GDP per capita. The following equation is estimated with ordinary least squares (OLS) and heteroskedasticity-consistent standard errors with small-sample correction due to the relatively low cross-sectional dimension:Footnote 13

$$ Y_i = \alpha + \beta_1 \cdot \textit{Reconquest}_i + \beta_2 \cdot X_i + \omega_i \tag{1} $$

where \(Y_{i}\) is log per capita GDP in 2005 in province \(i\), \(\alpha \) is a constant term, \(\textit{Reconquest}_{i}\) stands for our measure of the rate of Reconquest, \(X_{i}\) is a vector of control variables, and \(\omega _{i}\) is the error term. Entry 1 in Table 1 reports a highly significant, negative bivariate relationship between current GDP levels and the rate of Reconquest for the whole Spanish territory (50 provinces). However, we prefer to conduct the analysis with only 45 provinces, i.e., removing those provinces that were never occupied by the Muslims and, as such, are not representative of the dynamics of frontier expansion applicable to the rest of Spain. Hence, in what follows we focus on the reduced sample of provinces. As for Spain as a whole, entry 2 reports a statistically significant negative link between current output per capita and the rate of Reconquest. Our measure of the Reconquest alone explains 26% of the variation in current GDP per capita. This result indicates that the Reconquest is an important determinant of the current distribution of provincial output. We may compare two provinces with high and low rates of Reconquest to gain a sense of the size of the effect the Reconquest has had on current GDP per capita. For instance, Barcelona has a level of GDP per capita that is 48% higher than that of Seville (24,782 vs. 16,782). The latter has a rate of Reconquest (expressed in \(100 \hbox { km}^{2}/\hbox {year}\)) of 21.94, while for the former it is 1.58. The estimate in entry 2, −0.017, indicates that Barcelona should be 41.4% richer than Seville (\(e^{0.346} - 1 \approx 0.414\)), which is close to the actual difference in income per capita. This result cannot be taken as conclusive, since the presence of potential omitted factors, if correlated with both the Reconquest and current economic development, would introduce an omitted variable bias in the relevant coefficient. Therefore, in the rest of this section we seek to exhaustively control for possible factors that may affect both the rate of Reconquest and current GDP per capita levels.
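To make the estimation and the Barcelona–Seville comparison concrete, the following Python sketch fits the bivariate version of Eq. (1), with no control vector, and converts a log-linear coefficient into an implied income gap. Several elements are assumptions introduced here: the text does not say which small-sample correction is applied, so HC3 is used as one common choice; the function names `ols_slope_hc3` and `implied_income_gap` are hypothetical; and the toy data are invented for illustration and are not the paper's data.

```python
import math


def ols_slope_hc3(x, y):
    """Bivariate OLS of y on x with an intercept.

    Returns (intercept, slope, HC3 standard error of the slope).
    HC3 is an assumed choice of heteroskedasticity-consistent estimator.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    meat = 0.0
    for xi, yi in zip(x, y):
        resid = yi - intercept - slope * xi
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        # HC3 rescales each squared residual by 1 / (1 - leverage)^2.
        meat += (xi - x_bar) ** 2 * resid ** 2 / (1.0 - leverage) ** 2
    return intercept, slope, math.sqrt(meat) / sxx


def implied_income_gap(beta, rate_low, rate_high):
    """Proportional income advantage of the low-rate province implied by a
    log-linear model with slope beta on the rate of Reconquest."""
    return math.exp(-beta * (rate_high - rate_low)) - 1.0


# Invented toy data: rates in 100 km^2/year and log GDP per capita.
rates = [1.58, 5.0, 10.0, 15.0, 21.94]
log_gdp = [10.12, 10.05, 9.98, 9.85, 9.73]
intercept, slope, se = ols_slope_hc3(rates, log_gdp)
print(f"slope = {slope:.4f}, HC3 s.e. = {se:.4f}")

# Figures quoted in the text: coefficient -0.017, Barcelona 1.58, Seville 21.94.
gap = implied_income_gap(-0.017, rate_low=1.58, rate_high=21.94)
print(f"implied gap = {gap:.3f}")            # 0.414
print(f"actual gap  = {24782 / 16782 - 1:.3f}")  # 0.477
```

The implied gap reproduces the 41.4% figure in the text, which can be set against the observed gap of roughly 48%.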

Table 1 The effect of the Reconquest on current development

A first set of controls is related to the biogeographic conditions 10,000 years ago, and the transition to early agriculture within the Neolithic Revolution. Accordingly, entry 3 introduces the percentage of provincial area covered by wooded steppe versus dry steppe. These were the types of Neolithic vegetation (as indicators of soil quality and agricultural suitability) that prevailed on the Iberian Peninsula in prehistory.Footnote 14 Entry 4 incorporates the predicted date of adoption of early agriculture using the information provided by Pinhasi et al. (2005) regarding the exact location of thirteen calibrated C-14 dates from Neolithic sites on the Iberian Peninsula.Footnote 15 Statistically, none of the Neolithic controls enters significantly, whereas the effect of the Reconquest remains highly significant and unchanged in size.

A second set of controls accounts for historical conditions that may be relevant factors omitted from our analysis. Entry 5 introduces a variable measuring the road density level in Roman times, which could affect the progress of the Christian conquests, and may also be related to local development potential. This variable enters insignificantly in the regression, without altering the effect of the Reconquest. Entry 6 controls for an indicator of pre-Reconquest economic development, namely, urban population density in 800.Footnote 16 Arguably, the Christian frontier could advance more slowly in more developed regions because, for example, they offered stauncher resistance. The coefficient on urban population density in 800 is negative and statistically significant, while the effect of the Reconquest remains negative and statistically highly significant.Footnote 17 Following a similar reasoning, entry 7 controls for an indicator of the level of economic development (urban population density) just before the Christians conquered and colonized the territory. In addition, entry 8 includes a variable measuring the average urban population density in the Christian kingdoms at the time of the conquest. This variable sets out to reflect the general level of economic development of Castile or Aragon (depending on the case) immediately before the province was repopulated, since the type of colonization conducted could be affected by the conqueror’s level of prosperity at that time. A higher level of prosperity on the conqueror’s side can also proxy for more advanced attacking technology.Footnote 18 These last two controls are insignificant in the regression, without affecting the coefficient on rate of Reconquest.Footnote 19

Entry 9 introduces an indicator measuring the number of centuries that the province was under Muslim domination, as a means to account for the legacy of being under Muslim rule for a longer time. Indeed, this may be a confounding variable since a longer Muslim domination could affect factors such as cultural values or the Spanish-Christian identity of the population. Interestingly, the coefficient on rate of Reconquest remains highly robust, while the new variable appears statistically insignificant.Footnote 20 Entry 10 introduces a dummy variable capturing whether the province once belonged to the Crown of Aragon. Certain institutional characteristics of this former kingdom may have had an impact on economic development. The dynastic union between the Crown of Aragon and Castile was forged in 1469 with the marriage of the Catholic Monarchs, but Aragon preserved its legal system and institutions until the War of the Spanish Succession at the beginning of the eighteenth century. Arguably, these particularities during this early period could have influenced subsequent economic activity. Even though this historical control appears highly significant and positively related to current development levels, its inclusion does not affect our baseline results. Entry 11 introduces a dummy variable for Madrid, the Spanish capital, in order to control for the fact that its good economic performance may have been driven by its special administrative character.Footnote 21 As expected, the coefficient on Madrid is positive and highly significant.

We next control for various climatic, geographic and topographic factors that may be omitted from the baseline specification. Many scholars consider geography to be an important determinant of economic development (Gallup et al. 1999; Sachs 2003). Following Acemoglu et al. (2002), we may differentiate between simple and sophisticated geographic explanations. The first type considers factors such as climate (with effects on work effort), soil fertility, and diseases. It predicts persistence in economic outcomes because geographic factors are time-invariant. Sophisticated geographic hypotheses are more appealing because they allow for the possibility that some geographic factors have a changing economic role over time. Applied to the Spanish case, access to the Mediterranean Sea may have been more decisive during the Middle Ages, with subsequent access to the Atlantic through trade with the Americas, and more recently during the industrialization period to the Bay of Biscay. In addition, coal reserves played an important role during the industrialization period, but not all the provinces had their own reserves. Transportation costs (measured, for instance, through access to the sea or distance from major trading partners and industrial centers in Europe) could also have been more important during the nineteenth century, when commercial relations across regions and countries intensified. In order to dispel doubts, we next control for variables that may be associated with both sets of geographic hypotheses. We begin with factors exhibiting geographic variation along a North-South gradient that mimics the direction of the Reconquest. The incorporation of latitude in entry 12 (which enters insignificantly) does not affect the statistical significance or size of the coefficient on rate of Reconquest. Therefore, our results do not simply capture the fact that southern Spanish regions are poorer.

Entries 13–15 control for such variables as temperature, rainfall and humidity, which may also affect soil quality and its suitability for crops that require large estates (and in turn induce the concentration of economic power in the hands of the landed elite). Higher aridity and less rainfall may also require a higher concentration of land on the grounds of economic efficiency and profitability (Brenan 1943). Hence, they may be factors that confound the long-term effect of the Reconquest on development. It is worth stressing that the baseline results remain largely unaltered, with only rainfall entering significantly. The baseline result remains unchanged when entry 16 introduces a direct measure of soil quality constructed on the basis of several dimensions (nutrient availability and retention capacity, rooting conditions, oxygen availability to roots, excess salts, toxicity and workability) from FAO/IIASA (2010) data, which enters with a highly significant and positive coefficient. Entries 17–19 exploit provincial variation in the suitability of land for such cash crops as sugar, cotton and tobacco in order to capture the possibility of a contrast in the suitability of land for large plantations in the South of Spain as opposed to the North (as in the US). It is worth noting that none of these three controls appears statistically significant or affects the main findings. The introduction, in entries 20 and 21, of average altitude and terrain ruggedness does not alter the baseline results either, with only the latter being marginally significant.

Entries 22–33 control for geographic attributes related to transportation costs that include access to the Mediterranean Sea, the Atlantic Ocean, and the Cantabrian Sea, a dummy indicator for being an island, a coast dummy, coast length over surface area, distance from the coast, border with Portugal, and the natural log of distances from Madrid and London, the latter being considered the technological frontier. Two other distances from locations that were arguably important for European development are included. They are distance from Mainz as a proxy for the spread of the printing press (Dittmar 2011), and distance from Paris, which can be considered the cradle of the Enlightenment movement that promoted the expansion and accessibility of useful knowledge as a cornerstone of industrialization (Squicciarini and Voigtlander 2015).Footnote 22 Of all these controls, access to the Cantabrian Sea, border with Portugal and log distances from Paris and Mainz are statistically significant and negatively associated with current development, whereas access to the Mediterranean Sea enters with a statistically significant positive coefficient. Most importantly, the effect of the Reconquest remains fairly robust to these additions. Entries 34–37 control for indicators accounting for natural resource endowments that include the percentage of agricultural land in 1900, the percentage of arable land in 1962, a coal dummy in 1860, and log coal output in 1860. Only the coal dummy is statistically significant and with a positive coefficient, whereas the baseline results remain unaltered.

4.2 Baseline specification and robustness checks

Column 1 in Table 2 includes in the same specification all the controls that are individually significant at the 10 % level or better.Footnote 23 This is our paper’s baseline specification. Even in this case, the coefficient on the Reconquest measure is significant at the 1 % level, and its size is only slightly reduced from −0.017 to −0.016. In addition, the Madrid indicator, soil quality and ruggedness continue to be statistically significant and positively associated with current development, whereas log distance from Paris has a statistically significant negative effect on current GDP per capita. The strength of the effect of the rate of Reconquest on current development is illustrated in Fig. 2 by a scatter plot of the two variables, after conditioning on the set of controls included in column 1. The partial R-squared of the rate of Reconquest is 34.9 % in this baseline specification. It is remarkable that an indicator measuring a historical event that occurred many centuries ago has such large explanatory power for current income.Footnote 24

Table 2 The effect of the Reconquest on current development: robustness checks
Fig. 2 Conditional relationship between current GDP per capita and rate of Reconquest

A typical concern of empirical analyses with a limited number of observations is the possibility that a few extreme cases drive the results. Columns 2–7 in Table 2 show that our findings are fairly robust to removing outliers detected by the following procedures: leverage, standardized residuals, studentized residuals, Cook’s distance, DFITS, Welsch distance, and DF-Beta. Likewise, the effect of the Reconquest remains fairly unchanged when particularly rich areas such as Madrid and Barcelona are excluded from the analysis (column 8). Similar results are obtained when employing robust estimation that corrects for the effect of outliers (column 9). Our baseline findings also remain robust to using a quantile regression approach (column 10), as a way to assess the existence of an effect at the median and not only at the mean of the distribution.

In addressing the concern that our results hinge on the particular indicator of Reconquest used, we re-estimate the baseline specification with three alternative indicators. First, an alternative indicator of rate of Reconquest that assigns to each province the rate of Reconquest corresponding to the Reconquest stage in which a province’s geographic centroid is located. By doing so, there is no need to calculate a weighted average of the rate of Reconquest, and standard errors can be clustered at the level of stage of Reconquest. Second, another alternative indicator of the rate of Reconquest that divides this historical process into stages of the same duration.Footnote 25 Third, a dummy variable indicating whether the province was reconquered after the collapse of the Almohad Caliphate in 1212 following the Battle of Las Navas de Tolosa, which enabled the Christian armies to conquer a vast territory in a short period of time. The results appear in columns 11–13 of Table 2. It is remarkable that the three alternative Reconquest indicators enter with a statistically significant negative coefficient, thus corroborating our baseline findings.Footnote 26

In Appendix E, we redo all the estimations in Table 2 with two other alternative small-sample corrections: (1) estimating standard errors through wild bootstrap, and (2) using the leverage-adjusted HC2 estimator recommended by Imbens and Kolesar (2012) and Samii and Aronow (2012). In both cases, our baseline findings remain largely unchanged. Another potential concern is the presence of spatial correlation, which may cause conventional standard errors to overstate the precision of the estimated effect. We re-estimate the models in Table 2 and check that the statistical significance of the coefficient on the rate of Reconquest is not reduced when using standard errors corrected for spatial dependence. For that purpose, we use the Stata command sphac by Jeanty (2012) with a cutoff of 200 km (see also Allen 2015). The results, which are unaltered by this change, are reported in Appendix F.

Skeptics may still be concerned with the fact that the Reconquest is highly correlated with a North-South gradient for Spain, with a richer North (particularly the Basque Country and Catalonia) and a poorer South (mostly Andalusia). We have already addressed this concern in several ways. First, we exclude the three rich Basque provinces from the baseline analysis, which partially mitigates this problem. Second, we show that the effect of rate of Reconquest is robust to the inclusion of latitude, and log distances from London, Paris and Mainz. Third, we also omit such potential outliers as Madrid and Barcelona. In addition to the aforementioned robustness checks, we take three further steps. (i) We incorporate a high-order (cubic) latitude/longitude polynomial into the baseline specification, and the coefficient on rate of Reconquest is robust to this addition. (ii) We regress the rate of Reconquest on the set of controls in the baseline specification, save the residuals, and then regress latitude on these residuals.Footnote 27 It is worth noting that latitude appears unrelated to the residuals, which are the part of the rate of Reconquest orthogonal to the controls, with an \(R^{2}\) of 0.001 and a p-value associated with the coefficient on the residuals of 0.893. Likewise, once we control for the baseline control set, there is no relationship between latitude and Reconquest rate. All these results are reported in Table A9 in Appendix G. (iii) The next section conducts the analysis at municipal level controlling for province fixed effects and for dummies of deciles in latitude.
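The logic of check (ii) can be illustrated with a minimal sketch. It assumes a single hypothetical control in place of the full baseline control set, and the toy values are invented for illustration and are not taken from the data used in this paper. The helper name `simple_ols` is likewise an assumption made for this example.

```python
from statistics import mean


def simple_ols(x, y):
    """Bivariate OLS of y on x with an intercept.

    Returns (intercept, slope, residuals, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sst = sum((yi - my) ** 2 for yi in y)
    r_squared = 1.0 - sum(e ** 2 for e in residuals) / sst
    return intercept, slope, residuals, r_squared


# Hypothetical toy values for six provinces (illustration only).
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # stand-in for the control set
rate = [12.0, 9.0, 15.0, 11.0, 18.0, 14.0]    # rate of Reconquest
latitude = [43.1, 37.2, 41.5, 38.0, 42.3, 39.4]

# Step 1: residualize the rate of Reconquest on the control.
_, _, rate_resid, _ = simple_ols(control, rate)

# Step 2: regress latitude on the residuals and inspect slope and R^2.
_, slope_on_resid, _, r2 = simple_ols(rate_resid, latitude)
print(f"slope = {slope_on_resid:.3f}, R^2 = {r2:.3f}")
```

A slope and \(R^{2}\) close to zero in the second step would indicate that the part of the rate of Reconquest not explained by the controls carries no North-South gradient. In the full exercise, the first step would use the complete baseline control set rather than one regressor.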

As an additional robustness check, we only exploit the variation from the 16 regions corresponding to the respective stages shown in Fig. 1 (9 in Castile and 7 in Aragon). This analysis is thus conducted with only 16 observations, in which the weighted average of output per capita in 2005 for the territory corresponding to each Reconquest stage (using provincial surface area in each stage as weights) is regressed on the rate of Reconquest at the stage level. As expected, there appears to be a statistically significant negative relationship between both variables.Footnote 28
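The stage-level aggregation can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `stage_level_output`, the stage labels and all numbers are hypothetical, and each entry stands for the portion of a province that falls inside a given Reconquest stage.

```python
def stage_level_output(pieces):
    """Area-weighted mean of output per capita for each Reconquest stage.

    pieces: list of (stage, area_km2, gdp_per_capita), one entry per
    portion of a province that falls inside a given stage.
    """
    sums = {}
    for stage, area, gdp in pieces:
        weighted, total_area = sums.get(stage, (0.0, 0.0))
        sums[stage] = (weighted + area * gdp, total_area + area)
    return {stage: w / a for stage, (w, a) in sums.items()}


# Hypothetical pieces (illustrative values only).
pieces = [
    ("stage_1", 100.0, 20.0),
    ("stage_1", 300.0, 10.0),
    ("stage_2", 200.0, 15.0),
]
stage_output = stage_level_output(pieces)
print(stage_output)  # {'stage_1': 12.5, 'stage_2': 15.0}
```

The resulting stage-level values would then be regressed on the stage-level rate of Reconquest, giving one observation per stage.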

5 Sensitivity analysis

5.1 Municipality-level analysis

Although the relationship between the rate of Reconquest and current GDP appears robust to the inclusion of many geographic and historical controls, as well as to the removal of outliers, a possible objection is that some unobservable province-level characteristics are driving this result. One way to address this concern is to conduct the analysis at a finer level, namely, using municipality data, and test whether the results hold even when conditional upon province-specific fixed effects. This test is quite strong, and allows us to exploit within-province variation in the conditions surrounding the Reconquest. The inclusion of such powerful fixed effects enables us to account for any systematic and structural particularities related to the history of each province, which cannot be controlled explicitly in a province-level analysis. It also provides an alternative way to deal with the issue of small sample. For this exercise, we create a dataset of more than 8,000 municipalities in Spain. As proxies for income at local level, we use current data for average socioeconomic condition, average number of vehicles per household, and labor force activity rate, which appear clearly linked to economic development. This is corroborated by the existence of a high correlation with GDP per capita at provincial level (the correlation is 0.81 with average socioeconomic condition, 0.54 with average number of vehicles per household, and 0.73 with labor force activity rate).

The municipality-level analysis is conducted with three different measures of rate of Reconquest computed at municipal level. First, the baseline measure is obtained by imputing to each municipality the rate of Reconquest corresponding to the Reconquest phase to which the municipality belongs. As with the province-level analysis, here we distinguish between the stages of the Reconquest in Castile (9 stages if we exclude the initial resistance area) and Aragon (7 stages). By exploiting within-province variation across municipalities, we allow for the possibility of different rates of Reconquest across a province’s municipalities. Second, we construct a dichotomous indicator of rate of Reconquest, which equals one if the rate of Reconquest corresponding to municipality i is higher than the provincial mean value. This allows us to exploit the discontinuity in rate of Reconquest across municipalities within each province, in a similar spirit to a border specification. Third, we proceed in a similar way, but exploit only those cases in which there is a stronger discontinuity. The binary indicator is now defined as one if the rate of Reconquest is higher than 1.25 times the provincial mean value.
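The second and third measures can be sketched in a few lines. The sketch assumes that the provincial mean is the simple average of the rates imputed to the municipalities of each province, which is one possible reading of the definition. The function name `fast_reconquest_dummy` and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fast_reconquest_dummy(rows, multiple=1.0):
    """Return {municipality: 0/1}, equal to one if the municipal rate
    exceeds `multiple` times the mean rate of its province.

    rows: list of (municipality, province, rate_of_reconquest).
    Assumption: the provincial mean is the unweighted municipal average.
    """
    by_province = defaultdict(list)
    for _, province, rate in rows:
        by_province[province].append(rate)
    province_mean = {p: mean(r) for p, r in by_province.items()}
    return {
        m: int(rate > multiple * province_mean[p]) for m, p, rate in rows
    }


# Hypothetical municipalities (illustrative values only).
rows = [
    ("a", "P1", 10.0),
    ("b", "P1", 22.0),
    ("c", "P1", 28.0),
    ("d", "P2", 5.0),
    ("e", "P2", 5.0),
]
above_mean = fast_reconquest_dummy(rows)                # second measure
strong_gap = fast_reconquest_dummy(rows, multiple=1.25)  # third measure
print(above_mean)
print(strong_gap)
```

In this toy example municipality b exceeds its provincial mean of 20 but not the 1.25-fold threshold of 25, so it switches from one to zero under the stricter definition, while a province with no internal variation (P2) contributes only zeros.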

Table 3 presents the results clustering standard errors at the level of stage of Reconquest. All regressions in Panel A include province dummies and a relatively large control set, which comprises the municipalities’ total population (in logs) to control for differences in municipal size, latitude, and geographic factors related to transportation costs, such as distance to Madrid, distance to the coast, and distance to the nearest provincial capital (all distances entering in linear and square form), and a provincial capital dummy, as well as several additional variables accounting for the municipalities’ climate, geography and topography. These include altitude, annual average temperature, annual rainfall, and seven dimensions measuring soil quality (nutrient availability and retention capacity, rooting conditions, oxygen availability to roots, excess salts, toxicity, and workability).Footnote 29 Despite the fact that this municipality-level specification controlling for province fixed effects goes some way in addressing the North-South gradient concern (as variation in latitude within provinces is much smaller than when considering Spain as a whole), we examine this issue further by incorporating latitude fixed effects (one dummy variable for each decile in latitude). By doing so, we are able to exploit variation within provinces and within each small range of latitude, i.e., within small North-South distances. These results are reported in Panel B of Table 3.
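The latitude fixed effects can be illustrated with a minimal sketch. It assigns deciles by rank, which is an assumption made here for simplicity (cut-offs based on quantiles of the latitude distribution would be an equally plausible construction). The function name and the latitudes are hypothetical.

```python
def latitude_decile_dummies(latitudes):
    """One-hot decile indicators (10 columns) based on latitude ranks.

    Assumption: deciles are assigned by rank, with ties broken by
    position in the input list.
    """
    n = len(latitudes)
    order = sorted(range(n), key=lambda i: latitudes[i])
    decile = [0] * n
    for rank, i in enumerate(order):
        decile[i] = rank * 10 // n  # integer in 0..9
    return [[1 if decile[i] == d else 0 for d in range(10)] for i in range(n)]


# Twenty hypothetical municipalities, ordered from South to North.
latitudes = [36.0 + 0.35 * k for k in range(20)]
dummies = latitude_decile_dummies(latitudes)
column_sums = [sum(row[d] for row in dummies) for d in range(10)]
print(column_sums)
```

Each municipality belongs to exactly one decile. In a regression that also contains a constant or province dummies, one decile indicator would normally be dropped as the reference category.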

It is worth noting that the three different measures of rate of Reconquest are negatively associated with the three proxies for local development, in most cases at the 5 % significance level or higher. Interestingly, when the rate of Reconquest is constructed in a way that it captures a higher discontinuity, the negative effect becomes more pronounced, as expected. All these findings carry over to the more complete specification that incorporates ten latitude decile dummies. This alleviates our concern that unobserved heterogeneity at provincial level and/or a North-South gradient might be the driving force behind the significant effect of the Reconquest on current development found in the province-level analysis.Footnote 30

Table 3 Municipality-level analysis: province fixed-effects regressions

Since spatial correlation in this municipality-level analysis can be substantial, as an alternative to clustering standard errors at the level of stage of Reconquest, we redo Table 3 using standard errors corrected for spatial dependence following Jeanty (2012). We use a cutoff of 100 km beyond which spatial correlation is assumed to be zero. As an additional robustness check, we conduct the analysis with standard errors clustered at the province level rather than at the level of stage of Reconquest. Our baseline findings in Table 3 remain fully robust to these changes. Due to space considerations, these results are presented in Appendix J.

5.2 Falsification test and balancedness

This section conducts a falsification exercise to show that the rate of Reconquest is not negatively related to the level of economic development in the pre-Reconquest era. A main threat to the validity of our analysis is the possibility that areas conquered faster were initially poorer, which could have facilitated a rapid conquest. If those areas conquered faster were worse off even before the Reconquest, then the observed relationship between the rate of Reconquest and current income may be driven by the territories’ intrinsic characteristics, rather than by the type of colonization conducted by Christians. However, it is very unlikely that the rate of Reconquest hinged on the territories’ economic development, since the pace of the advance of the Christian frontier was arguably caused mainly by the relative military weakness of the Muslim territory in each period. Therefore, the rate of Reconquest was the consequence of an exogenous factor with respect to the territories’ economic potential.

Our aim is to verify that our indicator of the Reconquest does not have a statistically significant negative association with economic development and other outcome variables before the Reconquest. We measure pre-Reconquest development primarily through city population and urban population density in 800, which is the earliest year for which urban population data are available. Given that the Reconquest had hardly begun at that time, it serves our purpose. We also consider additional outcome indicators related to pre-Reconquest development. These include years since the transition to agriculture, ancient (pre-medieval) settlements over surface area, Roman road density (total roads and main roads), the ratio of the number of locations where imperial coinage was found to surface area, Roman villas over surface area, and density of bishoprics circa 600.

To assess whether these variables can be used as plausible measures of early development, we look at their correlation with an indicator of land suitability for agriculture—the percentage of agricultural area in 1900—, since pre-industrial prosperity is commonly considered to be related to soil fertility and, more specifically, to agricultural land potential. Remarkably, all the indicators—except for years since the transition to agriculture—are positively correlated with the percentage of agricultural area. In the case of city population and the density of urban population in 800, Roman road density—total and main roads—, presence of imperial Roman coinage, and Roman villas, correlations are statistically significant.Footnote 31 Very similar correlations follow when we employ the variable percentage of arable land in 1962 as a measure of land suitability for agriculture. These results indicate that most indicators of pre-Reconquest development reveal expected relationships with agricultural land potential, which makes us more confident about their reliability.

Panel A of Table 4 provides the results on the relationship between the rate of Reconquest and early development. It is worth noting that the rate of Reconquest is not negatively associated with any of the measures of early economic development, after conditioning on a meaningful set of controls.Footnote 32 Fairly similar findings follow when we look at the bivariate relationship between rate of Reconquest and pre-Reconquest development, which appears marginally significant at the 10 % level (though with a positive sign) only in the case of ancient settlements (see Panel B of Table 4). The above findings suggest that the effect of the Reconquest does not merely represent the perpetuation of differences in economic development that already existed before the Reconquest, or mean that provinces conquered more rapidly started off at a disadvantage or were intrinsically poorer.

Table 4 Falsification test: the effect of the Reconquest on pre-Reconquest development

We next present a balancedness table showing the correlation between rate of Reconquest and urbanization levels measured through density of urban population from 800 to 1850. The evidence shown in Panel A of Table 5 mostly points to a lack of a statistically significant relationship between rate of Reconquest and urbanization levels for more than a millennium.Footnote 33 Therefore, neither initial nor subsequent development prior to the arrival of industrialization around 1860 is clearly correlated with rate of Reconquest. This indicates two things. First, as already pointed out, those territories conquered faster were not initially poorer. Second, the adverse effect of a fast Reconquest on aggregate economic development did not become apparent before industrialization. We expand on this point in Sect. 6.

Table 5 Balancedness: bivariate relationship between rate of Reconquest and urbanization and land quality

Panel B of Table 5 further presents the bivariate relationship of rate of Reconquest with soil quality measured both at provincial and municipal levels, as well as with eight other measures of land quality and land productivity. With the exception of soil quality at province level, there does not appear to exist a statistically significant relationship. As regards the positive correlation between rate of Reconquest and soil quality at province level, one could argue that it is this confounding factor, rather than the pace of the Reconquest, that affected the concentration of economic power in the form of land (which is a main channel through which the effect of the Reconquest is found to operate) and in turn the level of development. However, there are reasons to believe this is not the case. First, our baseline specification already controls for soil quality. Second, there is not a statistically significant relationship of rate of Reconquest either with soil quality at municipal level, or with eight different proxies for land quality and productivity measured at province level. Third, it is clear that what matters for the concentration of land in large estates regions is the historical process of Reconquest rather than soil quality. This is because our data indicate the existence of a positive (instead of an expectedly negative) relationship between the extent of land inequality (measured through the percentage of landless workers over the total agricultural active population in 1797) and soil quality for the Spanish provinces, with a correlation coefficient of 0.62. This contrasts with the existing evidence that supports that areas with better soil quality historically experienced a higher demand for land, which should be conducive to higher land fragmentation (see Baten and Hippe 2013, and Cinnirella and Hornung 2013, for such evidence across the European regions and Prussian counties in the nineteenth century, and references therein). 
Hence, it is reasonable to think that had the Reconquest not occurred, the more fertile provinces would have given rise to small and medium-size holdings. Fourth, in the context of the two-stage-least-squares (2SLS) analysis implemented in Sect. 7 (in which the rate of Reconquest is found to affect current development mainly through land inequality), when historical land inequality is instrumented with soil quality (instead of with rate of Reconquest), it no longer affects current development. By contrast, the rate of Reconquest, when entered as an exogenous regressor in that specification, still exerts a statistically significant negative impact on log GDP per capita in 2005. These results appear in Appendix L. This makes it clear that current output is affected by structural inequality stemming from the conditions surrounding the Reconquest rather than from soil quality.

6 The timing of the effect of the Reconquest

The above results confirm the strong and robust negative effect that the Reconquest has had on current per capita output. A question that requires further study is when this effect actually emerged. This is a key issue because it provides clues about the nature and causes of the effect. On the one hand, if our findings were due to, for example, some geographic confounding factor, the effect of the Reconquest would probably be visible at all times.Footnote 34 On the other hand, the analysis of the timing of the effect is useful for considering the mechanisms at work. For example, if the main implications of the rapid advance of the Christian frontier were related to the destruction of Muslim technologies or to a lack of agglomeration economies due to low population density, the negative effect should have become apparent soon after the Reconquest.

To implement this analysis, we estimate a panel specification that regresses each province’s level of development relative to the national average over the 1000-2005 period on the interaction between rate of Reconquest and time dummies, with data measured at the beginning of each century up to 1800, and then at 1860, 1930, 1970 and 2005. The interactions start in 1500, which roughly corresponds to the year in which the Reconquest ended. The specification takes the form:

$$\begin{aligned} y_{i,t} = \alpha_i + \theta_t + \sum_{t=1500}^{2005} \gamma_t \cdot D_t \cdot \textit{Reconquest}_{i} + \sum_{t=1500}^{2005} \phi_t \cdot D_t \cdot X_i + \varepsilon_{i,t} \end{aligned} \tag{2}$$

where \(y_{i,t}\) stands for each province’s relative level of development. For the periods prior to 1860 for which there are no available data on GDP per capita, we employ density of urban population. \(D_t\) is an indicator variable for each time period, \(\textit{Reconquest}_{i}\) represents the province-level rate of Reconquest, \(X_i\) includes those controls that may have a varying effect over time such as soil quality, access to the Cantabrian Sea, a coal dummy, access to the Mediterranean Sea and log distance from Paris, and as such they are interacted with the time dummies. \(\alpha _i\) and \(\theta _t\) represent province and time fixed effects, respectively.
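To make the interaction terms concrete, the following minimal sketch builds the \(D_t \cdot \textit{Reconquest}_{i}\) regressors for a toy panel. The function name, the two provinces and their rates are hypothetical, and the sketch covers only this block of regressors. Province and time fixed effects and the \(D_t \cdot X_i\) terms would be constructed analogously.

```python
def reconquest_interactions(observations, rate, start=1500):
    """Build the D_t * Reconquest_i regressors of specification (2).

    observations: list of (province, period) pairs, one per panel row.
    rate: dict mapping province -> rate of Reconquest.
    Returns (periods, rows), with one column per period >= start.
    """
    periods = sorted({t for _, t in observations if t >= start})
    rows = [
        [rate[prov] if t == s else 0.0 for s in periods]
        for prov, t in observations
    ]
    return periods, rows


# Hypothetical two-province panel (values are illustrative only).
rate = {"A": 10.0, "B": 25.0}
all_periods = [1400, 1500, 1860, 2005]
observations = [(p, t) for p in ("A", "B") for t in all_periods]

periods, X = reconquest_interactions(observations, rate)
for obs, row in zip(observations, X):
    print(obs, row)
```

Rows dated before 1500 contain only zeros, so those periods act as the reference against which each \(\gamma_t\) is measured.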

As shown in Table 6, the panel specification including the interacted rate of Reconquest as well as time and province fixed effects renders a coefficient on rate of Reconquest that becomes negative and statistically significant from 1860 onward, around the time when Spain entered the industrialization phase (Pascual and Sudriá 2002; Rosés 2006).Footnote 35 The interaction terms for the periods prior to industrialization enter with a negative, though statistically insignificant, coefficient. These results suggest that the adverse effect of a fast Reconquest became more apparent when industrialization arrived. The same essentially holds for the panel specifications that add interactions of time dummies with soil quality, access to the Cantabrian Sea, a coal dummy and access to the Mediterranean Sea, respectively.

Table 6 The timing of the effect of the Reconquest: regression results

In Appendix M (Table A17) we also estimate specification (2) with data only covering the 1860–2005 period. By doing so, we do not mix in the same specification two different proxies for economic development such as density of urban population and GDP per capita. The analysis is conducted with both relative levels of GDP per capita and relative levels of industrial output per capita, as alternative measures of province-level relative economic development. In this specification the interaction term for 1860 is omitted, since it is taken as the reference period. The evidence appears in line with that obtained for the specification covering the full period.Footnote 36 Appendix O pursues this question further by taking into account that the exact timing of industrialization in Spain may be endogenous. The unreported evidence indicates that the negative effect of a fast rate of Reconquest became more pervasive when the opportunity to industrialize arrived.

7 Mechanisms at work

In Sect. 2, we argued that the rate of Reconquest was a crucial factor affecting the outcome of the repopulation process. A rapid rate is generally associated with imperfect colonization, with negative consequences for each region’s subsequent development. The rapid advance of the Christian frontier made the task of repopulation more difficult and demanding, which originated several problems, such as scarcity of settlers and resources, defense requirements for vast territories, and the governance of a large conquered Muslim population. What follows describes the potential channels that may help explain the effect of the Reconquest on current development, as well as the way they can be measured. We also discuss the consistency of each alternative explanation with the observed timing of the effect.

7.1 Structural inequality stemming from land inequality and political power concentration

Spanish historiography suggests that two key outcomes of the repopulation process were how land was distributed and who held political power. This constitutes our main hypothesis concerning the main channel through which the Reconquest affected current development, and the argument deserves to be further developed. The rate of Reconquest affected the possibility that either individual settlers or the nobility and military orders gained control over the newly conquered territories. As historically documented, a greater area to be repopulated increased the likelihood that nobles and military orders were called upon to participate in the repopulation and defense of such vast territories. Consequently, a rapid frontier expansion favored an initial political equilibrium biased toward the nobility, which led to the concentration of political power—in the form of jurisdictional rights—and economic power—in the form of land—in the hands of this social group.

The consequences of this unequal distribution of economic and political power were pervasive. Jurisdictional rights provided the landowning nobility with the legal and political apparatus that afforded them de jure political power over the broad mass of the population. This meant the landless peasantry became attached to the nobles’ lands, and the judiciary, the right of taxation and local council were controlled by the nobility. Likewise, the nobility could run de facto extractive institutions aimed at exploiting the peasantry through such mechanisms as severe restrictions on land and grain transactions, labor contracts with caps on agricultural wages, land tenure systems implying short-term leases whose conditions were reviewed annually, and the obligation to use the nobles’ mill to grind the grain. In this context, institutions of equal opportunity and property rights access for the agricultural proletariat of large estates—who were the majority in southern Spain—were completely absent (Brenan 1943; Domínguez-Ortiz 1955). This created a society characterized by a high level of social and political inequality.

This situation persisted over time, in a clear process of path dependence. It can be explained by several factors. First, the decline in population after the Christian conquest due to migrations, the expulsion of the Muslim population, and epidemics favored the establishment and consolidation of a type of extensive agriculture based on large estates (Malefakis 1970). Second, the landed nobility used their political power to illegally usurp lands and monopolize common lands (Cabrera Muñoz 1989). Third, such inefficient institutions as the creation of entailed estates protected by law (mayorazgos) and other regulations made land non-conveyable, and jurisdictional rights were hereditary. The liberal reforms of the nineteenth century derogated the legal apparatus of the Old Regime, but unlike in other countries like France, they failed to suppress nobles’ landownership and hence change the balance of power in society (García-Ormaechea 2002). Finally, the process of disentailment of communal and ecclesiastical landownership known as desamortización aggravated the pattern of land concentration in a few hands because land was bought up at very low prices by the rich, the bourgeoisie, and nobles (Brenan 1943; Herr 1974; Carrión 1975).Footnote 37 In Brenan’s words, “this is the class that since 1843 has held political power in Spain—a middle class not enriched by trade or industry but by the ownership of land” (Brenan 1943, p. 109).Footnote 38

As argued by Acemoglu et al. (2002), when a major shock like the spread of industrial technology occurs and the opportunity to industrialize arrives, the landed elite may not support investing in the new technology for fear of losing its political power. One reason is that potential entrepreneurs with productive ideas may not form part of the elite, and thus may feel that their property rights are not secure. Another is that the landed elite may block these investments if those who mostly benefit from them are not part of the elite, thus preventing any shift in the balance of power toward the emerging capitalist class.Footnote 39 In the case of Spain, particularly in large estates regions, the broad mass of the population was poor and no strong bourgeoisie arose, as the entrenched nobility and the middle class preferred to devote their capital to buying large land lots. As a result, the industrial revolution largely failed, and unlike in other countries like Britain (Doepke and Zilibotti 2008), the landed elite did not see its power curtailed and no significant shift in the balance of power occurred. In contrast, in those regions that had a more equal distribution of economic and political power, like the Basque Country and Catalonia, the arrival of the opportunity to industrialize clearly shifted the balance of power toward the emerging industrial bourgeoisie.

According to this line of reasoning, the presence of extractive institutions that do not provide equal opportunity and access to property rights for a broad cross-section of society became more important with the arrival of new technologies that required the economic participation of broad segments of the population, most of which were not part of the ruling elite. This appears to be the case with industrialization which, in order to succeed, requires the involvement of new entrepreneurs, innovators, and middle-class citizens.Footnote 40 Applied to the Spanish case, inequality in access to land (a key historical factor of production) and the associated structural inequality in access to economic opportunities (schooling, health care, access to credit, etc.) precluded large segments of the population in large estates provinces from participating in economic activity when Spain entered the industrialization phase.Footnote 41 This contributed to the failure of southern Spain to industrialize (Nadal 1997; Nadal et al. 1987). For these reasons, the role of land inequality and political power concentration as mechanisms for explaining the effect of the Reconquest on income appears fairly consistent with the possibility that this effect became apparent during industrialization.

One might wonder whether the proposed mechanism is based on a conflict between the landed elite and the masses (as in Engerman and Sokoloff 2002, and Acemoglu et al. 2002), or on a conflict between the landed nobility and the emerging industrial elite (as in Galor et al. 2009). We place more emphasis on the conflict between the landed nobility and the landless masses, which were excluded from participating in economic activity when the opportunity to industrialize arrived. Among others, Domenech (2012, 2015) provides evidence of rural conflict between the landed elite and the landless masses before the Spanish Civil War. This does not preclude the possibility of a conflict between the landed and industrial elites. However, for the large estates regions of Spain, we are skeptical about that possibility, since the industrial elite as a social group was very small. One of the reasons for this is that the middle classes preferred to buy disentailed land rather than invest in industry or in building the railway network. The implications of this predominance of the landed elite were pervasive. By denying the masses education and equal opportunity, the landowning nobility ensured an excess supply of agrarian labor and low wages, thereby preventing a rural exodus to the cities. In addition, the existence of a broad mass of impoverished landless workers, who lacked human capital and financial resources, was not conducive to the accumulation of capital or to the creation of an agricultural sector that could provide a strong market for industrial goods (Tortellá 2000). Even without a conflict between the landed nobility and an industrial elite, all these factors undermined the possibility of successful industrialization in large estates regions (Tedde de Lorca 1985).

One might also wonder why the presence of extractive institutions affecting the landless majority may not exert an adverse effect on economic activity even before industrialization, when an agrarian economic structure predominated. The reason is as follows. In an agricultural society (like preindustrial Spain) in which the main investment opportunities are in agriculture, economic and political inequality may not impair aggregate production. This is because “the elite can invest in land and employ the rest of the population, and so will have relatively good incentives to increase output” (Acemoglu et al. 2002, pp. 1272–1273). Along similar lines, Chaney and Hornbeck (2015) found that preindustrial Valencia had relatively high output per capita because fertility and mortality did not respond, owing to the extractive institutions imposed on the peasantry. Similar Malthusian dynamics are likely to apply to southern Spanish regions. In addition, in preindustrial times, other factors such as soil fertility or environmental suitability may have been more important for production.Footnote 42 In this sense, until industrialization, the higher land fertility of some of the large estates regions was sufficient to make them rank among the wealthiest in Spain.Footnote 43 In short, the adverse effect of extractive institutions on aggregate production may be inconsequential in an agrarian economy, but not in an industrial one. That is why the negative effect of the rate of Reconquest mostly emerges from 1860 onwards.

We employ several variables to account for the sources of structural inequality. We measure political power concentration of the nobility—and in turn the extractive institutions to which it gave rise—with an indicator from the 1787 population census: the percentage of population entities (núcleos de población) under seigneurial jurisdiction that includes both nobles and military orders.Footnote 44 Land inequality is measured through the percentage of landless workers over the total agricultural active population measured both in 1797 and 1956, which proxy for the concentration of land in the hands of the nobles. The class of landless laborers, which can be traced back to the fifteenth century, was a by-product of the nobility’s high concentration of land (Cabrera Muñoz 1989).Footnote 45 For robustness purposes, Appendix R also presents the results with two alternative measures of land concentration: the percentage of arable land in holdings greater than 200 hectares in 1962, and a Gini index of land concentration in 1972.
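As an illustration only, the two kinds of land-inequality indicators described above can be sketched in a few lines of Python. The function names `landless_share` and `gini` and the toy inputs are hypothetical; in particular, the text does not state which Gini formula underlies the 1972 index, so the standard rank-based formula for sorted data is an assumption.

```python
import numpy as np

def landless_share(landless_workers, agricultural_active_population):
    """Percentage of landless workers over the agricultural active population."""
    return 100.0 * landless_workers / agricultural_active_population

def gini(holdings):
    """Gini index of a vector of non-negative holding sizes.

    Uses the rank-based formula on sorted data (an assumption; the
    source does not say how its 1972 index was computed).
    """
    x = np.sort(np.asarray(holdings, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        raise ValueError("need at least one holding with positive size")
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n

# Toy inputs (hypothetical, not data from the paper)
equal_province = [5, 5, 5, 5]        # four holdings of equal size
latifundia_province = [0, 0, 0, 10]  # all land in a single holding
```

With equal holdings the index is zero, and when one holding out of n owns all the land it equals (n - 1)/n, so larger values correspond to the concentrated landownership pattern discussed in the text.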

7.2 Other potential intervening factors

The rate of Reconquest could also affect other factors of relevance to economic development. A first candidate is the extent to which the preexisting Muslim population was respected and integrated into the Christian kingdoms. A rapid frontier expansion made it difficult to govern and integrate this population, as became apparent with the great mudejar revolt of 1264, which led to the expulsion of the Muslim population from the Guadalquivir Valley. In addition to creating problems of labor scarcity, the fate of the Muslim population had important implications due to their higher human capital, particularly concerning the level of agricultural technology.Footnote 46 Moreover, the degree of assimilation of the Muslim population could also have cultural implications. Indeed, Chaney and Hornbeck (2015) document differences between Christians and Muslims in their preference for child quality vs. quantity (Galor and Moav 2002), as well as in fertility and mortality (Galor and Weil 1996). To measure this factor, the best we can do is use an indicator of the proportion of Moorish ancestry in the current population of each province. Using an admixture approach based on binary and Y-STR haplotypes, Adams et al. (2008) were able to identify the genetic differentiation of the population of the Iberian Peninsula and the Balearic Islands, finding a relatively high mean proportion of ancestry from North Africa (10.6 %). As opposed to the common expectation that a South-North gradient of North-African ancestry is followed, it is worth noting that the highest proportions of Moorish ancestry (greater than 20 %) are found in Galicia and Northwest Castile, which contrast with the much lower proportions in Andalusia.Footnote 47

A second potential channel through which the Reconquest might affect current development is the traditional family type distribution. Tur-Prats (2015) finds that those areas featuring traditional stem families, in which one son inherits all the land and remains in the parental home with his wife to continue the family line, are associated with lower intimate-partner violence (IPV) and greater gender equality. This contrasts with the higher IPV found in those areas where nuclear families, in which all children receive an equal share of the inheritance and leave the parental home to form independent households, are more prevalent. According to Tur-Prats (2015), stem families were dominant in the North because the early stages of the Reconquest gave rise to small and medium-size landholdings, which were preserved by free families through indivisible inheritance. However, as the Reconquest advanced further South, military orders and the nobility were awarded vast tracts of land, and the landless peasantry had no choice but to comply with the equal inheritance rules mandated by Castilian Law, thus giving rise to nuclear families. Therefore, the traditional family type mechanism may be confounded with those related to the concentration of political power in the hands of the nobility or even to the extent of land inequality. We investigate the validity of this channel by measuring the historical distribution of family types through the average number of married and widowed women per household at province level from the 1860 census, as in Tur-Prats (2015).

A third possible mechanism that may affect current levels of development is the degree of market fragmentation. Grafe (2012) points to the exceptionally high degree of market fragmentation observed in Spain over the seventeenth and eighteenth centuries as the main obstacle to economic development. In addition, market fragmentation could be the consequence—at least in part—of accelerated colonization by, for instance, making it more difficult to maintain the pre-existing infrastructure network. We measure differences in the degree of market fragmentation across provinces by constructing an indicator of road density in 1760 at provincial level, with higher road density implying less fragmented markets. This indicator can also be used to test for possible differences in government investment in infrastructure across provinces.

One might also assume that the Reconquest generated historical differences both in the political power of the Church and in religiosity across provinces, which might have had some effect on current development. To control for this factor, we employ two indicators measured at the end of the eighteenth century: the percentage of population entities under Church jurisdiction, and the percentage of population that was a member of the clergy (both secular and regular). A related factor is the role played by the Inquisition, which was charged with preserving Catholic orthodoxy. Vidal Robert (2014) shows that inquisitorial activity is negatively associated both with urbanization rates at regional level and population growth at municipal level. However, a lack of consistent data for constructing an indicator for the majority of the Spanish provinces has prevented us from empirically assessing the role of the Inquisition in mediating the effect of the Reconquest.

Another mechanism that remains uncontrolled involves interregional migration, which is historically hard to measure. However, there may be reasons why people do not move between regions to arbitrage away the existing differences in economic development. One simple explanation may be found in Gennaioli et al. (2013, 2014), who develop a model in which frictions related to the limited supply of land and housing prevent people from completely arbitraging away differences in income. Besides, migration would work against our identification strategy: if income differences had been swept away by interregional migration, we would no longer find an effect on current income differences.

Finally, the rapid advance of the Christian frontier gave rise to sparsely populated territories due to a lack of manpower and settlers, which was aggravated by the eventual expulsion of the conquered population. However, strictly speaking, population density cannot be considered a channel to the extent that in a Malthusian regime it is strongly correlated with output per capita. Indeed, Chaney and Hornbeck (2015) provide evidence that early modern Spain was subject to Malthusian dynamics after the expulsion of the Moriscos in 1609. Labor-scarce areas also gave rise to the creation of latifundia and to shifts from grain to cash-crop cultivation. An additional empirical problem is that it is impossible to distinguish which part of the effect of population density on current development works through political power concentration or the creation of large estates, and which part works through other mechanisms such as agglomeration economies or technological progress à la Boserup.

The consistency between these alternative potential mechanisms and the observed timing of the effect of the Reconquest is theoretically less compelling than the case of the channel of structural inequality. Indeed, if the lack of agglomeration economies due to low population density, human capital depreciation derived from the expulsion of the Muslim population, market fragmentation, and differences in religiosity were relevant factors explaining the effect of the Reconquest, the timing of the effect should have been much earlier, instead of much later during industrialization.

7.3 Empirical analysis

Although the timing of the effect of the Reconquest provides some clues about the empirical validity of the proposed channels, we next analyze this question more systematically. For a variable to be a candidate channel, it needs to be correlated not only with the rate of Reconquest, but also with log GDP per capita. In addition, the effect of the rate of Reconquest needs to work via that particular channel. This is implemented through a 2SLS analysis that uses the rate of Reconquest to predict the channel variable in the first stage, and then regresses log GDP per capita in 2005 on the predicted channel variable, controlling for the baseline control set in both stages. The first and second stages are presented in Panels B and A of Table 7, respectively. Panel C reports the OLS regression of GDP per capita on the channel variable, which enables us to determine whether the selected channels have large explanatory power for current output levels, as was the case with the rate of Reconquest in the reduced-form estimations. It should be pointed out that, strictly speaking, this 2SLS analysis does not represent an instrumental variables estimation.
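The two-stage procedure just described can be sketched as follows. This is a minimal illustration on synthetic data, not the estimation code behind Table 7: the function names `ols` and `two_stage`, the sample size, the two stand-in controls, and all coefficient values are assumptions made for the example.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X (X must already contain a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def two_stage(y, channel, rate, controls):
    """First stage: channel on rate + controls.
    Second stage: y on the predicted channel + the same controls.
    Returns the first-stage coefficient on `rate` and the
    second-stage coefficient on the predicted channel."""
    const = np.ones((len(y), 1))
    Z = np.column_stack([const, rate, controls])
    gamma = ols(channel, Z)
    channel_hat = Z @ gamma
    X2 = np.column_stack([const, channel_hat, controls])
    beta = ols(y, X2)
    return gamma[1], beta[1]

# Synthetic data (hypothetical; no values are taken from the paper)
rng = np.random.default_rng(0)
n = 500
controls = rng.normal(size=(n, 2))   # stand-in for the baseline control set
rate = rng.normal(size=n)            # stand-in for the rate of Reconquest
channel = (0.8 * rate + controls @ np.array([0.3, -0.2])
           + rng.normal(scale=0.5, size=n))
log_gdp = (-0.5 * channel + controls @ np.array([0.1, 0.4])
           + rng.normal(scale=0.5, size=n))

first_stage, second_stage = two_stage(log_gdp, channel, rate, controls)

# Reduced form: outcome regressed directly on rate + controls
Z_full = np.column_stack([np.ones(n), rate, controls])
reduced_form = ols(log_gdp, Z_full)[1]
```

With a single channel and a single predictor, the reduced-form coefficient equals the product of the first-stage and second-stage coefficients, which is one way to read the three sets of estimates together. Standard errors from a manual second stage like this one are not valid as they stand, so a dedicated 2SLS routine would be needed for inference.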

As shown in Panel B, rate of Reconquest is positively correlated at conventional significance levels with the sources of structural inequality: land inequality as measured by the percentage of landless workers in 1797 and 1956, and the concentration of political power in the hands of the nobility as measured by noble jurisdictions in 1787. This is consistent with the fact that the faster a territory was reconquered, the more likely it was that the nobility was granted large estates and jurisdictional rights. Besides this channel, there is also evidence that a greater rate of Reconquest is significantly associated with a lower prevalence of population entities under the jurisdiction of the Church. This is because the concentration of economic and political power did not move hand in hand for the Church and the nobility. As widely documented in Spanish historiography, the clergy was important during the first two centuries of the Reconquest, whereas in the later stages of the Reconquest this power shifted to the nobility and military orders. This explains why the contribution of the Church to the repopulation of southern Spain was marginal compared to that of the other powerful groups. The reason for this must be sought in the opposition of the nobility to the acquisition of jurisdictional rights by the Church, because of the greater involvement of the former in the occupation and defense of frontier lands (Artola et al. 1978).

The second stage in Panel A shows that higher land inequality and a more unequal distribution of jurisdictional rights in the hands of the nobility are associated with lower current development. In addition, church jurisdiction is positively correlated with current GDP, which might be explained by the positive impact the Church may have had on the early spread of literacy. However, when we regress the literacy rate in 1860 on the percentage of population entities under church jurisdiction, after controlling for our baseline control set, there is no evidence to support the existence of a statistically significant positive link between both variables.

If we add to this the fact that i) there is no statistically significant relationship between church jurisdiction and GDP per capita in 2005 in the OLS regressions in Panel C,Footnote 48 and ii) the other religiosity indicator (percentage of population that was a member of the clergy) does not enter significantly in any of the estimation stages, we can to some extent rule out the empirical validity of the religiosity channel. Table 7 also provides consistent evidence across both estimation stages that other channels such as stem family prevalence, Moorish ancestry or historical road density are statistically insignificant. With the evidence at hand, this suggests that the traditional family type, the degree of integration of the Muslim population or their higher human capital concerning the level of agricultural technology, and market fragmentation are not relevant mechanisms explaining the long-term economic consequences of the Reconquest. In contrast, structural inequality, caused by high inequality in the access to a historical production factor like land and a high concentration of political power in the hands of the landowning nobility, appears to be the dominant channel through which the Reconquest affected current development.

Table 7 Mechanisms at work
Table 8 Outcomes indicators in the 1860s

7.4 Outcome indicators at the onset of industrialization

The evidence presented in this section largely supports the view that structural inequality plays a central role in explaining the Reconquest’s effect and why it became apparent during the era of industrialization. Table 8 provides additional evidence consistent with this hypothesis by focusing on the decisive moment in which Spain began industrializing. We expect that some of the fundamentals of modern economic growth needed for industrialization to succeed had already been undermined at the onset of the industrialization period. This is because factors such as deficient education and health care precluded the broad majority of the population from participating in economic activity in those regions with an unequal distribution of land and political power.

Our dependent variables are a number of factors that are relevant for economic growth, all measured in the 1860s. They are two indicators related to education (literacy rate and school enrollment), two related to health (infant mortality and life expectancy), two associated with political participation (percentage of electors and voters), and two related to social conflict (criminality and convicts). We expect the rate of Reconquest, working through structural inequality, to lead to lower human capital (negatively affecting education and health), lower political participation, and higher social conflict.Footnote 49 This is precisely what we observe in Panel A of Table 8, which presents a 2SLS analysis that traces the effect of the rate of Reconquest on outcomes in 1860 through the channel of structural inequality, measured via our preferred indicator, the percentage of landless workers in 1797. Similar results are obtained with the OLS estimates of the reduced-form effect of the rate of Reconquest on outcomes in 1860 (Panel B of Table 8).Footnote 50 All in all, the evidence provided in Table 8 indicates that around 1860 historically rooted inequality had already created the conditions for the subsequent failure to industrialize.

8 Conclusions

The legacy of history appears particularly pervasive in the case of Spain. This paper shows the Reconquest in the Middle Ages to have been a major historical process shaping the distribution of regional income. The rate of Reconquest, which captures the magnitude of the colonization effort required in the period when each of today’s provinces was conquered by the Christians, has a robust and strong negative effect on current income. Our results are robust to the inclusion of historical controls and a wide array of climatic, geographic and natural resource endowments that account for simple and sophisticated versions of the geography hypothesis. Of particular interest is the lack of a significant effect due to differences in land suitability for plantation crops featuring economies of scale in production. Moreover, the effect of the rate of Reconquest survives the inclusion of latitude, log distances from key industrial centers, and several other methods to deal with the North-South gradient issue. The results also remain unaltered when employing several alternative indicators of the Reconquest. A municipality-level analysis that includes province-level fixed effects also provides evidence supporting the existence of a negative effect of the rate of Reconquest on economic development. In addition, a number of falsification tests indicate that the rate of Reconquest is not associated with indicators of pre-Reconquest economic development.

We argue that a rapid rate of Reconquest led to imperfect colonization, mainly characterized by a high concentration of power in a few hands. The evidence supports the view that a fast frontier expansion favored a political equilibrium biased toward the military elite (i.e., the nobility), which generated a high concentration of economic and political power, thus creating the conditions that led to the exclusion of large segments of the population from participating in the economic opportunities that opened up with the arrival of industrialization. The result was that provinces featuring an unequal distribution of economic and political power fell behind during the industrialization period. Thus, the Reconquest set in motion processes that generated persistent inequality, constituting a severe impediment to the requirements for modern economic growth, which is based on entrepreneurship, innovation, and the participation in economic activity of broad segments of the population.

Our results contribute to the novel literature on the political-economic effects of frontier expansions in that the existence of a large frontier that needs to be occupied and defended from the enemy may lead to a shift in the balance of power toward dominant groups, which may create the conditions for an inegalitarian society, with negative consequences for long-term development. This study of the Spanish Reconquest is also appealing from the point of view of the literature on colonialism, because it gives clues about the colonization of the New World. When Spain colonized Central and South America in the sixteenth century, it had the long experience gathered in the Reconquest. The policy of distributing economic power in the form of large estates, as well as of political power in the form of feudal rights, as applied in Spain since the mid-eleventh century (becoming widespread as of the thirteenth century) is a foretaste of what would later be implemented in the New World.

Finally, a question that deserves further research is why the effect of the Reconquest resulting from the pattern of colonization of the conquered lands is so persistent, even though today some sources of this problem are no longer present. The early obstruction of industrialization may have long-lasting consequences. Historical, economic, and political inequality may have affected the initial paths of industrialization and development and, once launched, different economic forces (e.g., increasing returns) reproduce the initial divergence. In addition, many social and cultural patterns developed in the past due to a high concentration of economic and political power may still persist today.