Why Look at Trade?

International trade can be considered a “biased” iceberg that stands out from the national economy and extends into foreign countries. As a topic in economic history, it has spawned a huge literature and with good reason. Adam Smith (1776) argued that trade would increase the “extent of the market,” allowing for increased specialization and economic growth. David Ricardo (1817), inspired by the Methuen Treaty between Portugal and Britain which caused some specialization in port wine in the former and textiles in the latter, developed the concept of comparative advantage. He demonstrated, using the first mathematical model in economic theory, that since the opportunity costs of producing a good will differ in different countries, they can gain by trading and specializing according to their comparative advantages. Based on the trading patterns of the nineteenth century, which we will examine in more detail in the next section, Heckscher (1919) and Ohlin (1933) elaborated on the concept of comparative advantage, arguing that it was based on the relative endowments of different factors of production. More recently, new trade theory, particularly associated with the work of Paul Krugman (1979), has demonstrated how modern trade leads to trade in similar but differentiated goods, which is a gain for consumers, who have a love of diversity. Lastly, inasmuch as openness to trade leads to the spread of knowledge between countries, it can also lead to permanent gains in the growth of economies, rather than the one-off gain from the exploitation of comparative advantages through a movement from autarky to free trade.
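Ricardo's argument can be illustrated with his own textbook numbers for the labor needed to produce one unit of cloth and wine in England and Portugal. The following is only a minimal sketch: the function names are ours, and the figures are the standard illustration rather than historical data. Portugal needs less labor for both goods, yet each country has the lower opportunity cost in one of them.

```python
# Labor needed to produce one unit of each good (Ricardo's classic illustration).
labor_cost = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}


def opportunity_cost(country, good, other):
    """Units of `other` forgone to produce one more unit of `good`."""
    costs = labor_cost[country]
    return costs[good] / costs[other]


def comparative_advantage(good, other):
    """Country with the lowest opportunity cost of `good` in terms of `other`."""
    return min(labor_cost, key=lambda c: opportunity_cost(c, good, other))


for good, other in (("cloth", "wine"), ("wine", "cloth")):
    print(good, "->", comparative_advantage(good, other))
```

England gives up about 0.83 units of wine per unit of cloth against 1.125 in Portugal, so England specializes in cloth and Portugal in wine, even though Portugal is absolutely more productive in both.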

Economic history can also allow us to nuance the work of economic theorists, however. It has been pointed out that the UK and the USA both developed under protectionist regimes, and similar points have been made more recently on the emergence of the so-called tiger economies of Southeast Asia. Thus, even the father of the Washington Consensus, John Williamson (1990b), concluded that one exception from the general rule that free trade is always best is infant industry protection, whereby emerging industries are offered temporary protection so that they can enjoy the so-called dynamic comparative advantages which are not available at the initial stages of production. If these industries then allow for greater productivity growth than traditional sectors and they have spillover effects on the rest of the economy, then such temporary protection should increase incomes in the long run.

Thus, while no sensible economic theory offers the conclusion that autarky is preferable to an open economy, there are studies that argue for potentially positive outcomes from selective temporary protection of specific sectors under the “infant industry” and similar arguments (see Rodríguez and Rodrik 2000, pp. 267–272; O’Rourke 2000 for overviews). Such arguments highlight that specialization in the production of “non-dynamic” (e.g., agricultural) commodities can, despite yielding static welfare maximization, lead to a lack of development possibilities. Widening the domestic industrial base and aiding the self-discovery of nontraditional productive activities can lead to the evolution of new, more dynamic comparative advantages, which under direct world market pressure could not be effectively developed. If the resulting economic activities lead to higher economic growth and domestic knowledge development with concurrent spillovers – in the tradition of “new” endogenous growth theory – then temporary protection would be justified for the sake of long-term growth and development. However, as has already been highlighted above, foreign trade can also be a channel for knowledge transfer, so trade barriers would also restrict access to the world technology pool and thereby retard domestic productivity growth. Successful “infant industry” protection would therefore require both a wider growth-promoting macroeconomic environment and minimization of trade policy distortions.

Economic theory has thus been shaped by historical developments, and trade has been central to the development of economies over time and space and is thus a worthy focus of the efforts of cliometricians. In the following, we have surveyed papers from 2008 to early 2014, plus older papers which were particularly relevant, although this is by no means a comprehensive study, and we rely on existing surveys where possible.

In relation to the cliometrics of international trade, we start by assessing the consequences of trade, which, according to standard theory, is directly related to understanding the sources of trade, since the standard textbook comparison of “autarky” and “free trade integration” predicts that adjustments in welfare, productive activity, factor remunerations, etc., will reflect these underlying sources. Hence, there is space for studies trying to assess the effect of trade, besides other “domestic” factors, on economic performance, as well as indirectly through the determinants of the latter (technological progress, technology transfer, institutions, and politics) and changes in the former (such as capital accumulation, natural population growth, and relative remunerations of factors of production), apart from the interplay between factor movements (foreign investments, migration) and trade. In the following, we provide a relatively concise survey focused more on methodology than on findings, since a recent chapter by Meissner (2014) in the Handbook of Economic Growth provides a comprehensive treatment of “Growth from Globalization.”

Turning to the big questions of the effects of international integration, two large questions stand out, to which economic historians have provided quantitative answers for the late nineteenth and early twentieth centuries: Does trade cause economic growth? And were trade and factor mobility substitutes or complements?

Regarding the first question, Irwin and Terviö (2002) use an identification strategy developed by Frankel and Romer (1999) to evaluate the impact of trade openness on growth net of the trade-enhancing effect of economic growth. This method consists of using standard gravity variables (distance, population, area, border, landlocked – see below) in a first stage to create “exogenous” trade shares (aggregating bilateral trade per country) to be regressed onto income levels. Irwin and Terviö find that the coefficient for the trade share in their second-stage regressions for 1913, 1928, and 1938 is always positive but significant in only a few regressions, which might in part be due to small samples of 23–41 observations.
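The two-step logic of this strategy can be sketched in a few lines. The sketch below is a heavily simplified illustration under our own assumptions: all data are invented, the first stage uses distance only rather than the full set of gravity variables, and the second stage is the reduced form (income on the constructed share), whereas Frankel and Romer use the constructed share as an instrument for the actual trade share.

```python
import math


def ols(x, y):
    """Slope and intercept of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical toy data: (country, partner) -> (distance in km, bilateral trade / GDP).
bilateral = {
    ("A", "B"): (500, 0.080), ("A", "C"): (4000, 0.012), ("A", "D"): (9000, 0.004),
    ("B", "A"): (500, 0.090), ("B", "C"): (3800, 0.015), ("B", "D"): (8700, 0.006),
    ("C", "A"): (4000, 0.010), ("C", "B"): (3800, 0.011), ("C", "D"): (5000, 0.007),
    ("D", "A"): (9000, 0.003), ("D", "B"): (8700, 0.004), ("D", "C"): (5000, 0.006),
}
log_income = {"A": 8.4, "B": 8.5, "C": 7.6, "D": 7.2}  # hypothetical log GDP per capita

# Stage 1: gravity regression of log bilateral trade share on log distance.
pairs = list(bilateral)
log_dist = [math.log(bilateral[p][0]) for p in pairs]
log_share = [math.log(bilateral[p][1]) for p in pairs]
beta, alpha = ols(log_dist, log_share)

# Aggregate fitted bilateral shares into a geography-based trade share per country.
constructed = {c: 0.0 for c in log_income}
for (country, _partner), d in zip(pairs, log_dist):
    constructed[country] += math.exp(alpha + beta * d)

# Stage 2 (reduced form): regress log income on the constructed trade share.
countries = sorted(log_income)
gamma, _ = ols([constructed[c] for c in countries], [log_income[c] for c in countries])

print("distance elasticity:", round(beta, 2))
print("income effect of constructed trade share:", round(gamma, 2))
```

Because the constructed share depends only on geography, it cannot be driven by income itself, which is the point of the identification strategy. With only four invented countries the coefficients carry no empirical meaning.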

As for the second question, Collins et al. (1999) find that between 1870 and 1940, it is difficult to assess whether trade, capital flows, and international migration were substitutes or complements; although they quite clearly reject that trade and labor mobility were substitutes, for capital flows the findings are more ambiguous between complementarity and substitutability. They also highlight that both trade and migration policy might have influenced the actual historical outcomes. Both papers thus hint at history being richer and more complicated than standard theory might predict.

However, the Heckscher–Ohlin framework of relative factor prices and factor price convergence as a consequence of commodity market integration (see below) to explain the nineteenth-century globalization was behind the hugely successful research program leading to O’Rourke and Williamson’s (1999) seminal monograph on Globalization and History. The underlying papers (O’Rourke and Williamson 1994, 1995, 1997; O’Rourke et al. 1997; O’Rourke 1997) have shown that commodity market integration went along with factor price equalization, especially regarding the ratio of wages to land rents, which increased in labor-abundant, land-scarce Europe but decreased in the land-abundant, labor-scarce New World, thanks to international migration, trade, and investments. Despite some criticism, for example, of the underlying data and interpretation of the Swedish case (Bohlin and Larsson 2007; Prado 2010), this account has become the standard reference in research and teaching of the nineteenth-century globalization.

Another central line of research focuses on the evolution of the early modern Atlantic economy, in which trade was not necessarily positive for welfare and development: Nunn (2008; see also Nunn and Puga 2012) finds that the slave trade had a clearly negative effect on the economic performance of the African regions that were most affected, not so much due to classical “direct” allocation effects, but through the indirect impact via two not necessarily exclusive channels: boosting ethnic fragmentation and debilitating state capacity formation. This, of course, hints at the interplay between trade and domestic institutions and politics, a central topic in recent empirical growth economics. Acemoglu, Johnson, and Robinson (2005) find that, in Western Europe, the “central corner” of the Atlantic triangle, related trade was not large enough to directly boost economic growth significantly via capital accumulation or static gains from trade, but it increased the weight of merchants in political processes and thereby helped to tilt the political equilibrium towards institutional arrangements that favored trade and eventually economic growth via North and Thomas’ (1973, p. 1) “efficient economic organization” via property rights and related “inclusive institutions.”

This literature adds new layers onto an older literature regarding the role of trade in the “Great Divergence” with the “Rise of Western Europe,” on the one hand, and African, Asian, and Latin American “backwardnesses” on the other. The relatively small importance of this trade on the European side has been highlighted by O’Brien (1982) and is mirrored in Acemoglu, Johnson, and Robinson (2005), the O’Rourke and Williamson (2002b) assessment of the sources of early modern trade growth, as well as most recent discussions of the sources of the British Industrial Revolution. The latter often discard an important initial role for trade (Harley 2004; Mokyr 2009; McCloskey 2010), despite updated accounts on the volume and the working of the triangular trade (Inikori 2002) as well as evidence of links with welfare and economic activity in selected British ports (Draper 2008 for London shipbuilding, Richardson 2005 for Bristol), and for inventions and productivity in certain industries (Zahedieh 2013 for the British copper industry).

In an attempt to quantify the possible welfare losses for Britain from significantly reduced access to international markets, Clark, O’Rourke, and Taylor (2014) show in the context of a static standard computable general equilibrium model that relatively small welfare losses of 3–4 % would have occurred in 1760, while increasing dependency on foreign trade, especially by the rapidly growing textile industry, would have implied substantial static welfare losses of 25–30 % in 1850 by reducing access to foreign endowments and markets substantially. Beyond highlighting the importance of trade for the deployment of the industrial revolution, Allen (2003, 2011) has highlighted that the centrality of Britain in early modern international trade bore an important direct responsibility for the development of energy-intensive, labor-saving innovations that became a central feature of the industrial revolution, by raising real wages and making labor relatively expensive in Britain.

It is not only cliometricians of the British Industrial Revolution who have worked on the causal link between trade and economic performance and the role of international supply and demand versus domestic forces. A variety of studies with different approaches have emerged for mostly “peripheral” players in the emerging international economy of the nineteenth and early twentieth centuries. For Italy, Pistoresi and Rinaldi (2012) use cointegration analysis to assess (Granger-)causal relationships between imports, exports, and GDP. Bajo Rubio (2012) and Guerrero de Lizardi (2006) have conducted similar analyses, explicitly testing for the existence of balance of payments constraints to economic growth in Spain and Mexico respectively, that is, structural limitations to conduct necessary imports for balanced economic growth. Other studies using cointegration analysis of the effect of trade on domestic economic activity include Greasley and Oxley (2009) on the pastoral boom in New Zealand after the invention of refrigerated long-distance transport and Boshoff and Fourie (2010) on the importance of both provisioning for ship traffic around the Cape of Good Hope and travellers stopping there during their journey to the East Indies, an early form of tourism, for agricultural activity in the Cape Colony. Somewhat connected to Allen’s argument, Huff and Angeles (2011) show that globalization had a causal impact on urbanization in Southeast Asia prior to World War I, without leading to industrialization, simply by increasing demand from industrializing markets in the center of the world economy, fomenting commercial production and infrastructure investments, and accompanying overhead services in administrative and commercial centers.

Other authors have used different versions of input–output analysis to assess the relative importance of foreign versus domestic demand and supply forces in structural models: Bohlin (2007) looks at Sweden before World War I, Kauppila (2009) at Finland during the Great Depression, and Taylor, Basu, and McLean (2011) show, using Leontief’s original 1947 input–output table (Leontief 1953), that (mostly US financed) exports to Europe in the immediate postwar years (1946–1948) helped to avoid increasing US unemployment during the reconversion from a war-oriented to a civilian economy. Ljungberg and Schön’s (2013) comparative assessment of the drivers of industrialization in the Nordic countries shares a similar analytic framework but uses shift-share analysis.

Returning to internationally comparative studies and channels between trade and economic growth, Liu and Meissner (2013) derive a new, theoretically consistent measure of market potential and assess whether differences in domestic and foreign markets contribute to explain productivity differentials between the USA and the other countries on the eve of World War I. They find that productivity/GDP per capita is significantly related to market access but that its substantive significance vis-à-vis other factors is relatively minor. Madsen (2007) has shown that bilateral trade was a decisive channel for technology transfer and hence total factor productivity (TFP) growth and convergence for current OECD countries over the 135 years from 1870 to 2004, thereby extending findings by Coe and Helpman (1995) beyond recent periods. López-Córdova and Meissner (2008) examine the link between trade and democracy, and Huberman and Meissner (2010) show that bilateral trade was a diffusion channel especially for the adoption of basic labor protection legislation, such as factory inspection and minimum work ages for children. Vizcarra (2009) demonstrates how the Peruvian guano boom helped the country to return to international capital markets despite domestic political instability and a history of defaults. This finding seems to suggest that at least some forms of trade, controlled by foreign customers and investors, can be substitutes for “real” political and institutional reforms, a recurrent theme in the literature on modern commodity booms and the “resource curse” in developing countries.

In this context one final strand of literature, related to specialization resulting from international trade, merits attention: the debate on the role of the specialization in primary commodities for the growth prospects of developing countries. This topic, promoted in economic history by Jeffrey Williamson and coauthors, for example, in his 2011 book on Trade and Poverty (Williamson 2011), has three strands: first, the original Prebisch–Singer finding of falling secular terms of trade for primary commodities that structurally harm the purchasing power of primary producers. Here, one recent comprehensive article by Harvey et al. (2010) underlines that over the last four centuries, for 11 out of 25 commodities studied, relative price trends were significantly negative, while for none was a significantly positive trend found, underlining that Prebisch–Singer forces are at work (Prebisch 1950; Singer 1950). A second strand focuses on deindustrialization and losses of dynamic development possibilities resulting from such specialization via Dutch disease forces or because of forces modelled, e.g., in Matsuyama (1992) and the infant industry literature. Hadass and Williamson (2003) and Williamson (2008) offer a comprehensive assessment of the effect of terms of trade on economic performance before World War I. Third, recent literature has highlighted that more than long-run trends in relative prices, the higher volatility of prices for primary products versus manufactures has harmed economic performance and investment, etc., in developing countries (Blattman et al. 2007; Williamson 2008; Jacks et al. 2011b). Country studies, conducted by Williamson and coauthors (e.g., Dobado González et al. 2008 for Mexico; Clingingsmith and Williamson 2008 for India; Pamuk and Williamson 2011 for the Ottoman Empire) and others (Federico and Vasta 2012 for Italy; Beatty 2000 for Mexico), serve to complement these comparative-econometric findings with historical case studies of channels, mechanisms, and their importance relative to domestic forces.
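The simplest way to look for a Prebisch–Singer trend is to regress the logarithm of a relative commodity price on a time index, so that the slope is the average growth rate per period. The sketch below does only this, on an invented series; it is not the procedure of Harvey et al. (2010), and the function name and data are our own assumptions. Published studies also have to deal with structural breaks and non-stationarity, which are ignored here.

```python
import math


def log_linear_trend(series):
    """OLS slope of log(series) on a time index: average growth rate per period."""
    logs = [math.log(v) for v in series]
    n = len(logs)
    t_mean = (n - 1) / 2
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return sxy / sxx


# Hypothetical relative price of a primary commodity, falling by 1% per year.
relative_price = [100 * 0.99 ** t for t in range(50)]
trend = log_linear_trend(relative_price)
print(f"estimated trend: {100 * trend:.2f}% per period")
```

A negative and statistically significant slope would be read as evidence of secular decline; the sketch reports only the point estimate and no standard error.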

The impact of trade policy (to which we return in the last section) on the economy has been investigated in different frameworks. The first, already mentioned above and discussed in more detail below, is the gravity equation and the question whether tariffs and other trade policy components affect (reduce or divert) imports or exports. In a similar vein, researchers have asked if trade policy affects relative prices and factor incomes and, as exemplified in O’Rourke’s (1997) study of the grain invasion, have found that this is normally the case. These findings imply that trade restriction via trade policy normally works, although trade policy might not translate 1:1 into the desired effects due to varying elasticities of demand and substitution between international and import-competing goods, both on the side of domestic suppliers and the preferences of domestic consumers.

In economic history, several studies since the seminal and controversial contribution of Bairoch (1972) have run growth regressions to estimate the impact of “average tariffs” on growth. The main finding is that of a “tariff–growth paradox” following the widely cited article by O’Rourke (2000) and subsequent papers by Vamvakidis (2002), Clemens and Williamson (2004), and Jacks (2006b). The robustness of these findings has been challenged by results with different methodologies and samples, including Foreman-Peck (1995), Irwin (2002), Athukorala and Chand (2007), Madsen (2009), Tena-Junguito (2010a), Schularick and Solomou (2011), and Lampe and Sharp (2013). Recent research has moved towards a clearer identification of the underlying channels of an existing or nonexisting tariff–growth paradox: Lehmann and O’Rourke (2011) find that before 1914, tariffs on manufactured goods were growth enhancing, while tariffs on agricultural commodities were probably harmful, and revenue tariffs on luxury goods and “exotic” products had no effect on growth. Tena-Junguito (2010a) finds that the skill bias of tariffs, one of the measures developed to assess not the average level, but the structure of tariffs, is significantly related to growth before 1914. Lampe and Sharp (2013) have highlighted that the other side of a potential reverse causality circle is also of interest, since in many countries tariff liberalization was preceded (and “Granger caused”) by higher-income levels, presumably due to their effect on increased fiscal capacity to generate non-customs revenues (see, e.g., Aidt and Jensen 2009).

On a country level, Athukorala and Chand (2007) have studied the tariff–growth relationship for Australia over more than 100 years. Broadberry and Crafts (2010) have surveyed the interplay between trade openness, labor productivity, and structural change in Britain since 1870. Ploeckl (2013) shows that Baden’s adhesion to the German Zollverein in 1836 had “traditional” effects on economic performance via increased market access but also led to the investment of Swiss entrepreneurs in Baden due to the higher external tariff Swiss exports faced towards the new customs area. Kauppila (2008) has studied the impact of tariffs on industrial activity and prices in interwar Finland. Tirado, Pons, Paluzie, and Martínez-Galarraga (2013) combine new economic geography and an assessment of tariffs in their study of the effect of a gradual closing of the Spanish economy between 1914 and 1930 on the evolution of the regional wage structure. In the case of Spain, the post-Civil War (1936–1939) dictatorship under Generalísimo Franco is an especially interesting field of study, since it tried to run the country on an autarky basis. The macroeconomic consequences of this and the stepwise reforms during the 1950s have been ingeniously investigated by Prados de la Escosura, Rosés, and Sanz-Villaroya (2012); Martínez Ruiz (2008) has studied the impact of autarky policy on industrial efficiency (in 1958) via the domestic resource cost (DRC) indicator; and Deu and Llonch (2013) focus on the technological backwardness of the Spanish textile industry as a consequence of closed channels for embodied technology transfer. A related topic is import-substituting industrialization (ISI) in Latin America, whose strategies and results have been systematically investigated in Taylor (1998). Debowicz and Segal (2014) shed new light on the role of ISI for structural change and industrialization in a dynamic computable general equilibrium model for Argentina.

Finally, a few studies have used cliometric methods to study the effect of specific tariffs on the emergence of individual industries. The classical studies in this case are Head’s (1994) study of the protection of US steel rails and Irwin’s (2000) assessment of the US tinplate industry, which contrary to other iron and steel products faced a rather low tariff due to a misplaced comma in the 1864 tariff law. More recently, Inwood and Keay (2013) have studied the role of trade policy in modernizing and expanding the Canadian iron and steel industry in a comprehensive design including a novel identification strategy. Finally, Henriksen, Lampe, and Sharp (2012) demonstrate the relevance of the cheese tariff for the profitability of the Danish dairy industry before its eventual takeoff after 1880.

Having established the importance of trade in an historical context, we proceed by dividing this chapter into three further sections, which might be considered to follow a reverse causal structure. Thus, in the next section, we consider the extent of trade over time and space. How do we measure it? What different trade regimes can we identify in history? This, of course, can differ both over time and in the cross section and can be considered both in terms of trade volumes and in terms of market integration, which is measured by looking at prices in different markets. It also connects to the literature on the historical extent of “globalization.” The section “What Determines Trade?” goes back one stage further and asks what is behind these different regimes, for example, institutions, technology, and trade policy. The latter deserves a particular mention given its importance for the pattern and extent of trade, as well as its central role, particularly in history, for the economic debate. Especially in the nineteenth century, politicians believed that by regulating trade they were managing their whole economies. We thus devote the section “And What About Trade Policy?” to the issue of how to measure trade policy and its determinants.

Measuring the Extent of Trade and Market Integration

Before we can examine the effects of trade as discussed above, we need to be able to measure it. Thus, in this section we discuss the measurement of trade and market integration. [Footnote 1] Clearly, the most direct way to measure the extent of trade is to look at the historical records of trade flows, which were often compiled by the customs authorities. Alternatively, or as a complement to this, cliometricians often measure the extent of market integration, which relies on price information.

In very general terms, cliometricians have argued that the extent of market integration should be measured in terms of adherence to the (transaction cost adjusted) law of one price [Footnote 2], i.e., that integrated markets should enjoy an arbitrage-induced equilibrium, whereby prices cannot vary by more than the transaction costs of trading between them. Since market integration should be accompanied by more trade because of lower transaction costs, it should also lead to the effects outlined in the previous section.
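The arbitrage condition can be stated compactly: two markets are in equilibrium when the absolute price gap does not exceed the cost of trading between them. A minimal sketch follows, in which the function names and the example prices are our own illustrative assumptions and transaction costs are expressed in the same units as prices.

```python
def within_arbitrage_band(price_a, price_b, transaction_cost):
    """True if the price gap leaves no profitable arbitrage between two markets."""
    return abs(price_a - price_b) <= transaction_cost


def arbitrage_profit(price_a, price_b, transaction_cost):
    """Per-unit profit from shipping from the cheaper to the dearer market (0 if none)."""
    return max(0.0, abs(price_a - price_b) - transaction_cost)


# Hypothetical wheat prices per quarter in two ports, with a trading cost of 5.
print(within_arbitrage_band(100, 104, 5))   # gap of 4 is inside the band
print(within_arbitrage_band(100, 110, 5))   # gap of 10 invites arbitrage
print(arbitrage_profit(100, 110, 5))
```

Inside the band, prices in the two markets may move independently without any trade taking place, which is one reason why trade volumes and price-based measures of integration need not coincide.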

The related work on globalization – a major part of the market integration literature – was inspired particularly by the new globalization of the late twentieth century, and the interest of cliometricians soon focused on the late nineteenth century, which they termed the “First Era of Globalization” (much of the early literature is summarized by O’Rourke and Williamson 1999). Exactly how to define globalization was, and is, a moot point. Clearly it should at least involve intercontinental trade, but the work by O’Rourke and Williamson cited in the introduction emphasized in particular that increasing volumes of trade were not a sufficient criterion for implying the presence of globalization – after all, intercontinental trade had expanded in previous eras, particularly perhaps with the European “discovery” of the Americas. Nor should it be defined by low-volume, high-price products such as the famous spices from the East, which have been traded for centuries. Instead it should be about the market integration of important, but basic, commodities, such as grains. Thus, in this literature, market integration was taken as an indicator of the increasing interdependence of markets and thus also their “globalization,” and globalization is thus simply market integration on a global scale. [Footnote 3]

To measure the extent of market integration, we simply need prices from different markets. The extent of trade and the extent of market integration are clearly linked, although markets might appear integrated even without trade, and there can be large volumes of trade with little market integration, as we discuss below. An important aspect of this is that trade regimes do not simply vary across time, for example, in the sense that the interwar years were more protectionist, with lower levels of trade and less market integration, than the late nineteenth century. They also vary across space, so that, for example, Britain and Denmark were more free trading and consequently more internationally integrated in the late nineteenth century than France, the USA, and Sweden. The market integration literature is heavily biased towards an understanding of the time dimension in the sense that many studies look at country pairs, or averages of several countries, and ask whether market integration is increasing or decreasing over time.

Turning first to the measurement of trade, much of the cliometric work has concentrated on tasks prior to the analysis of trade flows and their consequences, that is, the construction of databases and the examination of the reliability and usefulness of key sources on cross-border trade. The most complex task, measuring the growth and geographical composition of world trade in the period before international statistical bodies such as the UN, the IMF, and the World Bank introduced their classifications (such as the Standard International Trade Classification), has been undertaken by a series of scholars, with the most recent estimates coming from Klasing and Milionis (2014) and Federico and Tena-Junguito (2013).

Klasing and Milionis (2014) calculate a world degree of openness (the ratio of imports and/or exports to GDP) for 1870–1949, which can then be chained with series from other sources such as the Penn World Tables. They contribute little to the understanding of the evolution of trade volumes, since they are aggregating available data from the Correlates of War database built by political scientists (Barbieri et al. 2009; Barbieri and Keshk 2012). Nevertheless, they provide a valuable service as they aim to derive non-PPP-adjusted estimates of national GDPs comparable to the non-PPP US-dollar-denominated trade flows they use; that is, they aim to undo Maddison’s (2001) PPP adjustment based on a shortcut method for deriving the relationship of the difference of national to US price levels from a structural equation inspired by Prados de la Escosura (2000).
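The openness measure itself is a simple ratio, and the point about PPP adjustment is that numerator and denominator must be valued on the same basis. The following sketch uses our own function names and invented figures; it illustrates the definition only, not the authors' estimation procedure.

```python
def openness(exports, imports, gdp):
    """Trade openness: (exports + imports) / GDP, all valued in the same currency and prices."""
    if gdp <= 0:
        raise ValueError("GDP must be positive")
    return (exports + imports) / gdp


def export_openness(exports, gdp):
    """Export-only variant of the openness ratio."""
    return openness(exports, 0.0, gdp)


# Hypothetical country: exports 10, imports 12, GDP 100 (current US dollars, not PPP-adjusted).
print(round(openness(10, 12, 100), 2))
print(round(export_openness(10, 100), 2))
```

Dividing trade in current dollars by a PPP-adjusted GDP figure would mix two valuation bases, which is why a non-PPP GDP series is needed before the ratio can be computed.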

On the other hand, Federico and Tena-Junguito (2013) actually revise the whole literature on international trade flows from the beginning and succeed in the construction of comparable series from at least 1850 to 1938 based on a broad base of the cliometric literature and a more comprehensive use of historical statistical material. They also estimate world levels of (export) openness, using national export price indices to deflate trade series to make them comparable to Maddison’s GDP series. Their work also gives a more detailed overview of previous estimates and yields annual growth rates of world trade and trade for the major regions from 1815 to 1938. In addition, they provide a large variety of price series and estimates of average transaction costs derived from CIF-FOB differences, which they show to be fairly constant over time (at about 7 % of commodity values), apparently due to an increase in the average distance commodities travelled as a consequence of falling transport costs for given distances.
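The CIF-FOB approach compares the value of the same flow as recorded by the importer (including cost, insurance, and freight) and by the exporter (free on board). In the sketch below the function name and the figures are hypothetical, and expressing the margin relative to the FOB value is our assumption, since the text only speaks of a share of commodity values.

```python
def cif_fob_margin(cif_value, fob_value):
    """Freight and insurance as a share of the FOB value of the same trade flow."""
    if fob_value <= 0:
        raise ValueError("FOB value must be positive")
    return (cif_value - fob_value) / fob_value


# Hypothetical flow: recorded at 100 by the exporter (FOB) and 107 by the importer (CIF).
print(f"{cif_fob_margin(107, 100):.0%}")
```

In practice the two records also differ through misreporting, valuation practices, and timing, so an observed gap is only a rough proxy for transport and insurance costs.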

Such efforts are built upon two interrelated traditions: one of aggregating national statistics (Bairoch 1973, 1974, 1976; Maddison 1962; Lewis 1981) and the other, more relevant in the present context, of understanding the shortcomings and peculiarities of trade statistics as sources that economists often tend to brush over, while historians may in contrast have exaggerated (Platt 1971; Don 1968). Investigations of national cases like the Netherlands (Lindblad and van Zanden 1989), Belgium (Horlings 2002), Spain (Tena-Junguito 1995), Italy (Tena-Junguito 1989; Federico et al. 2012), China (Keller et al. 2011), and Argentina (Tena-Junguito and Willebald 2013) in the nineteenth and early twentieth centuries have unearthed a variety of peculiarities, most notably (Lampe 2008) underreporting due to smuggling or lack of legal requirement to declare, for example, duty-free imports or exports; differences in the definition, especially in differentiating retained (“special”) imports and exports of domestic production from transit and reexport; unreliable practices of gathering values or converting collected data on quantities into values; and different practices in recording countries of origin and destination, often proxied by last land border or port of consignment, as well as problems with port city entrepôts such as Hamburg for Germany or Hong Kong for China. For a comparative account on the international comparability of origins and destinations, the pioneering study is by Morgenstern (1963) for the first half of the twentieth century, reexamined later by Federico and Tena (1991) as well as Carreras-Marín (2012), Folchi and Rubio (2012), and Carreras-Marín and Badia-Miró (2008) for subsets of countries and commodities over the same period. Lampe (2008) offers a similar investigation for six European countries and the USA in the 1850s–1870s.

For the period prior to the nineteenth century, the problems are even greater since data on port entries, shipment manifests, customs revenues, etc., were in many cases not aggregated at a national level. As a result, they are often difficult to interpret and integrate into a meaningful picture. This leads to accounts that are generally more qualitative than cliometric, although the state of knowledge differs across countries with national experiences and the relative endurance of their researchers (Footnote 4). Recently, sophisticated descriptions of international trade flows and shifting comparative advantages for individual countries have received renewed input through studies on Italy (Vasta 2010; Federico and Wolf 2013) and China (Keller et al. 2011), the latter of which also assesses changes in the intensive and extensive margins (number of available products and product varieties) over time.

Finally, cliometricians are now also discovering the post-1945 period, where international statistics are easier to collect, and comparative accounts for countries and sectors can be more readily constructed. Examples of this include Serrano and Pinilla (2011) and Hora (2012).

Turning now to market integration, relatively little has been written on the general accuracy and usability of available price series. Although the issue is sometimes discussed in individual works, more often than not cliometricians work with “whatever they can get.” A couple of useful studies by Brunt and Cannon (2013, 2014) have adopted a more critical stance, however. In the first, they offer a careful evaluation of the so-called Gazette prices of grain in England, which have been used in a vast number of studies. They find them to be of generally high quality, but they identify a number of limitations as a general indicator of the levels of prices due to fluctuations in quality, changes in the consumption share of domestic grains, and changes in the definition of the units of observation. In their second study, Brunt and Cannon build on this in order to examine the biases introduced to market integration studies when not taking the weaknesses of the statistics into account. In particular, this problem arises from using infrequent data to measure the half-lives of price shocks, as we will touch on in the following discussion.

The literature on market integration and how to measure it is vast, and it is difficult to improve on the excellent survey provided by Federico (2012a). The following draws heavily on this. His survey includes everything written on market integration, including working papers, before 31 December 2009, and the reader is referred to this for a more complete survey of the literature prior to this date. Thus, we now summarize this literature and its conclusions but update it with the contributions of the last 5 years.

Within the market integration literature, a multitude of methodologies has been used to provide an econometric estimate of the extent of market integration. Likewise, conclusions differ about the extent of market integration, and a perennial question is “when globalization began.” We start with the methodological debate. One of the main points Federico makes regarding this is that in order to understand market integration, there must be a clear theoretical framework. In particular, it should be understood that market integration consists of two separable aspects (Footnote 5): first, that the equilibrium level of prices should be identical (the law of one price) and second, that prices should rapidly return to this equilibrium after a shock (what he terms “efficiency”).

Testing the first condition leads to the obvious problem that it is rarely if ever met in practice due to imperfect markets and the presence of transportation and other transaction costs. O’Rourke and Williamson (2004) suggest that the best approach is to look at trends and see whether or not prices are converging over time. However, although this works well for two markets, it becomes rather more complicated as the number of markets increases, and for this reason most cliometricians have concentrated on price convergence between two markets. Thus, authors such as Persson (2004), Metzler (1974) and O’Rourke and Williamson (1994) have looked at simple graphs or have estimated simple regressions of price gaps or relative prices on trends.

Federico’s preferred method, since it allows for the aggregation of price information from a number of markets simultaneously, is to calculate coefficients of variation and to regress these on a trend: a negative and significant coefficient implies integration (σ-convergence). The contribution of groups of markets to changes in dispersion can be calculated using simple variance analysis (Federico 2011; Sharp and Weisdorf 2013). Federico (2012a) notes, however, that inferences on the extent of market integration based solely on prices are risky unless they are supplemented with other information, particularly on the existence of trade. This is because a decline in the price gap might reflect a decline in transaction costs between the two locations, but it might also (or instead) reflect an increase in efficiency or availability of information, or it might reveal indirect arbitrage via other markets between which transaction costs have fallen.
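As an illustration of the σ-convergence test described above, the following minimal Python sketch computes the coefficient of variation across markets for each year and regresses it on a linear time trend. The price figures and all function names are hypothetical and chosen only for illustration. The sketch reports the sign of the trend coefficient only; an actual application would also have to test its statistical significance.

```python
from statistics import mean, pstdev

def coefficient_of_variation(prices):
    """Population standard deviation of one year's prices divided by their mean."""
    return pstdev(prices) / mean(prices)

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical prices (same currency and unit) in three markets, by year.
prices_by_year = {
    1850: [10.0, 14.0, 18.0],
    1860: [11.0, 14.0, 17.0],
    1870: [12.0, 14.0, 16.0],
    1880: [13.0, 14.0, 15.0],
}

years = sorted(prices_by_year)
cvs = [coefficient_of_variation(prices_by_year[y]) for y in years]

# A negative slope means that price dispersion falls over time.
trend_slope = ols_slope(years, cvs)
```

In this constructed example the dispersion of prices shrinks every decade, so `trend_slope` is negative, which is the pattern interpreted as integration in the text.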

Tests of “efficiency,” i.e., the strength of arbitrage forces, on the other hand, follow a number of approaches, each of which also has particular weaknesses. First, cointegration implies that the price differential will return to equilibrium after a shock due to arbitrage. Using the Vector Error Correction Mechanism (VECM), it is possible both to test for the presence of a cointegrating relationship and to estimate the half-life of a shock (see, e.g., Ejrnæs et al. 2008). As Taylor (2001) explains, however, this can lead to an overestimation of the size of the correction as long as transaction costs are positive. Thus, alternative approaches such as the threshold autoregressive (TAR) model have been suggested, which implies that prices only converge up to the “commodity points,” i.e., the difference in prices beyond which arbitrage becomes profitable after the payment of transaction costs (Footnote 6).
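The notions of a half-life and of “commodity points” can be made concrete with a short Python sketch. It assumes, purely for illustration, that the price gap follows a simple linear error-correction process, gap_t − gap_{t−1} = α·gap_{t−1} + error, with −1 < α < 0, in which case the half-life is ln(0.5)/ln(1 + α). The second function is a stylized band-TAR step in which adjustment only applies to the part of the gap outside a symmetric threshold. The function names and parameter values are hypothetical and are not taken from any of the studies cited.

```python
import math

def half_life(alpha):
    """Periods needed for half of a shock to the price gap to disappear.

    Assumes gap_t - gap_{t-1} = alpha * gap_{t-1} + error with -1 < alpha < 0.
    """
    if not -1.0 < alpha < 0.0:
        raise ValueError("alpha must lie strictly between -1 and 0")
    return math.log(0.5) / math.log(1.0 + alpha)

def band_tar_step(gap, alpha, threshold):
    """One deterministic step of a stylized band-TAR process.

    Inside the band (|gap| <= threshold) arbitrage is unprofitable and the
    gap does not adjust; outside it, the gap moves back towards the nearest
    commodity point at speed alpha.
    """
    if abs(gap) <= threshold:
        return gap
    excess = gap - math.copysign(threshold, gap)
    return gap + alpha * excess

# Hypothetical adjustment coefficient: half of any gap is closed each period.
example_half_life = half_life(-0.5)
```

With α = −0.5 the half-life is exactly one period. In the band-TAR step, a gap of 1 with a threshold of 2 is left unchanged, whereas a gap of 4 moves to 3, that is, halfway back to the commodity point rather than halfway back to zero.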

The second approach, co-movement, implies that prices move together due to arbitrage. In its simplest form, this corresponds to the calculation of the coefficient of correlation between two prices or an OLS regression between them. To avoid bias if these prices share a common trend, the data can be de-trended, for example, by first differencing (Footnote 7). More recently, a Bayesian approach has also been applied (Uebele 2011). Third, variance tests can reveal that arbitrage has reduced the effects of local shocks, thus decreasing the volatility of prices (Footnote 8), although as Federico notes, such declines in variation could also be the result of changes to the weather or technology, for example.
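The bias from a common trend mentioned above can be illustrated with a small Python sketch. The two price series are hypothetical and constructed so that both trend upwards while their period-to-period changes move in opposite directions; the function names are likewise illustrative.

```python
from statistics import mean

def first_differences(series):
    """Period-to-period changes of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def correlation(x, y):
    """Pearson coefficient of correlation between two equally long series."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical prices in two markets that share an upward trend.
price_a = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
price_b = [20.0, 19.0, 22.0, 21.0, 24.0, 23.0, 26.0, 25.0]

level_correlation = correlation(price_a, price_b)
difference_correlation = correlation(
    first_differences(price_a), first_differences(price_b)
)
```

In levels the two series are positively correlated because of the shared trend, while in first differences the correlation is negative. Read naively, the level correlation would suggest co-movement that the short-run price changes do not support.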

Besides their weaknesses as discussed above, Federico is pessimistic about all these measures of efficiency, since they provide no indication of how to determine the relative strength of market integration (e.g., how close should the correlation between prices be before we claim “strong” integration?). Moreover, successful inference requires that it is possible to distinguish trading and non-trading locations, so we must be certain that common shocks unrelated to arbitrage are not biasing integration measures upwards and that models that assume constant parameters (often over very long periods) are well specified. Moreover, it is not clear how the results for several country pairs, e.g., the correlation coefficients of their prices, can be aggregated into a more general and coherent picture. Other difficulties Federico notes are with the available data, which are often too infrequent to measure the speed of adjustment satisfactorily and only available for certain, possibly nonrepresentative, commodities (often grains), a point taken up again more recently by Brunt and Cannon (2014), who also measure the extent of the bias using data for England.

In the following we abstract from the more technical debate about how to test for market integration, and what exactly it means, and summarize some of the most important results from the literature. Federico (2012a) notes that most papers testing for market integration cover relatively short time periods and that there is a preponderance of work on the long nineteenth century, i.e., from the Napoleonic Wars to World War I. He explains that the results can be summed up quite simply. First, before the early modern period, there were waves of integration and disintegration both within Europe and between continents. Second, integration increased in the first half of the nineteenth century, but the process was slowed by increasing protectionism towards the end of the century, culminating in the well-known market disintegration of the interwar years. As Federico (2012a) also noted, the literature on the interwar market integration is perhaps surprisingly thin (Footnote 9).

Unfortunately, this generalization masks some debates. For example, although O’Rourke and Williamson (2002a) argue that there was no transatlantic integration in the early modern period, Rönnbäck (2009) sees waves of integration and disintegration, with great variation depending on which routes and commodities are being studied. Jacks (2005) was the first to suggest that markets started to integrate before the mid-nineteenth century. This is supported for the classic example of the trade between North America and Britain by Sharp and Weisdorf (2013), who document evidence for the importance of imports of wheat from the USA to Britain already in the middle of the eighteenth century, but with market integration being continuously disrupted, in particular by the French and Napoleonic Wars (Footnote 10). Similarly, but looking more generally at Europe and the Americas, Dobado-González et al. (2012), using a new methodology (Footnote 11) to test for grain market integration between Europe and the Americas over the eighteenth and nineteenth centuries, find gradual integration with some setbacks. Going back further, more recent work by O’Rourke and Williamson (2009) demonstrates that the European Voyages of Discovery of the fifteenth and sixteenth centuries led to the integration of European spice markets with those of Asia (despite the attempt to monopolize spice markets), as well as to the integration of markets within Europe. They would not, of course, classify this as evidence of globalization.

A similar debate exists for market integration within Europe, with Özmucur and Pamuk (2007) arguing against integration before the nineteenth century and Persson (1999) arguing for grain market integration across Europe already in the eighteenth century. More recent work by Bateman (2011) suggests that markets were as integrated in the early sixteenth as in the late eighteenth century, but with a severe contraction in between, while Chilosi et al. (2013) use a large database on grain prices for 100 European cities to demonstrate that market integration was gradual and stepwise rather than sudden for the period 1620 until World War I (Footnote 12).

What Determines Trade?

Trade theory, as outlined briefly above, provides the framework within which economists and cliometricians can understand the reasons for the patterns of trade which they observe. Direct tests of trade theory are, however, rare and often inconclusive, not just in a historical perspective but also for more recent periods. Estevadeordal and Taylor (2002) provide a series of tests of the Heckscher–Ohlin–Vanek theory of trade, that is, whether predicted and observed factor contents of trade for 18 countries, disaggregated by industry, were correlated in 1913. For the standard factors of production, capital and labor, correlations between predicted and observed factor contents are low, while for (especially nonrenewable) natural resources their findings show that factor abundance and observed trade patterns seem to fit quite well.

A similarly motivated literature examines whether the factor endowment theory in its price version holds, that is, whether “autarky prices” of goods whose production uses a relatively abundant factor are relatively cheap. Normally, autarky prices cannot be observed, so the literature focuses on whether market integration, that is, a reduction in barriers to trade, leads to commodity and factor price convergence following Heckscher–Ohlin arguments. The main exponent of this literature is O’Rourke and Williamson’s (1999) Globalization and History and its background papers. However, Bernhofen and Brown (2004, 2005, 2011) have used the actual opening of the isolated Japanese economy after 1853/1857 and its abundant available data for a direct evaluation of the autarky prices of its revealed exports after opening, finding that Heckscher–Ohlin type predictions cannot be rejected or are confirmed by this natural experiment.

Beyond this more or less strictly Heckscher–Ohlin-oriented literature, researchers trying to explain the growth of trade have used empirically less restrictive designs, mostly based on the gravity model, both to explain the growth of world trade in specific periods and when inferring determinants of trade from the immense variation to be obtained from comparing bilateral trade flows in cross section or panel designs. The gravity model departs from a simple but theoretically micro-founded idea borrowed from Newtonian physics: the size of trade flows between two countries is (log) proportional to the size of their respective economies and the economic (geographical, institutional, cultural) distance that separates them. However, theoretical motivations and econometric applications have shown that the simple, “naïve” gravity equation, following Head and Mayer (2013, Eq. 4, p. 12)

$$ X_{ni} = G\,Y_i^{a}\,Y_n^{b}\,\phi_{ni} \tag{1} $$

where the Y’s are importer and exporter GDPs, ϕ is distance, and G is a gravitational (cross-sectional) constant, has important flaws. Based on arguments prominently brought forward first by Anderson and van Wincoop (2003), empirical trade economists now recommend including proxies for the so-called multilateral resistance, that is, country-specific characteristics related to the idea of a “home bias” that make countries more or less reluctant to trade internationally. Since these are normally assumed to be time varying, the typical approach is then to include country-year fixed effects, which, however, eliminates from the regression any other variable that is determined annually at the country level, such as GDP, GDP per capita, etc.
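To make the structure of the naïve gravity equation (1) explicit, the following Python sketch evaluates it for one country pair and shows its log-linear form, which is what is typically estimated by regression. All numerical values are hypothetical, as are the function names; the exponents a and b are set to one purely for illustration.

```python
import math

def naive_gravity(G, y_i, y_n, phi_ni, a=1.0, b=1.0):
    """Eq. (1): X_ni = G * Y_i^a * Y_n^b * phi_ni."""
    return G * (y_i ** a) * (y_n ** b) * phi_ni

def log_naive_gravity(G, y_i, y_n, phi_ni, a=1.0, b=1.0):
    """Log-linear form: ln X_ni = ln G + a ln Y_i + b ln Y_n + ln phi_ni."""
    return math.log(G) + a * math.log(y_i) + b * math.log(y_n) + math.log(phi_ni)

# Hypothetical values for one country pair (illustrative only).
flow = naive_gravity(G=0.5, y_i=200.0, y_n=50.0, phi_ni=0.01)
log_flow = log_naive_gravity(G=0.5, y_i=200.0, y_n=50.0, phi_ni=0.01)
```

In the log-linear form the two GDP terms enter additively and vary only by country and year. This is why country-year fixed effects absorb them, leaving only genuinely bilateral variables such as ϕ to be identified.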

Thus, Estevadeordal et al. (2003) have used the gravity equation to assess the drivers behind the “Rise and Fall of World Trade” by first estimating gravity models including transport costs, tariffs, and the currency arrangement of the gold standard and then using the estimate to calibrate counterfactual situations for 1870, 1900, 1929, and 1938, in which these variables take their 1913 values. They find that world trade in 1870 would have been five times larger and world openness (trade/GDP) double its actual value. The higher counterfactual versus actual openness would be explained mostly by the spread of the gold standard and lower transport costs, as well as some income convergence, especially before 1900, while tariff changes played no role. The almost 60 % higher counterfactual trade and 141 % higher counterfactual openness in 1939 estimated by Estevadeordal et al. would have been achieved by avoiding increasing transport costs in the interwar period, maintaining the gold standard at its 1913 level and avoiding the increases in tariffs that followed, especially after 1929. Some of these results have been reexamined in subsequent studies focusing on individual trade determinants, such as Jacks and Pendakur (2010), surveyed below.

O’Rourke and Williamson (2002b) provide a similar assessment of the drivers of a 1.1 % annual growth rate in Europe’s intercontinental trade between 1500 and 1800, but have to rely on much scarcer data, combining information on quantities and price gaps. They conclude that between half and two thirds of the post-Columbus trade boom is not explained by decreasing transport costs – which they find to be unstable and negligible due to “monopoly, international conflict, piracy, and government restrictions” (p. 426) – but by increases in European surplus income (i.e., land rent growth) spent on “exotic” commodities. This gave rise to a number of papers discussing “When did globalization begin?” which we survey in the context of the price-based market integration literature below.

For the period from about 1850 to 1940, as well as subperiods motivated by the research question of each study, researchers have used data on trade volumes in the context of the gravity model to investigate the significance and importance of different determinants of trade flows. The following offers a short survey of this literature. Although all gravity models include some proxy for country size (GDP) or productivity/purchasing power (GDP per capita), apart from Estevadeordal et al., the focus of the gravity-based literature is not directly on income growth or convergence as the main determinants of bilateral trade performance.

Distance, by contrast, has attracted considerable attention, especially since the now classical account of the late nineteenth-century globalization. O’Rourke and Williamson (1999) give (exogenous) innovations in transport technology, such as railways and steamships, as the main drivers of market integration during this period. The easiest way of incorporating distance, as done by Estevadeordal et al., is to calculate “effective distance” by multiplying geographic distance by a transport cost factor, traditionally taken from Isserlis’ (1938) maritime freight rate index and improved by Mohammed and Williamson (2004). This, however, assumes homogeneity of trade cost developments across routes and actual modes of transportation. Jacks and Pendakur (2010) use more refined data on transport costs by different routes and plausible instrumental variables to argue that it was not transport cost reductions which caused trade to increase but that increased bilateral trade led to increased demand and lower costs for transport services between 1870 and 1913. They then recalculate the sources of trade growth over this period, attributing 76 % of it to income growth, 18 % to income convergence, and relatively small shares to the gold standard (6 %) and declining exchange rate volatility (2 %), while the mild increases in average tariffs over the period would have contributed negatively (−1.4 %).

However, in subsequent research, Jacks and coauthors (2008, 2010, 2011a) have derived a gravity-based measure of trade costs, which theoretically include all costs of conducting international trade as compared to national trade, that is, all determinants of bilateral trade increases not corresponding to income growth. They show that these costs vary significantly between country pairs and for the average of trading partners of individual countries, as well as over time; they are also significantly higher than existing ad valorem freight rate estimates for corresponding connections. For the period 1870–1913, they declined on average by 33 %, increased (with considerable fluctuations) by 13 % between 1921 and 1939, and decreased by 16 % between 1950 and 2000 (Jacks et al. 2011a, pp. 190–192).
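The logic of such a gravity-based trade cost measure can be sketched in Python. The formula used here, τ = (x_ii·x_jj / (x_ij·x_ji))^(1/(2(σ−1))) − 1, is the form in which this kind of measure is commonly stated, with x_ii and x_jj denoting domestic trade, x_ij and x_ji the two bilateral flows, and σ the elasticity of substitution. It is reproduced as an assumption rather than quoted from the studies cited above, and the value of σ and all trade figures are hypothetical.

```python
def gravity_trade_cost(x_ii, x_jj, x_ij, x_ji, sigma=8.0):
    """Tariff-equivalent bilateral trade cost relative to domestic trade costs.

    x_ii, x_jj: domestic trade of countries i and j
    x_ij, x_ji: bilateral trade flows in each direction
    sigma: elasticity of substitution (hypothetical default, must exceed 1)
    """
    if sigma <= 1.0:
        raise ValueError("sigma must be greater than 1")
    ratio = (x_ii * x_jj) / (x_ij * x_ji)
    return ratio ** (1.0 / (2.0 * (sigma - 1.0))) - 1.0

# Hypothetical country pair at two dates: bilateral trade doubles in both
# directions while domestic trade is unchanged.
cost_early = gravity_trade_cost(x_ii=100.0, x_jj=100.0, x_ij=5.0, x_ji=5.0)
cost_late = gravity_trade_cost(x_ii=100.0, x_jj=100.0, x_ij=10.0, x_ji=10.0)
```

Because the measure is inferred from how much countries trade with each other relative to how much they trade internally, it captures everything that makes international trade costlier than domestic trade, not only freight rates. In the example, the rise in bilateral relative to domestic trade shows up as a fall in the implied trade cost.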

When estimating the determinants of these trade costs, distance, tariffs, the gold standard, the British empire, and joint railway density turn out to be significant determinants in the 1870–1913 period (Jacks et al. 2010, p. 135) as well as wider measures of fixed exchange rate regimes, common language, empire membership, and shared borders for all three periods (Jacks et al. 2011a, p. 194). Of the 486 % growth in world trade between 1870 and 1913, 290 % can be explained by the fall in trade costs and the rest mostly by increased output. For the period 1921–1939, they find a 0 % increase in world trade, to which an increase in trade costs that would have led to a trade decline by 87 % contributed negatively, while an almost equal contribution of income growth nullifies this (Jacks et al. 2011a, p. 195; cf. Jacks et al. 2008, p. 534). The Jacks-Meissner-Novy trade cost measure cannot be used as a measure of economic distance in gravity equations, since it is calculated based on the gravity equation itself. Assessing the importance of its components for systematic changes in trade would therefore imply first calculating the trade cost measure and its quantitative importance for trade, then estimating the determinants of trade costs, and proceeding from there to identify their effect on trade indirectly. So far, the literature in this direction has not extended beyond the initial contributions described here.

Researchers have, however, estimated the effects of all sorts of trade cost-related determinants of bilateral trade flows in the gravity framework. Related to transport and transaction costs, this includes physical transport infrastructure (railway mileage/density, e.g., in Lew and Cater 2006; Mitchener and Weidenmier 2008) and communication infrastructure to facilitate information flows and shipping coordination (telegraphs as proxied by the bilateral sum of telegrams sent in Lew and Cater 2006). To date nobody has included costs of information transmission or actual volumes of international traffic or information flows, although both, in the sense of Jacks and Pendakur (2010), might be endogenous to trade flows.

The role of exchange rate regimes, especially the gold standard, has also been central to the debate, given its prominence in accounts of both pre-World War I globalization and post-World War I instability and the Great Depression. For the first period, López-Córdova and Meissner (2003) find that the gold standard had considerable trade-enhancing effects: countries on the gold standard traded “up to 30 % more with each other than with countries not on gold,” so that, had the gold standard not spread widely, world trade in 1913 would have been approximately 20 % below its actual level. In a similar fashion, Flandreau (2000), in what seems to have been the first cliometric gravity paper, and Flandreau and Maurel (2005) assess the impact of the Scandinavian and Latin Monetary Unions and the Austro–Hungarian currency union on trade flows, finding insignificant effects for the Latin Monetary Union, but a significantly positive contribution of the apparently more tightly coordinated currency unions in Austria–Hungary and Scandinavia on trade flows.

For the interwar period, the formation of trade and currency blocs has been analyzed with special care. Eichengreen and Irwin (1995) found that members of the Commonwealth [Ottawa signatories] and the Reichsmark bloc already traded more with each other in 1928, that is before they formed “blocs” as a consequence of the Great Depression. Ritschl and Wolf (2011) have reassessed the issue more formally, modelling endogeneity based on optimum currency area arguments. They essentially confirm that naïvely estimated trade creation among members of the different blocs disappears when accounting for the countries’ self-selection into these blocs. Political scientists Gowa and Hicks (2013) have recently revisited the issue with a larger dataset. They confirm that none of the blocs increased trade between their members as a whole and underline political conflict and cooperation between the great powers (and “anchors” of the 1930s blocs) as an important component for understanding interwar trade patterns.

Recently, Eichengreen and Irwin (2010) have shown that, at least in the 1930s, flexible monetary policy and trade restrictions were substitutes, with trade restrictions being used when monetary policy, e.g., under the “straitjacket” of the gold standard, was limited in addressing domestic concerns. This leads us to the next classical determinant of foreign trade: trade policy. Studies have investigated two strands, tariffs (normally proxied by the average ad valorem tariff discussed below) and the effects of trade agreements, proxied by dummy variables. For the former, studies are limited, although Lampe (2008, pp. 124–125), Flandreau and Maurel (2005, p. 139), and Estevadeordal et al. (2003, p. 374) find indications of a significantly negative relationship before World War I. For the same period, Jacks (2006b, p. 220) shows that both levels and changes of tariffs are positively correlated with a positive balance of payments (scaled to GDP), while Madsen (2001) finds a significantly negative impact of tariffs on trade in the interwar period. Regarding trade agreements, both the benign bilateralism of the mid- to late nineteenth century and the pernicious bilateralism of the interwar period have been evaluated using gravity models.

For the nineteenth-century most-favored nation clause trade agreements, both Accominotti and Flandreau (2008, period 1850–1880) and López-Córdova and Meissner (2003, period 1870–1913) find insignificant coefficients, with the former concluding that seeing the Cobden–Chevalier treaty of 1860 as a cornerstone of the nineteenth-century globalization would therefore be unjustified. Lampe (2009) has reexamined the evidence at the commodity level, arguing that nineteenth-century bilateralism did not actually intend to increase world trade, but to exchange preferences for specific commodities, for which he does find commodity-specific trade-enhancing effects for the first wave of the European Cobden–Chevalier network (1860–1875).

For the interwar period, apart from the literature cited above, Jacks’ (2014) study of the effects of the imperial preference system resulting from the 1932 Ottawa Agreements on Canadian trade patterns at the commodity level merits attention. He uses a difference-in-difference approach on trade flows at a quarterly frequency and shows that the Imperial Economic Conference had substantial anticipation effects on Canadian trade with the other signatories but very unclear direct effects once it was in place, leading him to conclude that “the conference was a failure from the Canadian perspective.” In contrast, Gowa and Hicks (2013) find that while the Imperial Preference System does not seem to have increased or redirected trade among members significantly, the trade of the UK within the system seems to have been redirected towards the preference group.

Another potentially transaction cost-reducing military–politico–economic institution, somewhat related to the interwar trade blocs discussed above, is colonialism, which, due to common economic and legal frameworks, bureaucratic practices, and preferential market access and potentially due to emigration, settlement, and homogeneous culture, might be trade enhancing. Mitchener and Weidenmier (2008) have examined the trade-enhancing consequences of colonial relationships using a large bilateral trade flow dataset for the 1870–1913 period (more than 20,000 observations) and find that empire membership had significantly positive effects on trade, with trade more than doubling between empire members as opposed to nonmembers. These were apparently largest for the relatively small empires of the USA and Spain but also substantial for the British, French, and German colonial empires. In a second step, they reestimate their models with a set of transaction cost (common language, years in empire, imperial currency union) and trade policy-related (empire customs unions and preferential market access proxies) variables and show that all of them are significant determinants of trade, confirming the trade cost decreasing function of empires. Head, Mayer, and Ries (2010) have shown that these tend to persist even after independence but decrease over time, probably because of depreciating “trading capital.”

Another form of changing political ties is the redrawing of national borders. The Versailles settlement after World War I provides a quasi-natural experiment, especially for parts of prewar Germany, the dissolution of the Habsburg Empire, the independence of Czechoslovakia, Hungary, and Poland, and the formation of Yugoslavia. Border effects are normally estimated from price data, but in a series of papers, Schulze, Wolf, and coauthors (Trenkler and Wolf 2005; Wolf 2005, 2009; Heinemeyer 2007; Schulze and Wolf 2009, 2012; Schulze et al. 2008, 2011) have estimated the effects of old and new borders on new and old political entities using trade statistics on railway shipments between regions and across old and new borders. Two central findings are that borders tend to be endogenous and that their effects persist over time; here, ethno-linguistic composition, that is, cultural ties, seems to play an important role in explaining trade flows (Schulze and Wolf 2009; see also Lameli et al. 2014).

Conflicts and military alliances have also been shown to be important determinants of trade flows. Gowa and Hicks (2013) highlight the importance of certain military alliances in the interwar period, while Rahman (2010) assesses the effects of being allied to central naval powers between 1710 and 1938. Glick and Taylor (2010) deal with the relationship between trade and wars and show that wars have a significantly negative impact on trade up to 8 years after they were fought and influence not just trade between opposed parties but also their trade with third countries. They use their results to quantify the trade loss as a share of world GDP resulting from World War I and World War II at 10 % and 17.6 % of the respective prewar GDPs, with a corresponding trade-related GDP loss of 4.4 % and 4.2 %, respectively.

Related to this, some studies have also shown that democratic countries trade more with each other (Gowa and Hicks 2013). The importance of national institutional factors for trade orientation has also been stressed in papers with methodologies different from the gravity equation: Sánchez et al. (2010) have shown that lower levels of land conflicts and more secure land property rights helped raise investment in export-oriented coffee trees and production of coffee in the nineteenth- and early twentieth-century Colombia. Rei (2011) examines the determinants of institutional choices that determined the performance of early modern merchant empires in the long run.

What does the market integration literature contribute to this literature? Clearly, many of the factors identified as being determinants of trade, such as trade policy and wars, will also impact on market integration. Following Harley (1980), O’Rourke and Williamson (1999) are particularly associated with the idea that it was falling transatlantic transport costs which led to the globalization of the late nineteenth century, although Persson (2004) and Federico and Persson (2007) argue that it was largely domestic American transport costs that fell, particularly with the extension of the rail network, rather than transatlantic shipping costs. Their basis for so doing is the calculation of “freight factors,” i.e., the cost of shipping a unit of a good divided by the price of the good. This can be considered as an ad valorem measure of shipping costs, equivalent to ad valorem measures of tariffs (see below), and a more accurate indicator of the impact of shipping costs on market integration than standard indicators of real freight rates.
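The difference between a freight rate and a freight factor can be illustrated with a short Python sketch. The figures are hypothetical and the function name is illustrative; the definition follows the text, namely the cost of shipping one unit of a good divided by the price of that unit.

```python
def freight_factor(freight_per_unit, price_per_unit):
    """Ad valorem shipping cost: freight per unit divided by price per unit."""
    if price_per_unit <= 0:
        raise ValueError("price_per_unit must be positive")
    return freight_per_unit / price_per_unit

# Hypothetical example: the freight rate halves between two dates, but so
# does the price of the good being shipped.
factor_early = freight_factor(freight_per_unit=2.0, price_per_unit=20.0)
factor_late = freight_factor(freight_per_unit=1.0, price_per_unit=10.0)
```

In this constructed case the freight rate falls by half while the freight factor stays at 10 % of the value of the good, so the barrier that shipping costs pose to arbitrage is unchanged. This is the sense in which the freight factor is comparable to an ad valorem tariff.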

Beyond transport costs, the market integration literature has largely focused on demonstrating the fact that markets integrated and disintegrated, rather than testing and estimating the factors behind this, although reasons are usually suggested. For example, O’Rourke (2006) demonstrates that mercantilist conflicts restricted commodity market integration in the eighteenth century, and Sharp and Weisdorf (2013) identify trade policy, war, and politics as being behind the fluctuating experience of market integration between America and Britain in the eighteenth and nineteenth centuries before the revolutionary changes in transport technology, which have to a large extent inspired the nineteenth-century globalization literature. At the other end of the First Era of Globalization, Hynes et al. (2012) show that the disintegration after 1929 was caused by trade barriers, the collapse of the gold standard, and the difficulty of obtaining credit.

A particularly notable contribution to this debate is Jacks (2006a), who directly focuses on the question of what drove commodity market integration in the nineteenth century. Using an impressively large panel of grain prices, he finds econometric evidence for the importance of transport technology, geography, monetary regimes, commercial networks/policy, and conflict over both the cross-sectional and temporal dimensions. In more recent work, Ejrnæs and Persson (2010) have demonstrated that market efficiency between Chicago and Liverpool improved after the establishment of the transatlantic telegraph because of faster arbitrage, and they have quantified the gains in terms of reduced deadweight losses. Finally, using data from the transatlantic slave trade, Rönnbäck (2012) suggests that some of the market integration in the early modern period was due to the increased transit speed of ships.

And What About Trade Policy?

As mentioned in the first section, a key feature of trade in economic history and modern economics is the existence of policy barriers to trade. In principle, trade policy is any policy that affects the volume and value of imports coming into or exports leaving a country. This can be by levying tariff duties and other commodity-specific taxes, which, if not corresponding to exactly equivalent domestic taxes, will introduce changes in the relative prices between imported and domestically produced goods and probably also between the relative prices of different sorts of goods, depending on the rates of these duties and the elasticities of demand, supply, and substitution. Ideally, in order to study trade policy, we would wish to create an aggregate measure of all the various forms of duties, as well as accompanying legislation on related trade costs, such as monopolies, port duties, river and strait/sound tolls, prohibitions, regulations, etc. This is, however, theoretically difficult and practically impossible with the existing historical data.

Most studies thus proxy trade restrictiveness by the so-called “average ad valorem equivalent tariff rate” (AVE), which, as the name suggests, should proxy for the average ad valorem duty corresponding to the wide range of weight- or volume-specific rates and other duties importers or exporters would have to pay at the toll house or the customs office. In practice, this is normally estimated as the ratio of customs receipts to total imports, whenever possible separating import from export duty receipts. Among economic historians, this measure has received wide criticism on several accounts. First, it does not account for nontariff barriers, that is, prohibitions or restrictions like quotas or red-tape requirements that discourage trade. Second, it effectively weights rates for individual commodities by their share of imports, which would be affected by the structure of tariff rates if this is not perfectly balanced out to be non-distortionary (Estevadeordal 1997, pp. 91–93). Third, it does not distinguish between protective tariffs, which effectively distort the domestic-to-world market price relationship, and the so-called fiscal tariffs, levied on demand-inelastic goods, often those which are not produced domestically, as an easy way to collect an indirect tax on the consumption of “luxury goods.” This final point is particularly important in the nineteenth century, when large parts of government revenue in many countries were raised from such import duties (Tena-Junguito 2006a, 2010a), although the solution is not obvious, since the “fiscal commodities” taxed in this way should have had some domestically produced substitute and hence fiscal duties would distort prices in favor of the producers of those substitutes.
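The calculation of the AVE, and the second criticism in particular, can be illustrated with a short sketch. The function names and the three tariff lines below are hypothetical assumptions made for illustration only; they do not come from any of the databases cited.

```python
def ave_tariff(customs_receipts: float, total_imports: float) -> float:
    """AVE as usually estimated: customs receipts divided by total imports."""
    if total_imports <= 0:
        raise ValueError("total_imports must be positive")
    return customs_receipts / total_imports


def ave_from_lines(lines):
    """AVE built up from (ad_valorem_rate, import_value) pairs.

    This makes explicit that each rate is weighted by its import value.
    """
    receipts = sum(rate * value for rate, value in lines)
    imports = sum(value for _, value in lines)
    return ave_tariff(receipts, imports)


# Hypothetical tariff schedule: the third good carries a duty so high that
# nothing is imported, so it receives a weight of zero in the AVE.
lines = [(0.10, 900.0), (0.50, 100.0), (3.00, 0.0)]
weighted = ave_from_lines(lines)                      # 140 / 1000
simple = sum(rate for rate, _ in lines) / len(lines)  # unweighted mean rate
print(round(weighted, 4), round(simple, 4))
```

In this constructed case the most restrictive duty does not affect the AVE at all, which is the import-weighting problem described above in its starkest form.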

In practice, the wide use of AVEs is generally justified for a couple of reasons (see, e.g., Eichengreen and Irwin 2010, pp. 881–882; Lampe and Sharp 2013). First, given the data constraints it is extremely difficult to imagine how superior measures might be calculated. Second, AVEs have been shown to correlate significantly with theoretically more consistent measures, both within one country (the USA over the nineteenth to mid-twentieth centuries (Irwin 2010)) and among a wide cross section of countries in the present (Kee et al. 2008). For researchers interested in using AVEs, the standard databases are those underlying Clemens and Williamson (2004), Schularick and Solomou (2011), and Lampe and Sharp (2013).

Alternative measures do exist, however. These are constructed to be more theoretically consistent and have been calculated for certain countries and periods. They include the so-called effective protection rates (Balassa 1965), trade restrictiveness indices (Anderson and Neary 2005), the nominal rate of assistance (Anderson et al. 2008), and Leamer’s (1988) trade intensity ratio.

Effective protection rates combine information on tariffs for individual goods with input–output tables to assess the structure of protection between final products, primary materials, and intermediate inputs and weight these rates accordingly in an overall index. Federico and Tena (1998, 1999) and Tena-Junguito (2006b, 2010b) have calculated effective protection rates for Italy and Spain in selected years between the 1870s and the 1930s based on individual tariff rates for 400–500 commodities and different input–output tables. Bohlin (2005, 2009) has undertaken similar work for Sweden.
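A minimal sketch of the standard textbook formula for the effective rate of protection may help to fix ideas: the tariff on the final good, net of the tariff-induced increase in input costs, is expressed relative to value added at free-trade prices. This is offered as an illustration of the concept, not as the exact procedure of the studies cited; the function name and numbers are hypothetical, and the input shares are assumed to be cost shares in the value of output at free-trade prices.

```python
def effective_rate_of_protection(output_tariff, input_shares, input_tariffs):
    """ERP = (t_j - sum_i a_ij * t_i) / (1 - sum_i a_ij).

    output_tariff: ad valorem tariff on the final good j
    input_shares:  cost share a_ij of each input i in the free-trade value
                   of output (assumed to come from an input-output table)
    input_tariffs: ad valorem tariff t_i on each input i
    """
    if len(input_shares) != len(input_tariffs):
        raise ValueError("input_shares and input_tariffs must match in length")
    total_share = sum(input_shares)
    if not 0 <= total_share < 1:
        raise ValueError("input shares must sum to a value in [0, 1)")
    input_cost_increase = sum(a * t for a, t in zip(input_shares, input_tariffs))
    return (output_tariff - input_cost_increase) / (1 - total_share)


# Hypothetical industry: 20 percent tariff on output, inputs are half of
# the value of output. Cheap inputs raise effective protection above the
# nominal rate; heavily taxed inputs can wipe it out.
low_input_duty = effective_rate_of_protection(0.20, [0.5], [0.10])
high_input_duty = effective_rate_of_protection(0.20, [0.5], [0.40])
print(round(low_input_duty, 4), round(high_input_duty, 4))
```

The example shows why the same nominal tariff can imply very different degrees of protection for value added, depending on how inputs are taxed.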

The trade restrictiveness index (TRI) by Anderson and Neary (2005), in its simplified Feenstra (1995) and Kee, Nicita, and Olarreaga (2009) version, is motivated by a computable general equilibrium framework and combines data on tariffs of individual commodities and import demand elasticities. It thereby establishes the uniform ad valorem tariff rate that would yield the same welfare level as the existing structure of varying tariff rates, and it can be converted straightforwardly into GDP-share equivalent static deadweight losses (DWL) from protection. Irwin (2010) and Beaulieu and Cherniwchan (2014) have calculated TRIs and estimated DWLs for the USA and Canada over long periods since the mid-nineteenth century. Irwin (2005, 2007) developed a similar measure based on price data to assess the DWL of the Jeffersonian trade embargo of 1807–1809 (about 5% of US 1807 GDP) and the intersectoral transfers resulting from high tariffs in the USA in the late nineteenth century, for example, the classical transfer from consumers to producers via higher prices for import-competing goods.
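The simplified partial-equilibrium version of the TRI can be sketched as follows. The formula used here is the commonly stated one, in which squared tariffs are weighted by the product of each good's import share in GDP and its import demand elasticity, and the DWL as a share of GDP is one half of the TRI squared times the sum of those weights. It should be read as an illustration under these stated assumptions rather than as the exact implementation of any cited study; elasticities are taken in absolute value, and all names and numbers are hypothetical.

```python
from math import sqrt


def _weights(import_shares, elasticities, tariffs):
    """Weight for each good: import share in GDP times |import demand elasticity|."""
    if not (len(import_shares) == len(elasticities) == len(tariffs)):
        raise ValueError("all inputs must have the same length")
    return [s * abs(e) for s, e in zip(import_shares, elasticities)]


def trade_restrictiveness_index(import_shares, elasticities, tariffs):
    """Uniform tariff giving the same static welfare loss as the actual structure."""
    w = _weights(import_shares, elasticities, tariffs)
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sqrt(sum(wi * t ** 2 for wi, t in zip(w, tariffs)) / total)


def dwl_share_of_gdp(import_shares, elasticities, tariffs):
    """Static deadweight loss as a share of GDP: 0.5 * sum(w_n * t_n^2)."""
    w = _weights(import_shares, elasticities, tariffs)
    return 0.5 * sum(wi * t ** 2 for wi, t in zip(w, tariffs))


# Hypothetical economy with two imported goods of equal weight.
shares = [0.05, 0.05]         # imports of each good as a share of GDP
elasticities = [-1.0, -1.0]   # import demand elasticities
tariffs = [0.10, 0.30]        # ad valorem tariff rates

tri = trade_restrictiveness_index(shares, elasticities, tariffs)
dwl = dwl_share_of_gdp(shares, elasticities, tariffs)
print(round(tri, 4), round(dwl, 4))
```

Because the tariffs enter in squared form, the TRI in this example exceeds the import-weighted average tariff of 0.20; dispersion in tariff rates raises the welfare cost of a given average level of protection.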

Similar considerations are behind the “nominal rate of assistance,” developed mainly to assess the degree of agricultural protection as “the percentage share by which government policies have raised (or lowered) gross returns of producers above what these returns would have been without the government’s intervention” (Swinnen 2009, p. 1501) by comparing domestic to world market prices for individual goods, adding, if necessary, domestic subsidies to the calculations. Swinnen (2009) has calculated these for a variety of agriculture and animal husbandry products in Belgium, Finland, France, Germany, Netherlands, and the UK from about 1870 to 1970.
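A minimal sketch of the price-comparison logic behind the nominal rate of assistance, following the definition quoted above, is given below. It assumes that domestic and world market prices are expressed in the same currency and at a comparable point in the marketing chain; the function name, the treatment of subsidies as a simple per-unit addition, and the figures are hypothetical simplifications.

```python
def nominal_rate_of_assistance(domestic_price, world_price, subsidy_per_unit=0.0):
    """Share by which policy raises producers' gross returns above the
    level implied by the world market price.

    domestic_price:   price received by domestic producers on the market
    world_price:      comparable world market (border) price
    subsidy_per_unit: direct domestic subsidy per unit of output, if any
    """
    if world_price <= 0:
        raise ValueError("world_price must be positive")
    return (domestic_price + subsidy_per_unit - world_price) / world_price


# Hypothetical cases: protection, protection plus subsidy, and taxation.
protected = nominal_rate_of_assistance(120.0, 100.0)
subsidized = nominal_rate_of_assistance(120.0, 100.0, subsidy_per_unit=5.0)
taxed = nominal_rate_of_assistance(90.0, 100.0)
print(round(protected, 4), round(subsidized, 4), round(taxed, 4))
```

A negative value, as in the last case, corresponds to policies that lower producers' returns below what they would have been without intervention.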

Finally, Estevadeordal (1997) presents results on the “trade intensity ratio” of 18 countries in 1913. This measure estimates a Heckscher–Ohlin-based structural equation for trade flows based on endowments and compares the sum of predicted bilateral trade flows to the actual trade per country, interpreting the residual as a measure of protection (or openness) for the market of each country.

Recent research has also focused on assessing relative rates for different commodity groups, not overall average measures of protection, as in Tena-Junguito (2010a) and Tena-Junguito et al. (2012), who compare manufacturing tariffs and their potential skill bias for a large sample of countries in the nineteenth century, and O’Rourke and Lehmann (2011), who distinguish between agricultural, industrial, and revenue tariffs.

A different but related literature looks at tariffs for individual goods, sometimes only in one country. The major examples here are the British Corn Laws and their sliding scales (Williamson 1990a; Sharp 2010), discussed in a comparative perspective by Federico (2012b), or the US tariff on cottons (Irwin and Temin 2001) and a possible optimum export tariff on American raw cotton exports (Irwin 2003), a topic also worked on for interwar Egypt (Yousef 2000). That constructing comprehensive and comparable time series for individual tariff rates in the long run is a time-consuming and often complicated task is illustrated by Lloyd (2008), who estimates Australian tariffs on road motor vehicles, blankets, and beer from 1901–1902 to 2004–2005.

Other nontariff barriers to trade like prohibitions, quotas, licenses and capital constraints, import and production monopolies, marketing boards, etc. are normally only included in regression designs via proxies. At least for the period between the dismantling of mercantilist policies in the early nineteenth century and the introduction of all sorts of protective measures in the 1930s, nontariff barriers are generally said to have been small, at least outside a small group of commodities like live animals and meat, where public health concerns sometimes led to trade restrictions. For prohibitions, ad hoc adjustment assumptions have sometimes been made, such as twice the rate when imports started being permitted (Tena-Junguito et al. 2012) or 1.5 times the highest rate in other countries (Lampe 2011). Regarding nontariff barriers in the 1930s, Eichengreen and Irwin (2010, pp. 887–888) provide a summary of the scarce data available on quotas and exchange controls as a part of the trade and payments system. Finally, Ye (2010) investigates the political economy of US trade policy regarding the countries of the Pacific Rim from 1922 to 1962. Other measures of trade policy, like membership of trade blocs or trade agreements and most-favored nation status, have normally been proxied by dummy variables.

Despite the difficulties in reducing the extent of trade policy to a simple numerical estimate, we might still want to ask what explains it. The consensus seems to be that it emerges mainly as a result of political interest groups reacting to the changes brought by trade on national, local, and industry-specific “initial conditions.” Thus, explaining trade policy involves disentangling the relative importance of these factors. This is normally done by considering just one sector, or a relevant sample of the industries which are most affected, in order to assess the specific impact on them and their reactions, alongside their possibilities of affecting policy making at the national level. In this sense, the studies by the political scientist Rogowski (1989) and the cliometrician O’Rourke (1997) on the European reaction to the late nineteenth-century grain invasion are outstanding examples of comprehensive trade policy studies, including initial factor endowments, changes in relative prices and factor incomes due to the inflow of cheap grain, formation of coalitions in policy formation, and trade policy outcomes. As Lehmann and Volckart (2011, p. 29) have summarized it, “Kevin O’Rourke […] argued that where agriculture was concerned, the political choices were related on the one hand to how the grain invasion affected land rents, and on the other to the weight of agricultural interests in domestic politics.”

Thus, the key variables to describe agricultural trade policy are (following Swinnen 2009) the weight of agriculture in the economy, the relative income of agriculture, and political institutions and organizations, both as regards the level of democracy and the organization of agricultural interest groups. O’Rourke and Rogowski discuss and evaluate all of them in their comparative framework; Federico (2012b) provides a summary of the relevant forces behind an earlier central episode in agricultural trade policy, the repeal of the British Corn Laws, and parallel and subsequent liberalization of agricultural market access in Continental Europe, thereby summarizing a larger literature with important cliometric contributions (Kindleberger 1975; Bairoch 1989; Schonhardt-Bailey 2006; Montañés Primicia 2006; van Dijck and Truyts 2011). Recently, Lehmann (2010) and Lehmann and Volckart (2011) have studied voting behavior in key elections in Germany in the 1870s and Sweden in the 1880s. They found that “agriculture,” including small farmers, peasants, and rural workers, voted “en bloc” for protection, at least in Imperial Germany. This hints at low perceived possibilities for intersectoral mobility in the economy (a “specific factor model”) among large parts of the rural population, as opposed to the opportunities for workers which might be derived from free trade and structural change. For Sweden, the results are less clear, apparently at least in part due to a much more restrictive franchise.

When assessing the trade policy of more than one sector, the issue is complicated by the fact that not just the level of protection (e.g., on agriculture) has to be taken into account, but also its level in comparison to protection or lack thereof for other sectors, i.e., the structure of trade policy. Thus, the political arena is much more complex. Pahre (2008) has written a whole book on the issue, offering a comprehensive theory of tariff setting, leading to six hypotheses on prices, interest group influence and compensation, country size and transport costs, two corollaries on tariff and price volatility, and several findings regarding the endogeneity and exogeneity of fiscal revenue constraints and their dependence on customs duties and the interplay between democracy and tariff levels. The second step of his theory, regarding bilateral trade policy negotiations, is discussed below.

Blattman et al. (2002), Williamson (2006), and Clemens and Williamson (2012) provide systematic assessments of correlations between a wide set of variables and “average tariffs,” as measured by AVEs. They find population size (related to relatively low dependence on foreign trade), railroad penetration, urbanization, tariffs of other countries, and tariff autonomy (i.e., political independence versus formal or informal foreign control of trade policy) to be significantly and substantially correlated with tariff levels.

O’Rourke and Taylor (2007) investigate the link between tariffs and democracy and show that the relationship is contingent on the relative factor endowments of the national economy in question. In the case of the nineteenth-century globalization, the land–labor ratio is the most fitting operationalization. Irwin (2008) has highlighted that the use of tariff revenue for infrastructure provision was decisive for the American West to enter into a coalition with the North for high tariffs in the 1820s and 1830s and to swing towards more liberal trade policy later. Eichengreen and Irwin (1995, 2010) have shown that protective tariffs and otherwise restrictive policy can also emerge if no other opportunities for dealing with structural balance of payments deficits are available, in their case the unwillingness to or impossibility of devaluation under the interwar gold standard in the 1930s. Another recurrent aspect, especially in political science, is the importance of “hegemony” (McKeown 1983; Nye 1991; Coutain 2009) or the spread of “ideology” (Kindleberger 1975; Federico 2012b, p. 181). The latter is especially difficult to measure. Finally, Chan (2008) has elaborated and indirectly tested an institutional economic model to explain the trade policy choices of the Chinese Song and Ming dynasties in the light of a trade-off between economic efficiency (and trade tax revenues) and political authority, a question motivated by the famous Needham puzzle of why modern economic growth did not start in China (Lin 1995).

Bilateral or multilateral negotiations to change trade policy have seldom been the subject of cliometric research, and if they have, the focus has been on their impact on trade flows as discussed above. In his book on the “agreeable customs of 1815–1914,” Pahre (2008) formulates nine hypotheses, three corollaries, two remarks, and one conjecture on the likelihood that individual countries cooperate in bilateral trade treaties. He finds that, among other things, larger countries and countries with lower tariffs are more likely to cooperate and that “real” exogenous revenue constraints resulting from low fiscal capacity make cooperation less likely, while endogenous (i.e., politically chosen) revenue constraints increase the scope for cooperation. Lampe (2011) offers an assessment of the political and economic determinants of the Cobden–Chevalier network of bilateral MFN treaties in the 1860s and 1870s in the light of both Pahre’s theory and recent contributions by the economists Baier and Bergstrand (2004) and Baldwin (1995) as well as the political scientist Lazer (1999). Lampe and Sharp (2011) use his framework for a cost–benefit analysis of bilateralism for Denmark, which, despite figuring as a free trader in classical accounts, concluded no substantial trade treaties during this period. In the context of the effects of trade bloc formation in the 1930s, Ritschl and Wolf (2011) and others discuss its origins when evaluating the endogeneity of these blocs and the resulting econometric challenges.

Conclusion

In this chapter we have argued for the importance of trade in economic history, in particular through its impact on growth. Today, domestic sources of growth play a much more important role, but trade might still be important – by establishing constraints, increasing competition, affecting coalitions and institutions, etc.

After discussing how to measure trade and the related concept of market integration, we went one step back and discussed what factors were behind different examples of trade increases and declines and of market integration and disintegration. Finally, we homed in on trade policy as one of the most important determinants of trade, as well as perhaps the most policy relevant.

The literature is vast, but important questions remain. Moreover, much work is still being done on collecting trade databases and improving our measures of trade costs. The cliometricians of the future will certainly have plenty of opportunities to make important contributions, not only for economic history but for economics in general.