1 Introduction

Understanding the determinants of trade has long been a significant interest in the field of international trade. Most research has focused on the ways in which trade between two countries is affected by the individual and bilateral characteristics of those two countries, such as their market sizes and the frictions between them. However, this research rarely examines other potential dependencies such as the pair’s relationships to third parties, opting instead to summarize these influences in the form of price indexes or “multilateral resistances”. Some recent trade literature has looked at these types of complex trade dependencies through the lens of network analysis. By thinking about trade between two countries as a small component of a much bigger network, it is possible to identify relationships between each bilateral trade flow and other complex network patterns that are typically overlooked.

This paper describes two methodologies for modeling complex network patterns in international trade. These methodologies provide a way to identify and measure the ways in which bilateral trade is dependent on the full structure of the world trade network, such as the common third-party partners that countries share or the total number of trading relationships they maintain. The first methodology extends standard gravity models of trade to include network variables that reflect certain types of network relationships. The second methodology uses exponential random graph models (ERGMs), which are a powerful and flexible empirical tool for studying networks, to estimate similar types of relationships. Both models provide insight into the ways in which network relationships affect the trade between two countries. Further, both models have advantages in terms of the types of patterns that they are able to capture. In the application considered, the ERGM approach better captures the number of non-zero trade partnerships and performs relatively well at modeling shared third-party relationships. Meanwhile, the gravity approach is better able to capture the number of importing and exporting partners that each country maintains. Taken together, these results demonstrate that complex network patterns are an important determinant of trade, that gravity models can capture much of this dependency even if not explicitly controlled for, and that other network models such as ERGMs can be valuable tools for capturing some types of network dependencies.

To illustrate the notion of complex network patterns in international trade, Fig. 1 depicts the trade flows between the United States (USA), El Salvador (SLV), Trinidad and Tobago (TTO), Bolivia (BOL), and Nepal (NPL) in 1999 as a network. A link pointing from one country to another indicates that the originating country exported to the destination country. Consider the exports from El Salvador to Trinidad and Tobago (drawn in red). A typical trade model, such as a gravity equation, will model that trade flow as being dependent on certain unilateral characteristics of the two countries, such as their GDPs or MFN tariffs (or, more commonly, via country fixed effects), and bilateral characteristics, such as the distances between the markets or their preferential trade arrangements. However, there are other relationships in the network that might also affect this trade flow. For example, Trinidad and Tobago exports to El Salvador, implying that their trade is reciprocal. Similarly, both countries have multiple other import and export partners, which may create network externalities in both markets. Finally, both countries trade with common third parties, such as the United States, which might impact trade through things like supply chains or information spillovers. Network relationships are even more prevalent and extensive when considering the full world trade network, as is presented in Fig. 2. These types of network patterns have a significant impact on trade but are rarely analyzed directly within trade models, so important nuances in how trade flows arise are often overlooked.

Fig. 1 Trade flows between the United States (USA), El Salvador (SLV), Trinidad and Tobago (TTO), Bolivia (BOL), and Nepal (NPL) in 1999

Fig. 2 The World Trade Network (2006)

In recent years, research on the determinants of bilateral trade has largely been based on the estimation of gravity trade models. Through this work, authors have identified the importance of many types of relationships within international trade. Much of this research has worked to properly identify the effects of a wide range of bilateral dependencies such as distance, common borders, common languages, and cultural ties. For example, Rauch (1999), who actually referred to these types of bilateral relationships as network relationships, found broad evidence that they have a significant impact on trade. Following this work, many papers have built on these findings by providing deeper analysis and alternative measurements for each of these bilateral relationships. For example, Brun et al. (2005) and Berthelon and Freund (2008) examine the role of geographic distance between countries. Hutchinson (2005), Ku and Zussman (2010), Melitz (2008), and Melitz and Toubal (2014) study the ways in which countries relate through common language networks. Rauch and Trindade (2002), Linders et al. (2005), Hofstede (1980), and Felbermayr and Toubal (2010) analyze the effect of cultural networks on trade.

In addition to modeling bilateral dependencies, most recent gravity research has attempted to control for a broad number of other dependencies through the incorporation of multilateral resistance terms (MRTs). Originally introduced by Anderson and van Wincoop (2003), MRTs capture unobserved aspects of countries that affect the relative prices of traded goods in each market. Part of the beauty of these terms is that they compress an extraordinary number of influences into a single importer and exporter price index for each country. MRTs implicitly reflect aspects of the world trade network and many of the dependencies therein. For example, Bernard and Moxnes (2018) highlight the link between networks and similar international price indexes in their Melitz (2003) inspired network trade model, which is informative of the influences that are likely present within multilateral resistances. The limitation of using MRTs or price indexes, however, is that this simplification naturally abstracts away from their underlying factors. Thus, when estimated empirically using standard approaches, little insight can be gained into the nature of potential network influences.

In light of this, many recent papers have looked at the structure of the international trade network in order to gain better insight into the ways that network dependencies shape trade patterns.Footnote 1 Most of this work is primarily focused on identifying certain patterns in the trade network and not necessarily on how trade is dependent on those patterns. For example, work by De Benedictis and Tajoli (2011), De Benedictis et al. (2013), and Deguchi et al. (2014) all identify typical network features of world trade such as density, clustering, and centrality measures. De Benedictis and Tajoli (2011) compare centrality measures that reflect how well connected nodes are across a variety of countries or regions and identify the countries that operate as major trade “hubs”. Among their findings, the authors show that the WTO was effective in increasing the density of the trade network (i.e. increasing the number of country-pairs trading). De Benedictis et al. (2013) provide a similar but deeper analysis of several centrality measures. They find that degree centrality, which reflects how well connected a given country is, may be a strong indicator of trade surpluses or deficits. Closeness centrality or geodesic distance, which both reflect the minimum number of links separating two countries, can measure how directly connected a country is to the rest of the world. Finally, eigenvalue centrality, which reflects how well connected a country’s partners are, can offer insight into the role of indirect relationships among trading partners. Deguchi et al. (2014) follow this line of research on centrality by ranking countries based on their centrality. Using this ranking, they observe changes in the positions of countries within the list and find, for example, that China has grown to become the highest value trade hub while Japan has dropped in ranking over time. 
These papers and others analyzing trade using empirical network methods offer an important perspective on trade that is typically not apparent from standard modeling approaches.

Other recent research has looked at relationships between the trade network and other types of networks. For example, Fagiolo and Mastrorillo (2013a) find significant correlations between the world trade network and immigration networks, which is consistent with the earlier but less network-intensive work of Rauch and Trindade (2002). Pan (2018) examines interdependencies between trade and intergovernmental organizations using ERGMs. They find that trade agreement membership is associated with more complex trading relationships but membership in many other types of organizations is not. Smith et al. (2019) use micro-level networks of firm ownership to identify complex relationships between firm behavior and country-level trade. Each of these papers finds significant evidence that trade networks depend on the structure of other networks.

Some papers have gone beyond the statistical analysis of trade patterns and incorporated types of network dependencies into models of trade. One such example is the work by Chaney (2014). Chaney utilizes a network model to describe how firms expand to new export markets by using existing trade partners to match with new, more distant partners. If the distance between firms increases the difficulty of them matching, firms may use current partners to connect to distant firms in order to reduce that barrier and shorten the effective distance to the new firm. Using firm-level French data, Chaney finds that firms trade to increasingly distant markets at an accelerating rate, providing evidence that firms utilize network relationships to trade. A second example is the work of Morales et al. (2019), who model a different type of network dependency that they refer to as extended gravity. They find that bilateral trading relationships between two partners often create spillovers for third parties. Firms tend to import from or export to countries that are similar to ones with which they have prior experience. Thus, trade between partners is often impacted by the network of third-party trade links maintained by each partner. Their empirical tests suggest that these network dependencies are present and reduce trade costs significantly between partners.

Of particular relevance is the work of Dueñas and Fagiolo (2013), who study properties of the world trade network through a gravity framework. The authors estimate standard gravity models and use the estimated parameters to predict link formation and generate simulated trade networks. These trade networks are compared to observed trade networks in order to determine if gravity models are capable of explaining customary topological features of trade networks. They find that gravity models are often effective at replicating some aspects of trade networks such as the average number of trading partners each country maintains. However, they perform poorly at predicting more complex patterns such as clustering unless the presence of links is fixed and only the weights are predicted. Much of their difficulty in generating similar networks stems from a general inability to accurately replicate binary link formation.Footnote 2 My work builds on that of Dueñas and Fagiolo (2013) in several ways. It augments standard gravity frameworks by including network-based covariates in an effort to better capture certain network influences. It also compares gravity simulated networks to those produced by ERGM models and demonstrates the relative strengths and weaknesses of both.

Ward et al. (2013) provide what is likely the most direct study of complex network influences in international trade. Similar to the present paper, the authors assert that dependencies exist between bilateral trade and the patterns of the entire network. However, rather than the ERGM methodology proposed here, they model these dependencies using a general bi-linear mixed effects model (GBME) based on the work of Hoff (2005). A GBME model studies the structure of a network through a process similar to ANOVA. Links between nodes are estimated such that the error terms are modeled as being composed of country-level random effects. Using these variance decompositions, many types of network dependencies can be identified, including reciprocity, sender and receiver effects, and shared third-party effects. Ward et al. find that the inclusion of complex network patterns improves the explanatory power of the gravity model and results in significantly higher \(R^2\) values.

The work in this paper extends this line of research studying complex network dependencies in trade. I propose two methodologies for modeling these dependencies that each exhibit some advantages. The first uses gravity models in a similar vein as Dueñas and Fagiolo (2013). I modify standard gravity models to include a collection of variables reflecting aspects of the world trade network. This is done for two types of models. The first is a typical gravity model using bilateral trade values and a PPML estimation approach. The second examines only the presence of trade and not its value, similar to the extensive-margin stage of the two-stage gravity model of Helpman et al. (2008) or the analysis of Baldwin and Harrigan (2011). The first approach more closely follows the work in the gravity literature while the second approach is more representative of much of the work in the network literature, which often focuses on the presence of trade rather than the value. The second approach also complements a notable weakness of the first, which is that most gravity models of trade values, despite being effective at modeling bilateral trade overall, are largely unable to predict zero trade occurrences. Both types of gravity models provide evidence that bilateral trade patterns are significantly influenced by certain network patterns. I find that reciprocity between countries, common third-party trading partners, and the number of import and export relationships maintained by each country are significant determinants of bilateral trade. In most cases, these empirical findings are consistent with many of the theoretical predictions of the past literature.

The second methodology uses an empirical network approach known as an exponential random graph model. ERGMs are relatively new to the field of economics but have gained some popularity in recent years. For example, both Pan (2018) and Smith et al. (2019) used ERGMs to model aspects of trade. ERGMs estimate the likelihood of each bilateral trade relationship forming within the network, conditional on the structure of the rest of the network and other factors. Parameters relating to the assumed dependencies are estimated using a maximum likelihood approach. The estimated model is that which makes the observed trade network the most likely network to have formed from among all possible trade networks. Compared to other network methodologies, such as the GBME models used by Ward et al. (2013), ERGMs offer a great deal of flexibility in terms of the types of network dependencies that they can include. GBME models identify network dependencies by decomposing error terms into specific functional forms, which necessitate a considerable amount of model structure. ERGMs, by comparison, include network dependencies in an additive way, making it easier to alter the types of dependencies included in the model. Thus, while there may be considerable overlaps in the objectives of the present paper and Ward et al. (2013), the work presented here intends to not only provide additional affirmation of the importance of network dependencies in international trade but also demonstrate alternative methods for studying them. The ERGM estimations provide new evidence that network patterns significantly affect international trade. The estimates show that reciprocity and shared third-party trading partners are significant determinants of bilateral trade, similar to the results from the gravity models.

In addition to estimating the impacts of network dependencies using both gravity models and ERGMs, the models are tested on their ability to reproduce the complex network patterns present in actual trade. Estimated models based on both methodologies are used to simulate samples of predicted trade networks, similar to what is done by Dueñas and Fagiolo (2013). These samples of trade networks are then compared to the actual observed trade network to determine which methodology better captures and replicates important types of network patterns. The tests demonstrate that each approach performs better at replicating different aspects of the world trade network. In the application considered, the ERGM approach better captures the number of trade links in the network and performs relatively well at replicating the way countries share third-party partners. The gravity approach better captures the number of countries with which each country trades as well as the distance between countries in terms of links. These results indicate that (i) ERGMs are a powerful tool for analyzing aspects of international trade and (ii) structural gravity models are already effective at capturing many complex network patterns. By tying together the gravity literature and the empirical network literature through these comparisons, this work represents a novel contribution to both.

The ability to properly model complex network patterns in trade is important. From an ex post perspective, these patterns are influential determinants of trade behavior and integral in understanding bilateral trade. From an ex ante perspective, being able to properly model trade networks and, in particular, the extensive margin of trade is crucial. For example, there is a growing recent literature using structural gravity models to conduct counterfactual analyses of trade policy and other phenomena (cf. Yotov et al. (2016), Anderson and Yotov (2016), Baier et al. (2019), Kohl (2019), and Brakman et al. (2018)). These models typically rely on PPML estimations, which, as discussed above, are unable to adequately model zero trade flows and the extensive margin. The network-based approaches that I describe could be a useful refinement to these methods and provide a means by which to predict non-trading countries in a counterfactual scenario.

The remainder of the paper proceeds as follows. Section 2 presents the gravity approach for modeling network dependencies in trade. Section 3 presents the ERGM approach. Section 4 compares both methodologies’ ability to replicate complex network patterns. Section 5 concludes.

2 Network dependency in gravity models

Following the early work of Tinbergen (1962), gravity models have rapidly become workhorse tools in international trade. Gravity models gained their modern theoretical foundations with the work of Anderson and van Wincoop (2003) and Eaton and Kortum (2002), who provided demand and supply side structural derivations of the model, respectively. On the demand side, Anderson and van Wincoop’s most notable contribution was the introduction of MRTs, which capture country-level multilateral factors that impact bilateral trade for each importer and exporter. These terms are expressed as price indices that reflect inward and outward aggregate trade costs for each importer and exporter, respectively. The canonical Anderson and van Wincoop (2003) gravity model is given by the following system:

$$\begin{aligned} x_{ijt} &= \frac{Y_{it}E_{jt}}{Y_t}\left( \frac{\tau _{ijt}}{P_{jt}\varPi _{it}}\right) ^{-\theta }, \end{aligned}$$
(1)
$$\begin{aligned} \varPi _{it}^{-\theta }= & {} \sum _{j}\left( \frac{\tau _{ijt}}{P_{jt}}\right) ^{-\theta }\frac{E_{jt}}{Y_t}, \end{aligned}$$
(2)
$$\begin{aligned} P_{jt}^{-\theta }= & {} \sum _{i}\left( \frac{\tau _{ijt}}{\varPi _{it}}\right) ^{-\theta }\frac{Y_{it}}{Y_t}. \end{aligned}$$
(3)

In the model, \(x_{ijt}\) denotes trade from exporter i to importer j in period t. Trade is a function of i’s output (\(Y_{it}\)), j’s expenditures (\(E_{jt}\)), world output (\(Y_t\)), bilateral trade costs (\(\tau _{ijt}\)), and the outward and inward multilateral resistances (\(\varPi _{it}\) and \(P_{jt}\), respectively). A key innovation in Anderson and van Wincoop’s work is that these MRTs summarize the wide range of possible multilateral factors into a single index value that can be readily constructed or estimated in empirical models. However, the downside of this simplification is that the intricacies of the underlying factors are lost. It is these types of intricacies that I seek to better illuminate by empirically identifying the role of network patterns within the model.
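
To make the role of the multilateral resistances in Eqs. (1) to (3) concrete, the following is a minimal numerical sketch that solves the system for a single period by fixed-point iteration. It is an illustration under stated assumptions, not the estimation procedure used in this paper: the function name `solve_mrt`, the three-country trade cost matrix, the output and expenditure vectors, and the value of \(\theta\) are all hypothetical, and world output is assumed to equal both total output and total expenditure. The MRTs are only identified up to a common scaling, so the levels returned depend on the starting values.

```python
import numpy as np


def solve_mrt(tau, Y, E, theta, tol=1e-12, max_iter=10_000):
    """Solve Eqs. (2)-(3) for one period and return flows from Eq. (1).

    Works with a_i = Pi_i^(-theta) and b_j = P_j^(-theta), so that
    a_i = sum_j tau_ij^(-theta) * (E_j / Y_w) / b_j and
    b_j = sum_i tau_ij^(-theta) * (Y_i / Y_w) / a_i.
    """
    tau = np.asarray(tau, dtype=float)
    Y = np.asarray(Y, dtype=float)
    E = np.asarray(E, dtype=float)
    Y_w = Y.sum()                      # world output (assumed equal to E.sum())
    t = tau ** (-theta)                # trade cost term tau_ij^(-theta)
    y_share, e_share = Y / Y_w, E / Y_w

    a = np.ones(len(Y))                # outward MRT term, Pi_i^(-theta)
    b = np.ones(len(E))                # inward MRT term, P_j^(-theta)
    for _ in range(max_iter):
        a_new = t @ (e_share / b)          # Eq. (2)
        b_new = t.T @ (y_share / a_new)    # Eq. (3)
        change = max(np.max(np.abs(a_new / a - 1.0)),
                     np.max(np.abs(b_new / b - 1.0)))
        a, b = a_new, b_new
        if change < tol:
            break

    x = np.outer(Y, E) / Y_w * t / np.outer(a, b)   # Eq. (1)
    Pi = a ** (-1.0 / theta)
    P = b ** (-1.0 / theta)
    return x, Pi, P


# Hypothetical three-country example (values are illustrative only).
tau = np.array([[1.0, 1.5, 2.0],
                [1.5, 1.0, 1.3],
                [2.0, 1.3, 1.0]])
Y = np.array([10.0, 5.0, 2.0])   # output by exporter
E = np.array([8.0, 6.0, 3.0])    # expenditure by importer
x, Pi, P = solve_mrt(tau, Y, E, theta=5.0)
```

In this sketch the solved flows add up to each exporter's output across destinations and to each importer's expenditure across origins, which is the sense in which the two indexes absorb every third-country influence on a given bilateral flow.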

Subsequent work has expanded on these theoretical foundations and demonstrated the breadth of micro models that can be used to derive gravity models (Arkolakis et al. 2012). In particular, some of this research has tied gravity to network patterns in trade. Most notably, both Chaney (2014) and Morales et al. (2019) derive gravity models based on network dependencies. These works highlight the intrinsic connections between gravity models and certain network patterns in trade. It is in this vein that I present new methods for extending the standard workhorse gravity model to include network variables that identify relationships between bilateral trade and complex network patterns. Although I forgo presenting a new theoretical gravity model, the extensions I propose are consistent with these existing structural models and provide new empirical insight into their underlying influences.

I present two approaches for estimating the effects of network patterns in a gravity framework. The first follows a typical modern approach in which trade values are analyzed using a Poisson Pseudo Maximum Likelihood (PPML) estimator (Santos Silva and Tenreyro 2006). The second approach looks only at the extensive margin by focusing on the existence of trade between two countries rather than the level. This second approach is considered for a few reasons. First, it can be more readily compared with the ERGM model described in Sect. 3, which also considers only the presence of trade and not the value. Second, the modeling of the extensive margin in trade is something that both the standard PPML-based gravity approach, which cannot predict trade values of zero, and past network and trade research, such as that by Dueñas and Fagiolo (2013), have struggled to accurately capture. Third, it provides novel insight into aspects of trade that are often overlooked in studies using trade flows. For example, Iapadre and Tajoli (2017) note that a relatively small number of countries account for the majority of trade value and that treating trade as a binary event can help increase the voice of smaller countries.

The extensive margin approach follows the models of Baldwin and Harrigan (2011) and Helpman et al. (2008). It estimates the likelihood of trade between two countries using a probit model and standard gravity covariates such as distance, common borders, and trade agreements. While the Helpman et al. approach to estimating gravity has been shown to be problematic by Santos Silva and Tenreyro (2015), its noted limitations are primarily related to the integration of the extensive and intensive margin estimations in their two-stage model. I perform only the extensive margin estimation and therefore avoid the empirical limitations that Santos Silva and Tenreyro describe.

The two types of gravity models are estimated in order to identify the extent to which network attributes influence bilateral trade. In addition to standard gravity covariates like distance, common language, and trade agreements, I add three types of network covariates. The network covariates reflect different types of network patterns relating to each importer and exporter. The network variables are defined based on the presence of trade rather than the value of trade, which I denote by \(T_{ijt}\). If trade flows from i to j are non-zero, \(T_{ijt} = 1\); if the countries do not trade, \(T_{ijt} = 0\). The first network attribute is an indicator for reciprocal trade. This reciprocal variable (\(RECIP_{ijt}\)) takes the value of one if the reverse flow from country j to i is present in the network (\(T_{jit} = 1\)), meaning that the countries mutually import from and export to each other in a given year. The second class of network attributes, consisting of two variables, reflects the presence of three-way trade. The first of these variables counts the number of transitive trading relations shared by i and j. A transitive relationship occurs if there exists a country k such that i exports to k, k exports to j, and i exports to j. The constructed variable (\(TRAN_{ijt}\)) counts the number of countries k for which the existence of trade from i to j would complete a transitive triple. Similarly, the second three-way variable (\(CYLC_{ijt}\)) counts the number of cyclical relationships, which consist of countries k such that i exports to j, j exports to k, and k exports to i.Footnote 3 Finally, the third class of network attributes consists of degree measures that reflect the number of trade relationships maintained by i and j. 
Four specific measures are considered: the importer in-degree (\(IID_{jt}\)), which counts the number of countries from which j imports; the importer out-degree (\(IOD_{jt}\)), which counts the number of countries to which j exports; the exporter out-degree (\(EOD_{it}\)), which counts the number of countries to which i exports; and the exporter in-degree (\(EID_{it}\)), which counts the number of countries from which i imports. Table 1 summarizes these attributes. A deeper discussion of network patterns is provided in Sect. 3.1.

Table 1 Network attributes used in the gravity estimations of trade
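
The network attributes described above can all be computed from a binary trade matrix. The following is a minimal sketch for a single year, assuming a matrix `T` with `T[i, j] = 1` when i exports to j and a zero diagonal. The function name `network_covariates` and the four-country example network are hypothetical and are not taken from the data used in this paper.

```python
import numpy as np


def network_covariates(T):
    """Compute pair-level and country-level network attributes for one year.

    T[i, j] = 1 if country i exports to country j, 0 otherwise.
    The diagonal is assumed to be zero (no self-trade).
    """
    T = np.asarray(T, dtype=int)
    two_paths = T @ T                  # two_paths[i, j] = #k with i -> k and k -> j

    recip = T.T.copy()                 # RECIP_ij = T_ji
    tran = two_paths.copy()            # TRAN_ij: #k with i -> k and k -> j
    cycl = two_paths.T.copy()          # cyclical count: #k with j -> k and k -> i
    np.fill_diagonal(tran, 0)          # pair variables are undefined for i = j
    np.fill_diagonal(cycl, 0)

    out_degree = T.sum(axis=1)         # number of destinations each country exports to
    in_degree = T.sum(axis=0)          # number of origins each country imports from
    return recip, tran, cycl, in_degree, out_degree


# Hypothetical network: 0->1, 1->0, 0->2, 2->1, 1->3, 3->0.
T = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [1, 0, 0, 0]])
recip, tran, cycl, in_degree, out_degree = network_covariates(T)
```

For the pair (0, 1) in this example, trade is reciprocal, country 2 supplies one transitive path (0 to 2 to 1), and country 3 supplies one cyclical path (1 to 3 to 0). Because the diagonal is zero, none of the pair-level counts for (i, j) depends on `T[i, j]` itself. The in-degree and out-degree of exporter i correspond to \(EID_{it}\) and \(EOD_{it}\), and those of importer j to \(IID_{jt}\) and \(IOD_{jt}\).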

Because the network variables are based on trade flows, some discussion of potential endogeneity issues is warranted. The variables have been constructed such that, for a trade flow \(x_{ijt}\) from i to j, none of the constructed variables utilize information about \(x_{ijt}\) itself. In the case of reciprocity and three-way trade, the terms are equivalent regardless of the value of \(x_{ijt}\), as they are based solely on other trade flows. The same is true of exporter i’s in-degree and importer j’s out-degree term. The remaining terms, exporter out-degree and importer in-degree, are in principle impacted by the value of \(x_{ijt}\). However, to avoid this endogeneity, I exclude \(T_{ijt}\) in the construction of these terms at the bilateral level and add j and i subscripts to obtain \(EOD_{ijt}\) and \(IID_{ijt}\), respectively. Given these changes, none of the network variables are directly dependent on \(x_{ijt}\) and are consistent with the multilateral information that would typically be captured by country-year fixed effects.
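
One way to write the adjusted degree terms is as leave-one-out counts. The summation notation below is an assumption introduced for illustration, based on the verbal description above, and may differ from the exact construction used in the estimations:

$$\begin{aligned} EOD_{ijt} = \sum _{k \ne j} T_{ikt} = EOD_{it} - T_{ijt}, \qquad IID_{ijt} = \sum _{k \ne i} T_{kjt} = IID_{jt} - T_{ijt}. \end{aligned}$$

Under this reading, the exporter out-degree for the pair (i, j) counts every destination of i other than j, and the importer in-degree counts every origin of j other than i, so neither term changes when \(T_{ijt}\) switches between zero and one.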

Trade data and additional gravity covariates were sourced from the BACI bilateral trade and gravity data sets provided by CEPII (see Gaulier and Zignago (2010) and Head et al. (2010), respectively). The sample covers 207 countries for the years 1995 to 2006. The data used for estimation are expanded to be a square panel so that “zeros” are also included for countries that do not trade. The gravity covariates used throughout are the standard measures of (log) distance (\(ln(DIST_{ij})\)), contiguity (\(CNTG_{ij}\)), common language (\(LANG_{ij}\)), colonial ties (\(CLNY_{ij}\)), and trade agreements (\(RTA_{ijt}\)).

2.1 PPML gravity estimation of trade flows

I begin by estimating the effects of network patterns in a conventional gravity framework. The model follows standard specifications such as those described by Yotov et al. (2016) and Head and Mayer (2014). The estimating model takes the following form:

$$\begin{aligned} x_{ijt} = exp\{ D_{ijt}\alpha + \beta _1 RECIP_{ijt} + \beta _2 TRAN_{ijt} + \beta _3 CYLC_{ijt}+ \mu _{it} + \nu _{jt}\} + \epsilon _{ijt}. \end{aligned}$$
(4)

As before, \(x_{ijt}\) denotes exports from country i to country j in year t. The term \(D_{ijt}\) represents a collection of typical gravity covariates as described above. Following the gravity literature, exporter-year and importer-year fixed effects, denoted by \(\mu _{it}\) and \(\nu _{jt}\), respectively, are included to capture multilateral resistances (Hummels 1999; Feenstra 2002). These terms control for important structural factors that influence the inward and outward prices in each country. However, the inclusion of these fixed effects precludes the inclusion of several of the network covariates that are similarly defined at the country-year level. As such, only reciprocity, transitive triples, and cyclical triples are included in this model specification. The remaining network covariates reflecting the different degree measures are considered separately in a second stage regression, which I describe below. Finally, \(\epsilon _{ijt}\) is an error term. The model is estimated in its non-linear form using a PPML estimator, as suggested by Santos Silva and Tenreyro (2006). PPML offers two notable advantages. First, it allows for the inclusion of zero trade flows. Second, it helps to mitigate heteroskedasticity issues that are common in bilateral trade data.Footnote 4
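
As an illustration of how a specification like Eq. (4) can be fit, the sketch below implements a Poisson pseudo-maximum-likelihood estimator by iteratively reweighted least squares and applies it to simulated data. Everything in it is hypothetical: the function name `ppml`, the single cross-section of 12 countries, the simulated distance and reciprocity variables, and the parameter values. It uses explicit exporter and importer dummies, which is only practical for small samples. An application with 207 countries and 12 years would normally rely on dedicated software designed for high-dimensional fixed effects.

```python
import numpy as np


def ppml(X, y, tol=1e-10, max_iter=200):
    """Poisson pseudo-maximum likelihood via iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = (y + y.mean()) / 2.0            # strictly positive starting values
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        z = eta + (y - mu) / mu          # working dependent variable
        WX = X * mu[:, None]             # Poisson working weights equal mu
        beta_new = np.linalg.solve(X.T @ WX, WX.T @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated single-year example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 12
grid_i, grid_j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
keep = grid_i.ravel() != grid_j.ravel()              # drop i == j pairs
exp_id, imp_id = grid_i.ravel()[keep], grid_j.ravel()[keep]

ln_dist = rng.uniform(0.0, 2.0, size=exp_id.size)
recip = rng.integers(0, 2, size=exp_id.size).astype(float)
fe_exp = rng.normal(0.0, 0.3, size=n)
fe_imp = rng.normal(0.0, 0.3, size=n)
mu_true = np.exp(3.0 - 1.0 * ln_dist + 0.5 * recip + fe_exp[exp_id] + fe_imp[imp_id])
trade = rng.poisson(mu_true).astype(float)            # includes zeros naturally

# Exporter dummies for all countries, importer dummies with one dropped.
D_exp = (exp_id[:, None] == np.arange(n)).astype(float)
D_imp = (imp_id[:, None] == np.arange(1, n)).astype(float)
X = np.column_stack([ln_dist, recip, D_exp, D_imp])

beta_hat = ppml(X, trade)
fitted = np.exp(X @ beta_hat)
```

The first two elements of `beta_hat` are the estimated coefficients on log distance and reciprocity. Because the estimator sets the sum of residuals times each regressor to zero, the fitted exports of each country sum to its observed exports, which is the adding-up property that gives the fixed effects their structural interpretation.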

Table 2 Gravity model estimates of network influences on bilateral trade flows

Table 2 presents the results of the PPML estimation. Column (1) presents a baseline specification that includes only the standard gravity controls and country-year fixed effects. All estimates are consistent with typical estimates such as those presented in the gravity survey of Head and Mayer (2014). The one exception is colonial ties, which is not statistically significant in this case. Column (2) adds the three network covariates. The reciprocity term has a large, positive, and statistically significant impact on trade, suggesting that trade tends to be reciprocal and mutually reinforcing. A country is likely to export more to countries from which it also imports, and vice versa. The estimate indicates that reciprocal trade flows are about 537 percent higher on average than non-reciprocal flows.Footnote 5 The cyclical trade term is also positive and significant at the 90 percent level, implying that trade in three-way cycles tends to be larger. Each additional three-country cycle increases bilateral trade by about 0.55 percent. This outcome is consistent with the prominence of global supply chains in which products cycle through different stages of production in multiple different countries. The estimate for transitive trade is not significant. However, the transitive and cyclical measures are highly correlated due to the fact that both partially reflect general connectivity to other markets. To better investigate these covariates, columns (3) and (4) include each of these variables independently. In both cases, the variables are insignificant, suggesting that controlling for transitive patterns is important for identifying the effects of cyclical patterns. Finally, in all cases, the inclusion of these network factors has a negligible impact on the estimates of the standard gravity variables, suggesting that these variables are picking up new determinants of trade.
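
The percentage effects quoted above can be related to coefficients of an exponential model with a short calculation. The sketch below assumes the common conversion \((e^{\beta } - 1) \times 100\), which may or may not be the exact rule applied here. The coefficient values it produces are back-calculated from the percentages in the text purely for illustration and are not estimates reported in Table 2. The function names are hypothetical.

```python
import math


def pct_effect(beta):
    """Percent change in trade implied by a coefficient in an exponential model."""
    return (math.exp(beta) - 1.0) * 100.0


def implied_coef(pct):
    """Coefficient that would imply a given percent change (inverse of pct_effect)."""
    return math.log(1.0 + pct / 100.0)


# Back-calculated from the percentages quoted in the text (illustrative only).
beta_recip = implied_coef(537.0)   # roughly 1.85
beta_cycl = implied_coef(0.55)     # roughly 0.0055
```

Under this assumption, a 537 percent effect corresponds to a coefficient of roughly 1.85, while for small coefficients such as the cyclical term the percentage effect is close to 100 times the coefficient.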

In order to identify the effects of exporter and importer degree patterns, I conduct a second stage regression that estimates the relationship between the estimated country-year fixed effects and the country degree patterns.Footnote 6 In addition to highlighting the empirical connection between these factors, the fixed effects have structural interpretations in the gravity model because they reflect multilateral resistances (Fally 2015). Thus, relationships between degree patterns and the fixed effects can also be interpreted as relationships with multilateral resistances. The analysis is conducted using the fixed effect estimates from the specification presented in column (2) of Table 2. Specifically, I estimate the following model:

$$\begin{aligned} \phi _{kt} = \delta + \gamma _1 ln(GDP_{kt}) + \gamma _2 ln(GDPPC_{kt}) + \gamma _3 GATT_{kt} + \gamma _4 ln(REMT_{kt}) + \gamma _5 ID_{kt} + \gamma _6 OD_{kt} + \xi _{kt} \end{aligned}$$
(5)

The term \(\phi _{kt}\) represents either the exporter or importer fixed effects, \(\hat{\mu }_{jt}\) or \(\hat{\nu }_{jt}\), which are regressed in separate specifications. For the independent variables, I include several factors that are known from the literature to contribute to gravity model fixed effects. First, GDP is used to control for market size and output/expenditure, which are structural components of the gravity model that are absorbed by the fixed effects. Second, GDP per capita (\(GDPPC_{kt}\)), GATT/WTO membership (\(GATT_{kt}\)), and a measure of remoteness (\(REMT_{kt}\)) are included as factors that have long been used in the literature to proxy for aspects of multilateral resistance.Footnote 7 Third, the measures of inward and outward degrees are included (\(ID_{kt}\) and \(OD_{kt}\), respectively).Footnote 8 For the regressions using importer fixed effects, these terms are \(IID_{kt}\) and \(IOD_{kt}\). When using exporter fixed effects, they are \(EID_{kt}\) and \(EOD_{kt}\). Finally, \(\delta \) is a constant and \(\xi _{kt}\) is an error term. The model was estimated using OLS.

Because the dependent variable in the second stage is based on estimates derived from the first stage regression, additional care must be taken to account for errors introduced from the first stage. Lewis and Linzer (2005) provide a thorough discussion of the potential issues that can arise in these situations. They note that the use of estimated dependent variables in the second stage can introduce heteroscedasticity stemming from sampling error in the first stage. This may result in inconsistent standard errors in the second stage. As a remedy, Lewis and Linzer (2005) find that White’s (1980) heteroscedasticity-consistent standard error estimates provide reliable and consistent values, although such an approach may be inefficient. I follow this approach for the second stage gravity estimates and report heteroscedasticity-consistent standard errors.Footnote 9
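To make the second stage concrete, the following sketch estimates an OLS regression with White (1980) standard errors in their basic "HC0" sandwich form. It is an illustration under stated assumptions rather than the estimation code used for Table 3: the data are synthetic, the function and variable names are hypothetical, and only a subset of the regressors in Eq. (5) is included.

```python
import numpy as np

def ols_white(X: np.ndarray, y: np.ndarray):
    """OLS coefficients with White (1980) HC0 standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich estimator: X' diag(e_i^2) X
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Synthetic stand-in for the estimated fixed effects and country-year covariates
rng = np.random.default_rng(0)
n = 500
ln_gdp = rng.normal(10.0, 2.0, size=n)
in_degree = rng.integers(1, 150, size=n).astype(float)
out_degree = rng.integers(1, 150, size=n).astype(float)
X = np.column_stack([np.ones(n), ln_gdp, in_degree, out_degree])

# Heteroscedastic noise mimics sampling error carried over from the first stage
noise = rng.normal(0.0, 0.5 + 0.01 * in_degree)
phi_hat = X @ np.array([1.0, 0.8, 0.02, -0.01]) + noise

beta, se = ols_white(X, phi_hat)
for name, b, s in zip(["const", "ln_gdp", "in_degree", "out_degree"], beta, se):
    print(f"{name:>10s}  coef={b: .4f}  robust_se={s:.4f}")
```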

Table 3 Estimates of network influences on gravity model fixed effects

The second stage estimates, which are presented in Table 3, demonstrate that importer and exporter degree patterns have an influence on fixed effect estimates and multilateral resistance. Columns (1)–(3) depict the estimates for the exporter fixed effects (\(\hat{\mu }_{jt}\)) while columns (4)–(6) depict those for the importer fixed effects (\(\hat{\nu }_{jt}\)). Columns (1) and (4) provide baseline models that include the standard components GDP, GDP per capita, GATT/WTO membership, and remoteness. Recall that the fixed effects are increasing in a country’s level of openness to trade so that most of the standard terms are of the expected sign. GDP and GDP per capita both increase imports and exports in all specifications. However, both remoteness and GATT/WTO membership produce some estimates with unexpected signs. Nonetheless, these counterintuitive results are consistent with past literature. With regard to remoteness, Baldwin and Harrigan (2011) find positive effects of remoteness on trade when examining the likelihood of trade at the extensive margin. With regard to GATT/WTO membership, past literature has often found that GATT and WTO membership has had mixed empirical effects that are sensitive to the econometric specification in which they are estimated (cf. Larch et al. 2019a).

Columns (2) and (5) introduce the degree terms to the baseline models. The estimates highlight an interesting dynamic between import and export behavior. For the exporter fixed effects depicted in column (2), the estimates indicate that a country’s export openness tends to be positively related to the number of countries from which it imports but negatively related to the number of countries to which it exports. The estimates for the importer fixed effects in column (5) present a similar picture. Import values overall are increasing in the number of sources from which a country imports but decreasing in the number of destinations to which it exports. These estimates suggest that exporting is based on strong relationships with specific partners rather than a large number of weaker partnerships. By comparison, a country’s import values tend to be reinforced by the number of sources from which it imports. Finally, it is possible that remoteness and degrees are meaningfully related in this context, as both seek to measure connectedness, which may impact the estimates. To examine this potential overlap, columns (3) and (6) report specifications in which remoteness is omitted. The degree estimates are largely unaffected, suggesting that they are identifying unique aspects of global connectedness.

2.2 Probit model of the extensive margin

The previous section demonstrates the influences that network patterns have had on bilateral trade relationships through the lens of trade values and structural gravity estimation. In this section, I consider a second perspective that looks at the role of networks in the extensive margin of trade. While the modeling of trade as a binary occurrence is less prominent in the literature, it has a particular relevance when focusing on the relationships between network patterns and trade. Even though the PPML estimator has grown in popularity due in no small part to its ability to include zero trade flows in estimations, PPML gravity models are unable to predict actual zeros post estimation. Instead, PPML can only predict small trade values where actual trade data shows zero trade flows. For example, the standard gravity model presented in Table 2 predicts no literal zeros despite the fact that about 50 percent of the trade flows in the sample are zero. Thus, new insight into the determinants of zero flows provides valuable complementary information. Additionally, examining the extensive margin of trade also connects this work with much of the past trade and network research, which has similarly focused on the extensive margin of trade.

The analysis is conducted using a gravity probit model that regresses the existence of trade against the collection of gravity and network covariates used in the standard gravity model above. This approach follows the earlier extensive margin analyses of Helpman et al. (2008) and Baldwin and Harrigan (2011). Specifically, the model takes the following form:

$$\begin{aligned} \begin{aligned} T_{ijt} =&\; D_{ijt}\alpha + \beta _1 RECIP_{ijt} + \beta _2 TRAN_{ijt} + \beta _3 CYLC_{ijt} + \beta _4 EOD_{it} \\&\; + \beta _5 EID_{it} + \beta _6 IID_{jt} + \beta _7 IOD_{jt} + F(i,t) + H(j,t) + \epsilon _{ijt}. \end{aligned} \end{aligned}$$
(6)

In this case, the dependent variable \(T_{ijt}\) denotes the presence of trade rather than the value, taking a value of 1 if \(x_{ijt} > 0\) and zero otherwise. The other differences in this specification are the exporter and importer controls denoted here by the functions \(F(i,t)\) and \(H(j,t)\), respectively. These terms represent three different vectors of control variables that are used across different specifications: exporter-year and importer-year fixed effects; exporter, importer, and year fixed effects; and country-year level controls comprised of GDP, GDP per capita, and the remoteness measure. While the country-year fixed effects provide the best controls, they also conflict with the network degree terms. Thus, the two alternative controls are used in specifications that include network degrees. The models were all estimated using a probit estimator. However, given the use of extensive fixed effects in some specifications, additional linear probability model regressions were used to test the robustness of the estimates and are presented in the “Appendix”. In general, the linear probability model estimates are consistent with the probit estimates other than a few exceptions described below.
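As an illustration of how the binary dependent variable and some of the network covariates in Eq. (6) can be assembled from trade values, consider the following sketch for a single year. The trade matrix is hypothetical. The covariate constructions are assumptions based on the verbal descriptions in the text: reciprocity is taken to be the presence of the reverse flow and degrees are simple counts of partners. They may differ in detail from the definitions used for the estimates.

```python
import numpy as np

# Hypothetical trade values x[i, j]: exports from country i to country j
x = np.array([
    [0.0, 5.0, 0.0, 2.0],
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
])

T = (x > 0).astype(int)       # T_ij = 1 if x_ij > 0 and zero otherwise
recip = T.T                    # assumed RECIP_ij: 1 if j also exports to i
out_degree = T.sum(axis=1)     # number of destinations to which a country exports
in_degree = T.sum(axis=0)      # number of sources from which a country imports

# Long-format rows, one per ordered country pair (i exports to j)
rows = []
n = T.shape[0]
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        rows.append({
            "exporter": i, "importer": j,
            "T": int(T[i, j]), "RECIP": int(recip[i, j]),
            "EOD": int(out_degree[i]), "EID": int(in_degree[i]),
            "IID": int(in_degree[j]), "IOD": int(out_degree[j]),
        })

print(rows[0])
```

The resulting rows could then be passed to any standard probit routine, with the share of zeros in `T` indicating how much of the sample lies outside the intensive margin.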

Table 4 Probit model estimates of network influences on the extensive margin of trade

Table 4 presents the probit model estimates, which provide a similar picture of the significant but nuanced impacts of network patterns on trade. Column (1) provides a baseline specification that includes the gravity covariates and country-year fixed effects but does not include any of the network covariates. The estimates are largely consistent with the past literature and exhibit the expected effects on trade. Only contiguity, which is negative here, differs from typical gravity estimates but is consistent with the extensive margin findings in Helpman et al. (2008). Column (2) adds in the reciprocal, transitive triple, and cyclical triple terms while retaining the country-year fixed effects. As in the PPML gravity model, both the reciprocal and cyclical triple terms have a positive and significant effect on the extensive margin, implying that reciprocity and cyclicity increase the likelihood that two partners trade. Transitive triples have a significant negative impact on the extensive margin, suggesting that trade does not tend to form in transitive patterns. Curiously, this conflicts with the theoretical expectations of Chaney (2014), which suggest that firms use existing, transitive relationships to expand their exports. However, it should be noted that the present analysis uses aggregate trade rather than firm-level trade, which may explain these differences at least in part. Additionally, as discussed below, this estimate appears to be quite sensitive to the specification and should be treated with some caution.

Columns (3) and (4) introduce the exporter and importer degree terms. Because the degree terms are defined at the country-year level, the country-year fixed effects are forgone and are instead replaced with time-invariant exporter and importer fixed effects as well as year fixed effects. Columns (3) and (4) also provide two different treatments for the eight covariates included in the specification in column (2). The first, which is presented in column (3), re-estimates these covariates. The second, which is presented in column (4), constrains these estimates to the values that were derived using the more granular set of country-year controls from column (2). In most cases, these two approaches produced similar estimates, differing slightly in magnitude. The most notable difference is that the transitive triple term is positive and significant in column (3), and therefore more consistent with the predictions of Chaney (2014). The degree terms in both specifications depict a complex set of influences that differ in some ways from those estimated within the value-based gravity models above. The importer in- and importer out-degrees present the same pattern as before. The likelihood of importing from a particular source is increasing in the number of sources from which a country imports. Meanwhile, it is decreasing in the number of destinations to which an importer exports. By comparison, the pattern for exporters is reversed. The likelihood of a country exporting to a particular partner is increasing in the number of its export destinations overall. However, the likelihood is decreasing in the number of countries from which it imports. Thus, while there was a tendency for countries to export narrowly and import broadly when considering trade values, the same does not appear to be true at the extensive margin. Instead, countries tend to either export broadly and import narrowly, or vice versa, seemingly specializing as either well-connected importers or exporters.
This difference may imply that for importers, the extensive margin and intensive margin may look similar, with imports spread relatively evenly across the different sources. For exports however, these margins appear fairly different. It may be that having many export destinations helps to form new trading relationships in general but that most trade value tends to flow from exporters with relatively few destinations.

Columns (5) and (6) present an additional set of specifications that forgo the inclusion of country fixed effects in favor of the country controls used in the PPML fixed effect regressions: GDP, GDP per capita, and remoteness. Using these control terms reintroduces time variation at the country level but controls for a narrower set of influences. As before, column (5) re-estimates all covariates while column (6) constrains several values based on the estimates in column (2). The use of the country controls in place of fixed effects has only a modest impact on the other estimates. The signs of almost all the other covariates are unchanged and the magnitudes differ by only a small amount. In particular, the exporter and importer degree patterns are the same under these alternative country controls, providing additional evidence of their robustness. The most notable difference compared to the other specifications is that the estimate for transitive triples in column (5) is negative, further demonstrating the sensitivity of that estimate to the model specification.

To test the robustness of these findings, I estimate the same specifications using a linear probability model. This is motivated by the inclusion of extensive fixed effects in several of the specifications, which could result in incidental parameter problems for those estimates. The linear probability estimates, which are presented in Table 7 in the “Appendix”, are largely consistent in sign with the probit estimates. In particular, the patterns of the importer and exporter degree terms are robust to linear probability estimation. The most notable exceptions are that contiguity and trade agreements become significantly negative in most specifications. Additionally, the transitive triple term is less sensitive to the specifications when using the linear probability model as it is negative across all specifications. By comparison, the cyclical triple term shows additional sensitivities as it fluctuates between positive, negative, and insignificant across the specifications. Despite these inconsistencies between the two estimators, I believe that the probit estimates provide better insight as the probit estimator is better suited for estimating the binary outcomes of the extensive margin of trade. Further, I see the fact that the estimates for trade agreements are more sensible in the probit specifications as additional support.

Several recent papers examining the extensive margin of trade have used the Flex estimator proposed by Santos Silva et al. (2014). The Flex model represents an effective means of estimating the determinants of the extensive margin when the margin is defined in terms of the number of different products traded instead of as a binary measure. As a robustness exercise, I perform a complementary analysis using the Flex estimator and this alternative concept of the extensive margin. The details of the model and results are presented in the “Appendix”. The Flex model results are mostly consistent with those of the probit model.Footnote 10 In particular, the estimates for the network terms agree in sign in all cases and demonstrate that the network influences are similar for both binary trade and the number of products traded.

Together, the PPML and probit models indicate that the complex patterns in the world trade network are influential in ways that are not obvious from a typical gravity model. These findings provide motivation for exploring these dependencies in an analytical framework that is specifically designed to model the structure of the network. The next section describes such a framework.

3 Exponential random graph models of trade

As demonstrated in the previous section, global trade patterns depend on the structure of the trade network. A country’s trade decisions are impacted by its position in the network and the patterns of which it is a part. The gravity approaches described above provide one way to identify and estimate the impact of these types of network patterns. This section describes an alternative network approach called ERGM analysis. This alternative approach provides a modeling framework specially developed to explain the formation of complex trade patterns and network dependencies. Thus, the modeling of trade networks through such an approach offers a means to estimate aspects of the trade network that are not readily identified in a gravity framework.

3.1 Exponential random graph models

A network consists of a collection of nodes and links that indicate relationships between these nodes. A significant motivation for the use of networks stems from the fact that this broad framework can be used to express a wide variety of economic environments in which the pattern by which agents relate to one another has a consequential bearing on behavior. Within the context of international trade, networks can be used to describe complex trading relationships in which trading partners are represented by nodes in the networks and links can be used to describe a wide range of relationships such as trade flows, common languages, and shared borders. By studying the structure of these networks, considerable information can be gained about the patterns of trade.

A network G can be represented mathematically with relative ease. Let N denote the set of nodes in a network and \(n_i\in N\) denote a specific node within that set. Nodes are connected by links \(x_{ij}\) such that \(x_{ij}\) exists if there is a link extending from node \(n_i\) to node \(n_j\). Networks can be unweighted, in which case \(x_{ij} = 1\) indicates the presence of a link and \(x_{ij} = 0\) indicates its absence. They can also be weighted, in which case \(x_{ij}\) specifies not only the existence of a link but its value. Furthermore, a network can be either directed, in which case links \(x_{ij}\) and \(x_{ji}\) are distinct, or undirected, in which case \(x_{ij} \equiv x_{ji}\).

In the context of international trade, networks exhibiting a variety of these characteristics are common. For example, the extensive margin of trade could be sufficiently modeled using an unweighted network in which links represent the existence of trade between partners. However, a study of the intensive margin of trade would require the use of weighted networks in which links describe the actual volume of trade between both partners. In both cases, the network would generally need to be directed because exports from country i to country j are distinct from the exports from j to i. By comparison, a network depicting the presence of a shared common border between partners could be described by an undirected network.
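These distinctions map naturally onto adjacency matrices. The following sketch uses hypothetical data for three countries to show a weighted directed network of trade values, the unweighted directed network of trade links derived from it, and an undirected network of shared borders. The helper `is_undirected` is an illustrative name, not part of any library.

```python
import numpy as np

countries = ["A", "B", "C"]  # hypothetical country labels

# Weighted, directed: trade_value[i, j] is the value exported from i to j
trade_value = np.array([
    [0.0, 5.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])

# Unweighted, directed: x_ij = 1 if any trade flows from i to j
trade_link = (trade_value > 0).astype(int)

# Unweighted, undirected: a shared border implies x_ij = x_ji
shared_border = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])

def is_undirected(adj: np.ndarray) -> bool:
    """A network is undirected when its adjacency matrix is symmetric."""
    return bool(np.array_equal(adj, adj.T))

print(is_undirected(trade_link), is_undirected(shared_border))
```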

It may also be the case that a set of nodes N is related by more than one network. For example, countries are linked through a considerable number of possible networks, such as trade, common languages, or regional trade agreements. In what follows, these different networks will be denoted using alternative variables to represent links in each network. For example, the set of links X and Y may be used to denote trade flows and common language ties, respectively.

In addition to a range of different types of links that exist between nodes, nodes may also feature node-specific characteristics. For each node \(n_i\), there may exist a corresponding set of traits \(Q_i\) with typical elements \(q_i^\rho \). If the nodes represent countries, the set of node traits may include information such as GDP, GDP per capita, or WTO membership. One motivation for including node characteristics is that it allows for the study of “social” influences. For example, countries belonging to the World Trade Organization may be expected to trade more with other members than with non-members.

Given the unique ability of network structures to convey numerous dimensions of information, they lend themselves to a variety of powerful analytical options. ERGMs are one such way in which to study the structure of networks by identifying the specific aspects of a network that result in the likely formation of the networks that are ultimately observed. Beginning with the seminal work of Frank and Strauss (1986), ERGMs have become increasingly popular in the analysis of networks, predominantly in the areas of psychology, sociology, and statistics. More recent work, such as that by Wasserman and Pattison (1996), Snijders (2002), Robins et al. (2007), and Lusher et al. (2013), has expanded on this framework and created a robust set of analytical tools with which to study networks.

The ERGM methodology views a network as a realization of a random variable. Networks are drawn from a distribution of possible networks such that the distribution is dependent on certain network attributes that will be described in greater detail shortly. Given these attributes and the implied distribution, some networks are more likely than others. Statistical inference on a particular observed network is possible by estimating the characteristics of the underlying distribution that lead to the realization of the observed network. Specifically, the distribution parameters that result in the observed network being the most likely network to have been formed are sought.

Following the definitions presented in Robins et al. (2007) and Lusher et al. (2013), an ERGM specifies the probability of a particular network realization g in the following way.

$$\begin{aligned} Prob(G = g) = \frac{1}{\kappa (\theta )} \exp {\left( \sum _i \theta _i z_i(g)\right) } \end{aligned}$$
(7)

The probability is given by an exponential function of parameters \(\theta \) and network attributes \(z_i\). The network attributes are selected based on the assumed conditional dependencies in the model. For example, one such dependency might be mutual ties reflecting a reciprocal relationship. In this case, the attribute \(z_{recip}\) would be equal to the total number of reciprocal ties in the network. The parameters \(\theta \) indicate the relative weight of each network attribute. In the example of reciprocal ties, a large positive parameter value would indicate that networks with many reciprocal ties are more likely and that the likelihood of an individual link forming is marginally higher if it completes a reciprocal relationship. Following the work of Frank and Strauss (1986), a homogeneity assumption is generally included with respect to the parameters and attributes. Homogeneity assumes that all linking patterns of the same type have the same effect. To illustrate, it assumes that the tendency for a reciprocal tie to form between two nodes \(n_i\) and \(n_j\) is identical to the tendency for a reciprocal tie to form between any other pair of nodes. Finally, the function \(\kappa (\theta )\) is a normalizing coefficient that ensures that the distribution is a proper probability distribution.
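For a very small network, Eq. (7) can be evaluated directly by enumerating every possible network. The sketch below does so for a directed network of three nodes using two attributes, the number of links and the number of reciprocal ties. The parameter values and function names are hypothetical and chosen only for illustration.

```python
import itertools
import math
import numpy as np

def stats(adj: np.ndarray) -> np.ndarray:
    """Attributes z(g): number of links and number of reciprocal ties."""
    edges = adj.sum()
    mutual = np.trace(adj @ adj) // 2
    return np.array([edges, mutual], dtype=float)

def all_directed_networks(n: int):
    """Yield every unweighted directed network on n nodes (no self-links)."""
    dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in itertools.product([0, 1], repeat=len(dyads)):
        adj = np.zeros((n, n), dtype=int)
        for (i, j), b in zip(dyads, bits):
            adj[i, j] = b
        yield adj

def normalizing_coefficient(theta: np.ndarray, n: int) -> float:
    """kappa(theta): sum of exp(theta'z(g)) over all possible networks."""
    return sum(math.exp(theta @ stats(h)) for h in all_directed_networks(n))

def ergm_probability(g: np.ndarray, theta: np.ndarray, kappa: float) -> float:
    return math.exp(theta @ stats(g)) / kappa

theta = np.array([-1.0, 2.0])  # hypothetical weights on links and reciprocity
kappa = normalizing_coefficient(theta, 3)

g_mutual = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])  # two links, reciprocal
g_asym = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])    # two links, not reciprocal

print(ergm_probability(g_mutual, theta, kappa) > ergm_probability(g_asym, theta, kappa))
```

With a positive weight on reciprocity, the network containing the reciprocal pair receives a higher probability than the network with the same number of links arranged without reciprocity.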

In specifying the model, assumptions about pair-wise dependency must be made. These assumptions are incorporated by including network attributes that measure the assumed type of dependencies. Wasserman and Pattison (1996) and Lusher et al. (2013) provide extensive discussions of typical network attributes used in ERGM analysis. I describe a selected few below. In general, these network attributes can be arranged into two groups of attribute types: topological attributes and social selection attributes.

The topological attributes describe specific patterns of links within the network. Typical examples include a measure of density, degrees, triangles and triples, or reciprocal ties. Density reflects the number of links in the network, relative to the number of possible links, and indicates whether the network is generally well connected or sparsely connected. As before, node degree describes the number of nodes to which each node is connected and may convey information about the importance of certain nodes and other notions of centrality.Footnote 11 Triangles and triples describe patterns of relationships between three nodes.Footnote 12 Reciprocal ties indicate pairs of nodes that both link to one another, indicating a reciprocal relationship. The use of these types of topological attributes allows for the explicit inclusion of many different types of network dependence in ERGMs. Within the context of international trade, it allows for an explicit description of the ways in which the exports from one partner to another are affected by the other trade relationships of countries.
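Several of these topological attributes can be computed directly from the adjacency matrix of an unweighted directed network. The sketch below uses a small hypothetical network and assumes the conventional definitions of a cyclic triple (\(i \rightarrow j \rightarrow k \rightarrow i\)) and a transitive triple (\(i \rightarrow j\), \(j \rightarrow k\), and \(i \rightarrow k\)), which may differ in detail from the definitions given in the footnotes.

```python
import numpy as np

# Hypothetical directed network with links 0->1, 0->2, 1->0, 1->2, 2->0
A = np.array([
    [0, 1, 1],
    [1, 0, 1],
    [1, 0, 0],
])
n = A.shape[0]

density = A.sum() / (n * (n - 1))      # links relative to possible links
out_degree = A.sum(axis=1)             # links extending from each node
in_degree = A.sum(axis=0)              # links extending to each node

A2 = A @ A
reciprocal_ties = np.trace(A2) // 2    # pairs with links in both directions
cyclic_triples = np.trace(A2 @ A) // 3 # each cycle is counted once per node
transitive_triples = (A2 * A).sum()    # two-step paths closed by a direct link

print(density, reciprocal_ties, cyclic_triples, transitive_triples)
```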

The social selection attributes, by comparison, are based on aspects of the network beyond the pattern of links. It is through these attributes that secondary networks or node characteristics can influence the formation of links. Common social selection attributes include measures of homophily, sender effects, and receiver effects. Homophily refers to the possibility that nodes tend to link to similar nodes with a higher likelihood. Sender and receiver effects indicate whether certain unilateral characteristics affect the number of links extending from or to a node, respectively. In the context of international trade, social selection attributes can be included to model traditional trade determinants such as GDP, common languages, preferential trade agreements, or other country specific effects.

A well-specified ERGM is one in which the set of attributes fully accounts for the expected dependencies across nodes. One of the benefits of this modeling structure is there is a considerable amount of flexibility with regard to model construction. For example, Lusher et al. (2013) and Robins et al. (2007) describe two common dependency structures. The first is a Bernoulli random graph in which all links are assumed to be independent of one another. This assumption represents what is essentially the simplest possible structure where link formation is not dependent on any other links in the network. The model itself simply specifies the set of attributes \(\mathbf {z}\) as consisting of only a measurement of the number of links in the network. The second example is a Markov graph, which incorporates more significant dependency assumptions. A Markov random graph assumes that a link between two nodes is dependent on all links connecting to or from those nodes. The set of attributes for a Markov graph typically includes the number of edges, triples or triangles, reciprocal ties, and a range of degrees of different values. In addition to these two parameterizations, contemporary ERGM models offer a wide variety of possible attributes that can be selected based on the underlying assumptions of dependence within the network being modeled.

A typical objective in ERGM analysis is the empirical estimation of the model, which is an effective means by which to draw statistical inference from network data. When estimating an ERGM, the process begins with an observed network such as the network of trade between countries for a given year. An ERGM is specified given the assumed dependencies within the model. The objective is to estimate parameter values \(\theta \) of the ERGM such that the observed network is the maximally likely network to have formed given the distribution of all possible networks. The estimated parameters provide information as to the relative importance of each attribute in the observed network and indicate the types of network relationships that are important.

In what follows, the estimation procedures described will be limited to unweighted networks. Similar work on weighted networks is arising in the literature (see, for example, Krivitsky (2012) and Desmarais and Cranmer (2012)), but is less developed than the literature and procedures for unweighted networks.

Estimation of the parameters is essentially a maximum likelihood problem. The desired estimates are those that make the observed network the most likely to be observed. One method of estimating these parameters is to use standard maximum likelihood techniques on Eq. (7). However, doing so requires the computation of the normalizing coefficient \(\kappa (\theta )\), which is contingent on the sample space consisting of all possible networks. This poses a computational problem for even relatively small networks where the magnitude of the set of all possible networks is \(2^{|N|(|N|-1)}\) for directed networks or \(2^{|N|(|N|-1)/2}\) for undirected networks. As such, standard maximum likelihood approaches are infeasible for even modestly sized networks.
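The scale of the problem is easy to verify. An unweighted directed network has one potential link per ordered pair of distinct nodes, and an undirected network has one per unordered pair, so the counts grow as powers of two in those numbers. The function names and the network sizes below are illustrative assumptions.

```python
def n_directed_networks(n: int) -> int:
    """Number of unweighted directed networks on n nodes (no self-links)."""
    return 2 ** (n * (n - 1))

def n_undirected_networks(n: int) -> int:
    """Number of unweighted undirected networks on n nodes."""
    return 2 ** (n * (n - 1) // 2)

for n in (3, 5, 10, 20):
    digits = len(str(n_directed_networks(n)))
    print(f"n={n:>2d}: directed count has {digits} digits")
```

Even a network of 20 nodes, far smaller than a typical trade network, has a sample space too large to enumerate.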

As an alternative, Strauss and Ikeda (1990) and Wasserman and Pattison (1996) describe a modified approach that utilizes a maximum pseudo-likelihood technique. The original ERGM specification given by equation (7) can be reformulated as a logit model in terms of individual link formation. If \(x^c_{ij}\) denotes the complement of link \(x_{ij}\) (that is, the set of all other links excluding \(x_{ij}\)), \(g_{+ij}\) denotes the network g with the addition of link \(x_{ij}\), and \(g_{-ij}\) denotes the network g with link \(x_{ij}\) removed, then a logit function for the ERGM can be written

$$\begin{aligned} \ln \left( \frac{Pr(x_{ij} = 1| x^c_{ij})}{Pr(x_{ij} = 0| x^c_{ij})}\right) = \varvec{\theta '}\left[ \mathbf {z}(g_{+ij}) - \mathbf {z}(g_{-ij})\right] . \end{aligned}$$
(8)

The logit function models the log odds of individual link formation contingent on the rest of the network. By doing so, the normalizing coefficient is eliminated from the model, making computation easier. Estimation of the logit function using maximum pseudo-likelihood techniques requires the computation of the change statistic \(\left[ \mathbf {z}(g_{+ij}) - \mathbf {z}(g_{-ij})\right] \), which describes how each attribute changes as a specific link is added or removed from the network, but is generally feasible. However, while maximum pseudo-likelihood estimation of this logit function has the advantage of being readily computed using standard statistical tools, it suffers from a general concern that its estimation results in biased estimates and potentially poor approximations of the standard errors (see Robins et al. (2007) and Snijders (2002)). For these reasons, maximum pseudo-likelihood estimation has largely been replaced by Monte Carlo estimation methods based on (8).
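The change statistic and the conditional log odds in Eq. (8) can be illustrated with a short sketch. It uses the number of links and the number of reciprocal ties as attributes, and the network, parameter values, and function names are hypothetical.

```python
import numpy as np

def stats(adj: np.ndarray) -> np.ndarray:
    """Attributes z(g): number of links and number of reciprocal ties."""
    return np.array([adj.sum(), np.trace(adj @ adj) // 2], dtype=float)

def change_statistic(adj: np.ndarray, i: int, j: int) -> np.ndarray:
    """z(g_{+ij}) - z(g_{-ij}): effect of link x_ij on each attribute."""
    plus, minus = adj.copy(), adj.copy()
    plus[i, j] = 1
    minus[i, j] = 0
    return stats(plus) - stats(minus)

def conditional_link_probability(adj, i, j, theta) -> float:
    """Pr(x_ij = 1 | rest of the network), from the log odds in Eq. (8)."""
    log_odds = float(theta @ change_statistic(adj, i, j))
    return 1.0 / (1.0 + np.exp(-log_odds))

g = np.zeros((3, 3), dtype=int)
g[1, 0] = 1                      # the only existing link runs from node 1 to node 0
theta = np.array([-1.0, 2.0])    # hypothetical weights on links and reciprocity

# A link from 0 to 1 would complete a reciprocal tie; a link from 0 to 2 would not
print(change_statistic(g, 0, 1), conditional_link_probability(g, 0, 1, theta))
print(change_statistic(g, 0, 2), conditional_link_probability(g, 0, 2, theta))
```

Stacking the change statistics for every ordered pair and fitting a logistic regression of \(x_{ij}\) on them is, in essence, the maximum pseudo-likelihood approach described above.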

Most recent work on ERGM estimation has utilized Markov Chain Monte Carlo (MCMC) maximum likelihood estimation. A brief summary of this process is included here but Snijders (2002) and Lusher et al. (2013) provide more detailed descriptions of the methodology. On a basic level, MCMC techniques are used in order to generate a sampling distribution of networks that can then be used for statistical inference. Parameter values are proposed and the MCMC process generates a chain of network realizations with the hope that the sequence of networks converges to a distribution of networks such that the observed network is centered within the distribution and represents the most likely network that could have formed.

The process begins with the selection of initial parameter values \(\varvec{\hat{\theta }}^0\).Footnote 13 Next, an arbitrary network \(g^0\) is initialized as a starting point for the simulation process. A sequence of networks is generated through a stochastic process in which a single link \(x^t_{ij}\) is selected at random at each step along the sequence. The current network \(g^{t-1}\) is altered with respect to this one link such that the link is added if \(x^{t-1}_{ij} = 0\) or removed if \(x^{t-1}_{ij} = 1\), resulting in a new proposed network \(g^*\). The two potential ensuing networks \(g_{+ij}\) and \(g_{-ij}\) are compared and the alteration to \(x_{ij}\) is accepted if the resulting network is sufficiently likely to occur given the previous network. This process typically employs a Metropolis-Hastings algorithm in which the proposed network is evaluated according to a Hastings ratio such that the proposed network is accepted with probability

$$\begin{aligned} \min \left\{ 1, \frac{Pr_\theta (g^*)}{Pr_\theta (g^{t-1})} \right\} . \end{aligned}$$
(9)

The Metropolis-Hastings algorithm always accepts the proposed network if it is more likely than the previous network. If it is less likely than the previous network, the proposal is accepted with a probability equal to the likelihood ratio in (9), so that less likely proposals are accepted less often. The Hastings ratio can be generated using essentially the same logit model as described above in equation (8) and is based on the initial parameter values and the resulting change statistics.
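
The toggling step and the acceptance rule in (9) can be sketched as follows. This is a minimal illustration under simplifying assumptions: the model contains only a links term and a reciprocity term, the parameter values are hypothetical, and the function names are introduced here for exposition. It is not the sampler implemented in statnet.

```python
import numpy as np

def change_statistic(adj, i, j):
    """Change in [links, reciprocated pairs] when link i->j is present rather than absent."""
    return np.array([1.0, float(adj[j, i])])

def mh_step(adj, theta, rng):
    """Propose toggling one random link and accept with probability min{1, ratio}."""
    n = adj.shape[0]
    i, j = rng.choice(n, size=2, replace=False)    # random ordered pair with i != j
    delta = change_statistic(adj, i, j)
    # Adding the link multiplies Pr(g) by exp(theta . delta); removing it divides by that.
    # The normalizing term cancels in the ratio, so it is never computed.
    sign = 1.0 if adj[i, j] == 0 else -1.0
    log_ratio = sign * float(theta @ delta)
    if rng.random() < np.exp(min(0.0, log_ratio)):
        adj[i, j] = 1 - adj[i, j]                  # accept the proposed network g*
    return adj

rng = np.random.default_rng(0)
theta = np.array([-1.0, 1.5])                      # hypothetical parameter values
adj = np.zeros((10, 10), dtype=int)                # arbitrary starting network g^0
for _ in range(5000):
    adj = mh_step(adj, theta, rng)
```

Working with the log of the ratio avoids numerical overflow when the change statistics are large.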

This Markov process governed by the Metropolis-Hastings algorithm generates a sequence of T-many networks with the intention of creating a sampling distribution. This Monte Carlo procedure typically includes a burn-in period following the initialization of the starting network that omits the first r-many networks generated so as to eliminate any memory of the starting network. Because the sampling distribution is generated one link at a time, significant autocorrelation tends to arise between subsequent networks in the sequence. To mitigate this autocorrelation, MCMC procedures typically use thinning methods that only include every s-th network in the sampling distribution. All other networks contained within the interval of s-many networks are excluded. Thus, the ultimate sampling distribution consists of the networks \(\{g^{r}, g^{r+s}, g^{r+2s},\ldots , g^{T}\}\) so that there is limited autocorrelation within the sequence.
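
The burn-in and thinning rules amount to a simple selection of indices from the chain. The short sketch below uses hypothetical values of T, r, and s, chosen so that T - r is a multiple of s, and represents each network by its position in the sequence.

```python
def thin_chain(chain, burn_in, interval):
    """Drop the first `burn_in` draws, then keep every `interval`-th draw."""
    return chain[burn_in::interval]

# Hypothetical chain g^0, ..., g^T, represented here by step indices only.
T, r, s = 1000, 200, 100
chain = list(range(T + 1))
kept = thin_chain(chain, r, s)   # indices r, r+s, r+2s, ..., T
```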

Following the Monte Carlo simulation process, the resulting sample of networks is compared to the observed network in order to determine if the model and initial parameter values are a good fit. If the estimation was successful, the distribution of sample networks ought to have attribute distributions centered around the attributes present in the observed network. If this holds, the parameter values are those that make the observed network the most likely network that could have formed, thereby solving the underlying maximum likelihood problem. If, however, the sampling distribution is not acceptably centered around the observed network, alterations are made to the initial parameter values \(\varvec{\hat{\theta }}^0 \) and the process is repeated in subsequent iterations using updated sets of parameter values (\(\{\varvec{\hat{\theta }}^1, \varvec{\hat{\theta }}^2, \ldots \}\)) until a satisfactory set of parameter values is found. Once an accurate set of parameter values is identified, the goodness of fit is tested by simulating a collection of additional networks using the estimated parameters and checking that they suitably replicate the desired features of the observed network. The model and estimated parameter values are said to fit well if the networks from the simulated sample share the same characteristics on average as the observed network, for example a similar average number of links or a similar distribution of degrees.

If these diagnostic tests are satisfied, the final parameter estimates \(\varvec{\hat{\theta }}\) may be accepted and the estimation procedure is concluded. The estimates can then be used to describe dyadic dependencies within the model. The estimates themselves can be interpreted in terms of log odds as in equation (8). The log odds of a link \(x_{ij}\) forming depends on its relative position in the network. Suppose, for example, that by forming link \(x_{ij}\), the link represents an additional link, a reciprocal tie, and completes a cyclical triple. The log-odds of that link forming would be equal to \(\theta '_{links} + \theta '_{reciprocal} + \theta '_{c-triple}\). In general, the sign and magnitude of each coefficient can be used to describe the relative importance of each modeled attribute and respective dependency. Positive estimates identify the network relationships that are likely to promote link formation while negative coefficients describe those that tend to deter link formation. The magnitude of the estimates further specifies the strength of these dependencies. Thus, using this information, a more complete understanding of the interrelationships in the network can be attained.

In recent years, several popular software packages have emerged that facilitate the estimation of a wide range of ERGM specifications. Two of the most popular are statnetFootnote 14 and PnetFootnote 15. The work described in the remainder of this paper utilizes the statnet software. The statnet suite is a package written in R containing a variety of tools for analyzing networks (Handcock et al. 2003). In addition to providing powerful ERGM estimation procedures, it also includes tools to perform other network oriented tasks such as graphing procedures and the generation of network descriptors. For additional information on the use of statnet, see Goodreau et al. (2008) and Handcock et al. (2008).

3.2 ERGM estimation of international trade flows

In order to study the properties of the international trade network and its impact on trade patterns, I estimate ERGMs using bilateral trade data for the international trade networks of 1995 and 2006. The data is the same as that used for the gravity analysis described in the previous section. The ERGM methods face some practical limits compared to the gravity approaches due to the computational intensity of the estimation procedure. Because of this, two different ERGM specifications are estimated. One specification considers the full sample of countries and trade flows in each year but faces some limitations in the types of network attributes that can be included. The second specification uses a smaller sample with fewer countries, which permits the estimation of additional network dependencies. In this second case, the number of countries is reduced to a subset of the 50 largest trading countries by total exports and imports, which reflects over 90 percent of global trade in 1995 and 2006.Footnote 16 A challenge with reducing countries to the largest traders is that at the aggregate level, nearly all of these countries trade with one another, resulting in a network in which nearly all possible links are present and there is almost no variation in the network structure. To avoid this issue, I consider sectoral trade instead of aggregate trade. Specifically, I use trade flows from HS chapter 36 as a case in point, which contains explosives, pyrotechnic products, matches, pyrophoric alloys, and certain combustible preparations (hereafter referred to simply as “explosives”). This collection of goods was chosen because about 50 percent of possible trade links were present in each year, providing a moderately dense, pattern rich trade network to study. However, the approach could be equally applied to other sectors. As a robustness exercise, I perform similar analyses using four other sectors. The details of these models are presented in the “Appendix” and discussed in Sect. 4. An added advantage of looking at less aggregated sectoral trade is that it more closely relates to much of the existing trade network theory, which largely describes firm-level relationships.

The dependency specifications for the full sample and partial sample are based on traditional gravity theory and take the following form.

$$\begin{aligned} \text {Full Sample:}\quad \mathbf {z} &= {} \theta _1 z_{edges} + \theta _2 z_{recip} + \theta _3 z_{gdp} + \theta _4 z_{dist} + \theta _5 z_{lang} + \theta _6 z_{cntg} + \theta _7 z_{rta} + \theta _8 z_{mrt}, \end{aligned}$$
(10)
$$\begin{aligned} \text {Partial Sample:}\quad \mathbf {z} &= {} \theta _1 z_{edges} + \theta _2 z_{recip} + \theta _3 z_{gdp} + \theta _4 z_{dist} + \theta _5 z_{lang}\nonumber \\&+ \theta _6 z_{cntg} + \theta _7 z_{rta} + \theta _8 z_{mrt} + \theta _9 z_{gwesp}. \end{aligned}$$
(11)

Under both specifications, the world trade network is assumed to be dependent on two topological attributes: the number of links in the network (\(z_{edges}\)) and reciprocal links (\(z_{recip}\)). The topological attributes condition the estimation on matching the expected number of trading relationships present in the network and the number of reciprocal relationships. Computational feasibility limited the inclusion of additional topological attributes when using the full sample. The partial sample, however, permitted the inclusion of an additional term, denoted \(z_{gwesp}\), which reflects the geometrically weighted edge-wise shared partner distribution (GWESP). The GWESP distribution is a method of modeling triangle patterns that reflect cases in which two trading countries share a common third trading partner. The attribute takes the form of a nonlinear distribution that captures the range and proportion of country-pairs with different numbers of shared partners. The parameters of the distribution, which are estimated, shape the patterns of shared partners across the network.Footnote 17
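
To make the GWESP attribute concrete, the sketch below computes the statistic in its commonly used form, \(e^{\alpha } \sum _k \left[ 1 - (1 - e^{-\alpha })^k\right] EP_k\), where \(EP_k\) is the number of links whose endpoints share exactly k partners and \(\alpha \) is the decay parameter. The directed convention used here, which counts third countries k with links i to k and k to j, is an assumption of this sketch, as are the function names and the toy network.

```python
import numpy as np

def edgewise_shared_partners(adj):
    """For each link i->j, count third nodes k with i->k and k->j (assumed convention)."""
    two_paths = adj @ adj                  # two_paths[i, j] = number of k with i->k->j
    return two_paths[adj == 1]

def gwesp(adj, decay):
    """Geometrically weighted edge-wise shared partner statistic."""
    sp = edgewise_shared_partners(adj)
    # Each link contributes e^a * (1 - (1 - e^-a)^k), where k is its shared partner count.
    weights = np.exp(decay) * (1.0 - (1.0 - np.exp(-decay)) ** sp)
    return float(weights.sum())

# Toy network: link 0->3 has two shared partners (nodes 1 and 2); other links have none.
adj = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)]:
    adj[i, j] = 1

value = gwesp(adj, decay=0.5)              # equals 2 - exp(-0.5) for this network
```

A link with one shared partner contributes exactly 1 to the statistic, and each further shared partner adds a weight governed by the decay parameter.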

The specifications also assume that the trade network is dependent on several social selection attributes. Specifically, it is assumed to depend on each country’s GDP (\(z_{gdp}\)), their physical distance (\(z_{dist}\)), common languages (\(z_{lang}\)), contiguous borders (\(z_{cntg}\)), and preferential trade agreements (\(z_{rta}\)). The specifications also include information on multilateral resistance. The multilateral resistance term \(z_{mrt}\) is defined as the product of the estimated importer and exporter fixed effects for each pair of countries, which should help capture unobserved trade costs within the ERGM model. For the fixed effect values, I use the estimates from the specification presented in column (1) of Table 2.Footnote 18 The social selection attributes were selected to mirror a standard gravity model. In the case of GDP, the attribute measures whether the GDPs of the exporting and importing countries affect their likelihood of trading. If nodes with greater GDPs trade at a higher frequency in the observed trade networks, then this attribute will exhibit a positive coefficient. The remaining attributes assume that the world trade network is dependent on a series of other networks entirely. Similar to the work of Pan (2018) and Smith et al. (2019), distance, common language, contiguity, RTAs, and multilateral resistance each represent secondary networks composed of the same countries. The model assumes that the world trade network is dependent on these secondary networks such that each coefficient reflects the covariance between a link in the trade network between two countries and the presence, absence, or weight of a corresponding link in the secondary network.

The ERGM estimation results, which are presented in Table 5, further demonstrate the importance of complex trade patterns in determining bilateral trade. Columns (1) and (2) depict the estimates using the full trade networks of 1995 and 2006, respectively. Columns (3) and (4) depict those for the partial sample of explosives in 1995 and 2006, respectively. The edges term acts as a constant that determines the baseline likelihood of a link forming. The negative value of the edges term can be thought of as being reflective of baseline trade costs before accounting for other factors like distance, trade agreements, or economic size. The reciprocal term, in most cases, is positive, indicating that link formation is more likely if the countries are already connected by a link in the opposite direction. This finding is consistent with the estimates in the two gravity models. Curiously, the reciprocal term is not significant in the 2006 partial sample network. This suggests that mutual trade was not a key feature of that network and may indicate that traders of explosives have shifted towards either importing or exporting those goods but not both.

Table 5 ERGM estimates of world trade networks

The social selection attributes largely match the standard findings from the gravity literature. The ERGMs find that countries with large GDPs, that share a border, and that belong to a trade agreement are more likely to trade. The one exception is that the estimate for the common language term is significantly negative for the full sample network in 1995, indicating that countries with shared languages were less likely to trade within that network. The estimates for the multilateral resistance terms are also consistent with expectations from gravity. Recall that in the gravity model from which they were derived, bilateral trade is increasing in the size of the fixed effect estimates. As such, the estimates effectively represent the general inward and outward draws of each importer and exporter, respectively. The ERGM estimates find that the likelihood of a link forming is positively correlated with the combined effect of these two forces.

Finally, the GWESP terms included in the partial sample models indicate that the likelihood of forming links is monotonically increasing in the number of triangles that the new link would close. Table 5 reports two estimates for the \(z_{gwesp}\) term: an estimate of the effect and an estimate of the decay, which shapes the underlying distribution. For both networks, the main estimate is positive, implying that countries are more likely to trade if they share third-party trading partners. Additionally, the positive estimates for the shape parameter imply that the marginal effect of an additional shared partner is increasing in the number of shared partners, which is not surprising given that the most significant traders tend to be connected to many markets.Footnote 19

Across the four models presented in Table 5, there are some notable similarities and differences. For the edges term, the effects are much larger for the partial sample, implying a higher baseline barrier to trade formation. With regard to reciprocal trade, the models find that it is a large, positive, and significant determinant for the full sample of aggregate trade but not for the partial sample of explosives trade, suggesting that the latter is based more on one-directional relationships from producing countries to buying countries. The standard gravity variables present similarly nuanced trends. GDP and distance tended to matter more at the aggregate level than for explosives while the opposite is true for language and contiguity. The results for trade agreements are mixed with the smallest and largest impacts being on 1995 aggregate trade and 2006 aggregate trade, respectively. This may be explained by the rapid formation of new trade agreements during that time. Finally, although the estimates for multilateral resistance are larger for the full aggregate samples, these are likely due to differences in the scaling of the estimated fixed effects across the samples and should not be viewed as a definitive comparison.

The estimated coefficients, which represent log odds, can be used to generate probabilities of trade formation for each pair of countries. If the log odds of a link forming are L, the probability of formation is \(p = \exp (L)/(1+\exp (L))\). To illustrate using the estimates from the full sample in 1995, the log odds of a country importing from another country if they share a border, are 500 miles apart, and the new link would complete a reciprocal relationship are \(-1.444 + 0.033 - 500* 0.00008 + 2.707 = 1.256\), ignoring some of the other terms for simplicity. These log odds imply a probability of link formation of about 0.78. Thus, it is clear that such conditions are highly conducive to the formation of trading relationships. To demonstrate the relative importance of reciprocity we can examine the effect of removing that characteristic. Were the link not reciprocal, the probability of formation would drop to only about 0.19. Thus it is clear that reciprocity had a considerable influence on the formation of the international trade network in 1995.
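
The conversion from log odds to probabilities in this example can be reproduced with a few lines of code. The sketch below uses the rounded coefficients quoted in the text, assigned to the edges, contiguity, distance, and reciprocity terms in the order in which they appear in the sum; the variable names are introduced here for illustration only.

```python
import math

def link_probability(log_odds):
    """Convert log odds L into a probability p = exp(L) / (1 + exp(L))."""
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))

# Rounded 1995 full-sample coefficients as quoted in the text.
edges, contiguity, distance, reciprocity = -1.444, 0.033, -0.00008, 2.707

log_odds_recip = edges + contiguity + 500 * distance + reciprocity
log_odds_no_recip = edges + contiguity + 500 * distance

p_recip = link_probability(log_odds_recip)        # roughly 0.78
p_no_recip = link_probability(log_odds_no_recip)  # roughly 0.19
```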

Similar to the gravity models in the previous section, the ERGM models provide strong corroborating evidence that complex network patterns and dependencies influence trade patterns. Both sets of models provide a different means by which to analyze these patterns and tend to find similar relationships within the data. In the next section, I test how these models compare when it comes to reproducing the complex network patterns of international trade.

4 Comparing gravity models and ERGMs

The previous two sections demonstrate two methods for identifying complex network dependencies in the world trade network. This section provides a comparison of these methods. To do so, estimates based on both the probit gravity models discussed in Sect. 2 and the ERGM models in Sect. 3 are used to simulate collections of trade networks. These simulated networks are then compared across several measures of goodness of fit, such as the number of shared partners between countries and the degree distributions of importers and exporters, in order to determine which model is better able to replicate the network patterns present in the actual world trade network. These comparisons demonstrate that each approach outperforms the other on some measures, suggesting that both exhibit relative strengths.

Probit models and ERGM models are estimated using a common dataset. The data is the same as that used for the partial sample ERGM specifications in Sect. 3. As before, the data reflects trade in explosives under HS chapter 36 among the top fifty trading countries. The following three equations describe the specifications of the three models.Footnote 20

$$\begin{aligned}&\text {Standard probit:} \quad T_{ij} = D_{ij}\alpha + \mu _{i} + \nu _{j} + \epsilon _{ij} \end{aligned}$$
(12)
$$\begin{aligned}\text {Network probit:} \quad T_{ij} &= \delta + D_{ij}\bar{\alpha } + \bar{\beta ^1} RECIP_{ij} + \bar{\beta ^2} TRAN_{ij} + \bar{\beta ^3} CYCL_{ij} \nonumber \\&\quad \; +\beta ^4 IID_{ij} + \beta ^5 IOD_{j} + \beta ^6 EID_{i} + \beta ^7 EOD_{ij} \nonumber \\&\quad + \sum _{k\in i,j}\left( \gamma _k^1 ln(GDP_{k}) + \gamma _k^2 ln(GDPPC_{k}) + \gamma _k^3 ln(REMT_{k}) \right) + \epsilon _{ij}\end{aligned}$$
(13)
$$\begin{aligned}&\text {ERGM:} \quad T_{ij} = f(z_{edges}, z_{recip}, z_{gdp}, z_{dist}, z_{lang}, z_{cntg}, z_{rta}, z_{mrt}, z_{gwesp}) \end{aligned}$$
(14)

Two different probit models and one ERGM are considered. All three models are restricted to a cross-section for each of the two years of networks, 1995 and 2006, in order to maintain parity in the amount of information supplied to each model. The first probit model follows the standard gravity specification described in Sect. 2 and presented in column (1) of Table 4. This simple model includes the standard vector of gravity variables (\(D_{ij}\)), exporter fixed effects (\(\mu _{i}\)), and importer fixed effects (\(\nu _{j}\)). The second probit model reflects the network probit specification presented in column (6) of Table 4. This specification includes the set of network covariates from before and replaces the country fixed effects with proxies and a constant (\(\delta \)). Additionally, the coefficients \(\bar{\alpha }\), \(\bar{\beta ^1}\), \(\bar{\beta ^2}\), and \(\bar{\beta ^3}\) were constrained to the values derived in a specification that included country fixed effects, like that depicted in column (2) of Table 4. For each of the probit models, a series of regressions was undertaken to estimate the coefficients for each of the two models and sample years. The results of those estimations are available in Table 9 in the “Appendix”.Footnote 21 The ERGM specification is unchanged from the previous section and uses the model estimates from columns (3) and (4) of Table 5.

The three models can be used to produce estimated probabilities of link formation for each pair of countries. These probabilities can then be used to simulate trade networks based on each model. In order to compare the three models’ abilities to replicate complex network dependencies, each model is used to simulate a sample of 100 trade networks. Using each collection of simulated networks, sample statistics for several common network patterns are derived from the set of simulated networks. Specifically, the comparison evaluates the number of edges formed in the network as well as the distributions of geodesic distances, in-degrees, out-degrees, and edge-wise shared partners in the simulated networks and compares them to those present in the observed trade network. These five types of attributes are considered because they cover the major types of network patterns considered in the literature. Geodesic distance reflects the minimum distance between two nodes in terms of links. It ranges from 1, in which the nodes are directly connected by a link, to infinite, in which there is no path of links connecting the two nodes. In-degree and out-degree both reflect the number of links flowing into or out of a node. Their distributions give a measure of whether nodes in the network tend to have high degrees, low degrees, a wide range of different degrees, or all very similar degrees, for example. Edge-wise shared partners reflect the frequency with which two connected nodes are also connected to the same third-party nodes, similar to the triangle, triple, and GWESP patterns described in previous sections. A high number of shared partners indicates that both nodes have many of the same partners while a low value indicates that they share few common partners. The distribution of edge-wise shared partners provides a sense of how these patterns are distributed throughout the network. The three models and specifications are compared by assessing how well each reproduces the real world values and distributions of these five network characteristics.
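
The simulation step can be sketched as follows for the case in which links are drawn independently given their estimated probabilities. This independence is a simplifying assumption of the sketch; it suits the probit models more naturally than an ERGM with dependence terms, for which networks are typically simulated by MCMC. The probability matrix and function names are hypothetical.

```python
import numpy as np

def simulate_networks(prob, n_sims, rng):
    """Draw directed networks with independent links from a matrix of link probabilities."""
    n = prob.shape[0]
    draws = rng.random((n_sims, n, n)) < prob
    draws[:, np.arange(n), np.arange(n)] = False    # rule out self-links
    return draws.astype(int)

def degree_distribution(adj, axis):
    """Count nodes by degree: axis=0 gives in-degrees, axis=1 gives out-degrees."""
    n = adj.shape[0]
    return np.bincount(adj.sum(axis=axis), minlength=n)

rng = np.random.default_rng(0)
prob = np.full((50, 50), 0.5)        # hypothetical: every link forms with probability 0.5
sims = simulate_networks(prob, 100, rng)

mean_edges = sims.sum(axis=(1, 2)).mean()
mean_in_degree = np.mean([degree_distribution(g, axis=0) for g in sims], axis=0)
mean_out_degree = np.mean([degree_distribution(g, axis=1) for g in sims], axis=0)
```

Here `adj[i, j] = 1` denotes a link from node i to node j, so column sums give in-degrees and row sums give out-degrees. The averaged distributions are the simulated counterparts that would be compared with the observed network.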

To make the comparisons, which are often across many values in a distribution, I use measures of integrated squared errors (ISE) to compare each of the five characteristics.Footnote 22 The ISE measures quantify the differences between the observed distributions and the simulated distributions. They are defined as \(M = \sum _h (f(h) - g(h))^2\) where f(h) is the attribute value from the observed network, g(h) is the mean value from the simulated networks, and h indexes the different attributes in the distribution.Footnote 23 For example, in the case of geodesic distance, h reflects each possible distance from 1 to infinity. In the case of \(h = 2\), representing cases in which countries are two links apart, the observed network of 2006 featured 1,115 such pairs and the ERGM simulated networks featured an average of about 1,251 such pairs. Therefore, \(h = \text {two links}\), \(f(\text {h}) = 1115\), and \(g(\text {h}) = 1251\). Squaring the difference and summing with the other geodesic distance lengths gives its ISE measure for the ERGM model. The ISE measures for all five types of attributes and all three models are presented in Table 6.
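
The ISE measure is a sum of squared differences over the values of a distribution. The brief sketch below computes the contribution of the \(h = 2\) geodesic distance from the figures quoted above; the function name is introduced here for illustration, and the remaining values of the distribution are not reproduced.

```python
import numpy as np

def integrated_squared_error(observed, simulated_mean):
    """M = sum_h (f(h) - g(h))^2 over all values h of a distribution."""
    observed = np.asarray(observed, dtype=float)
    simulated_mean = np.asarray(simulated_mean, dtype=float)
    return float(np.sum((observed - simulated_mean) ** 2))

# Contribution of h = 2 for the 2006 network under the ERGM: (1115 - 1251)^2.
contribution_h2 = integrated_squared_error([1115], [1251])
```

Because the differences are squared before summing, the measure is dominated by the values of h at which the simulated and observed counts diverge most.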

The ISE measures indicate that each of the three models performs better than the other two at certain attributes and for certain observed networks. For each attribute, a smaller value indicates a better fit. The ERGM model substantially outperforms the two probit models in terms of the edges characteristic, which reflects the model’s ability to capture the right number of trading relationships in the network. The ERGM models also tend to perform relatively well at replicating the distribution of shared partners, providing the best fit in the 1995 network and the second best fit in 2006. This is likely influenced by the fact that the ERGM models explicitly capture shared partners and third party trade in their specification via the GWESP term. The two probit models outperform the ERGM model at reproducing the in- and out-degree distributions of the networks, implying that typical gravity models using country-level controls do a good job at capturing each country’s proclivity to trade with others. The standard probit model outperforms both other models at replicating geodesic distance in both networks. The network probit model does not consistently outperform the others on any characteristic but often falls between the standard probit and ERGM models, suggesting it may be a feasible hybrid approach that provides a mix of the strengths of both. Interestingly, the explicit inclusion of degree information in the network probit model did not significantly improve its ability to replicate the two degree distributions. This suggests that the fixed effects, and likely MRTs as an extension, are already effective at capturing these influences.

As additional robustness exercises, I repeat the preceding analysis using four additional networks. These networks differ in several dimensions. They reflect multiple different product types and levels of aggregation: coffee (HS heading 0901), an agricultural commodity; cork (HS chapter 45), a group of natural resource products; wool (HS chapter 51), a group of textile products; and cars and other transport vehicles (HS heading 8703), a group of complex manufactured products. They also represent a range of network densities from 31 percent (cork) to 76 percent (cars). For the sake of parsimony, the full presentation of the analysis can be found in the “Appendix”.

The four additional comparisons highlight the nuances in the main findings. Most notably, they demonstrate that the relative performance of the three approaches depends on the network. For wool and cars, the ERGM often outperforms both of the probit models in many of the same ways it does with the explosives networks. The network probit model also often outperforms the other two with regard to certain patterns. However, for the cork and coffee networks, the standard probit model outperforms the other two in all categories. Based on these findings, there may be a relationship between the density of the network and the performance of each approach. The two network-intensive approaches, the ERGM and the network probit model, perform better with the denser networks considered. By comparison, the standard probit model performs best with the two least dense networks. Finally, due to computational challenges, several of the ERGM models could not be estimated with all of the terms listed in Eq. 14. However, the performance of the ERGM models does not appear to be severely hampered by the inability to include certain terms like the shared partner distribution. The networks for which the ERGM tended to perform best were also those that did not include some terms, suggesting that the usefulness of ERGMs is not completely tied to the number of terms that are included.

Together, the model comparisons suggest two primary findings. First, they demonstrate that there are components of the trade network that may not be captured in a gravity model as well as they may be captured by an empirical network approach. This finding provides additional motivation for the recent literature focusing on network approaches for analyzing trade. The ERGM is one such model that can provide some advantages over gravity approaches when modeling certain patterns in trade. An analysis concerned with these aspects of trade ought to consider such an approach. This is particularly true for analyses in which the prediction of zero trade is important. Second, the findings also reiterate the power of the gravity model. Even though the gravity model consolidates most of the information about the world trade network into the MRTs, it is still relatively effective at capturing and replicating complex network dependencies in many cases. Further, the inclusion of network terms in gravity models can improve their performance with respect to certain types of patterns. These findings complement much of the recent structural gravity research, such as that by Anderson and Yotov (2012) and Fally (2015), which repeatedly demonstrates the power of MRTs and their fixed effect counterparts. Inconveniently, there does not appear to be a singularly “best” method for modeling complex network patterns in trade as each method considered here presents strengths and weaknesses. However, this underscores the need for a greater focus on the modeling of network patterns.

5 Conclusion

The role of network dependencies in international trade is an important part of understanding the determinants of bilateral trade. Prior research has consistently indicated that the trade between two countries is influenced by a wide variety of relationships that these countries share not only with each other but with all other countries. While most traditional trade research has overlooked network dependencies, recent advances in empirical trade and network analysis are beginning to allow for the inclusion of these significant trade determinants. This paper describes two such methods for doing so using gravity and ERGM techniques.

By viewing international trade as a network formation problem that is dependent on underlying characteristics of the network, statistical inference is possible. The series of gravity and ERGM estimations described in this paper provide strong evidence that complex network patterns influence trade. In particular, they indicate that reciprocity, common third-party trading partners, and the set of countries with which each partner trades are significant determinants of bilateral trade. These findings are consistent with past research and provide additional support for several recent theoretical models that incorporate network dependencies. Further, both approaches described within offer differing advantages in terms of which types of network patterns they capture and can accurately replicate. These methods could be useful for modeling the extensive margin of trade, such as in counterfactual gravity applications in which there is an interest in predicting zero trade flows.

As evidenced by this paper, complex network patterns have an important influence on trade. The methods described here provide a framework through which to continue studying these dependencies. While structural gravity models with MRTs are relatively effective at capturing many of these dependencies, the inclusion and identification of other types of network dependencies represents an important avenue for future research, particularly in cases where the absence of trade is important. This research will provide valuable insight into how traders select their partners amid a complicated network of existing relationships. ERGMs, in particular, offer a useful and flexible means by which to study these relationships.

Table 6 Goodness of fit tests measuring the ability of three models to replicate observed trade patterns