1 Introduction

The International Trade Network (ITN), aka World-Trade Web (WTW) or World Trade Network (WTN), is defined as the graph representing in each year the web of bilateral-trade relationships between countries in the World. The statistical properties of the ITN, and their evolution over time, have recently received a lot of attention in a number of contributions.Footnote 1

Understanding the topology of the ITN is important for two related reasons. First, trade is one of the most important channels of interaction among countries (Helliwell and Padmore 1985; Krugman 1995; Galvão et al. 2007; Forbes 2002). The knowledge of macroeconomic phenomena such as economic globalization and internationalization, the spreading of international crises, and the transmission of economic shocks, may be improved by looking at international-trade patterns in a holistic framework, where indirect as well as direct linkages between countries are explicitly taken into consideration (Fagiolo 2010).Footnote 2 Second, ITN topological properties can help to statistically explain macroeconomic dynamics. For example, Kali et al. (2007) and Kali and Reyes (2010) have shown that country position in the trade network has substantial implications for economic growth and a good potential for predicting episodes of financial contagion. Furthermore, Reyes et al. (2010) suggest that country centrality in the ITN may help to account for the evolution of international economic integration better than standard statistics, like openness to trade, do.

The statistical properties of the ITN, in its undirected/directed or binary/weighted characterizations, have been extensively studied and today we know a great deal about the topological architecture of the web of international-trade flows. For example, Serrano and Boguñá (2003) and Garlaschelli and Loffredo (2004) show that the binary-directed representation of the ITN exhibits a disassortative pattern: countries with many trade partners (i.e., high node degree) are on average connected with countries with few partners (i.e., low average nearest-neighbor degree). Furthermore, partners of well connected countries are less interconnected than those of poorly connected ones, implying some hierarchical arrangements. Remarkably, Garlaschelli and Loffredo (2005) show that this evidence is quite stable over time. This casts some doubts on whether economic integration (globalization) has really increased in the last 30 years. Furthermore, node-degrees appear to be very skewed, implying the coexistence of few countries with many partners and many countries with only a few partners.

These issues are taken up in more detail in a few subsequent studies adopting a weighted-network approach to the study of the ITN. The motivation is that a binary approach, by treating all relationships equally, might dramatically underestimate the impact of trade-linkage heterogeneity. This seems indeed to be the case: Fagiolo et al. (2008, 2009, 2010) find that the statistical properties of the ITN viewed as a weighted undirected network crucially differ from those exhibited by its binary counterpart. For example, the strength distribution is highly right-skewed, indicating that a few intense trade connections co-exist with a majority of low-intensity ones. This confirms the results obtained by Bhattacharya et al. (2007, 2008), who find that the size of the group of countries controlling half of the world’s trade has decreased in the last decade. Furthermore, weighted-network analyses show that the ITN architecture has been extremely stable in the 1981–2000 period and highlight some interesting regularities (Fagiolo et al. 2009). For example, countries holding many trade partners and/or very intense trade relationships are also the richest and most globally central; they typically trade with many partners, but very intensively with only a few of them, which turn out to be very connected themselves; and they form few but intensive-trade clusters (i.e., triangular trade patterns).

Most of the existing network literature on the ITN, however, has focused on a purely empirical quest for statistical properties, largely neglecting the question of whether theoretical models are able to explain why the ITN is shaped the way it is.Footnote 3

This paper is a preliminary attempt to fill this gap. We extend the work in Fagiolo (2010) to ask whether the gravity model (GM) can provide a satisfactory theoretical benchmark able to reproduce the observed architecture of the ITN across time. The GM (van Bergeijk and Brakman 2010) aims at explaining international-trade bilateral flows using an equation obtained as the equilibrium prediction of a large family of micro-founded models of trade (more on that in Sect. 2). The term “gravity” comes about because the predicted relation between trade flows and explanatory variables is similar to Newton’s formula: the magnitude of aggregated trade flows between a pair of countries is proportional to the product of country sizes (e.g. the masses, as proxied by country GDPs) and inversely proportional to their geographic distance (interpreted as a proxy of trade-resistance factors, e.g. tariffs). From an econometric perspective, the original model-driven prediction can be augmented with a set of country-specific explanatory variables (e.g., population, area, land-locking effects, etc.), as well as with a set of bilateral variables (e.g., geographical contiguity, common language and religion, colony relation, bilateral trade agreements, etc.). The GM can be fitted to the data using different econometric techniques, ranging from simple ordinary least squares (OLS) applied to the log-linearized equation, to zero-inflated two-stage non-linear estimation, employed to correctly deal with the large number of zero trade flows characterizing the data. Overall, the GM is very successful: independently of the technique employed, it typically achieves a very high goodness of fit, e.g. in terms of R-squared coefficients.

Motivated by the well-known empirical success of the GM, we fit data on bilateral-trade flows to estimate GM-predicted weighted-directed representations of the ITN, which we then compare to the observed one, constructed using original bilateral-flow data. We employ both a static and a dynamic approach. In the static approach, we assume that a GM holds in each subsequent year and we estimate a series of predicted ITN snapshots. In the dynamic approach, we control for time dummies in the estimation to account for change over time and get a unique predicted ITN from the unbalanced panel of predicted flows. In both cases, we end up with estimates for both the probability that a link is present and for the probability of any given level of bilateral-trade flow occurring between any two countries in a given year (given that a link is present). We complement this information with standard errors of estimated quantities, so as to evaluate the precision of GM-based predictions.

In a nutshell, our results suggest that a necessary (but not sufficient) condition for the GM to predict weighted ITN properties well is to fix the binary structure equal to the observed one. Even if one conditions trade-flow estimation on the true binary architecture, the GM may badly predict higher-order statistics that, like clustering, require the knowledge of triadic link-weight topological patterns. Finally, the performance of the GM is very poor when it is asked to predict ITN weighted properties together with its binary architecture, or when one employs a GM specification to estimate the presence of a link only.

The rest of the paper is organized as follows. Section 2 discusses the gravity model and presents data and related methodologies. Our main results are reported in Sect. 3. Finally, Sect. 4 concludes and flags some of the challenges facing ITN modeling in the future.

2 Data and methodology

2.1 Bilateral trade-flow data

We use international-trade data taken from Subramanian and Wei (2003), which contain aggregate bilateral imports reported by the IMF Direction of Trade Statistics, measured in US dollars and deflated by US Consumer Price Index at 1982–1983 prices. We focus on seven unbalanced cross-sections for the years 1970–2000, with a 5-year lag. Let \(w_{ij}(t)\) be exports from country \(i\) to country \(j\) in year \(t\) and let \(N(t)\) be the corresponding number of countries reporting at least one positive flow.

Table 1 summarizes some descriptive statistics. The number of participating countries and average per-country trade both increase over time. Entry of new countries in the database may be caused either by the availability of new data or by the actual entry of the country in international-trade markets. New trade links, however, seem to increase more than quadratically with the number of participating countries in the last part of the sample, as testified by the rising density.Footnote 4 Note also that the number and percentage of countries making up 50 % of total trade seem to remain stable across the years, hinting at a stable core of top traders. Conversely, the percentage of countries controlling 90 % of total world trade has substantially decreased. The concentration process going on in the ITN, despite globalization and international integration, is confirmed also by the decrease in both the number and percentage of flows making up a certain share of total trade.

Table 1 Subramanian and Wei (2003) Database. Summary statistics

Given \(w_{ij}(t)\) and \(N(t)\), we build weight matrices for the corresponding observed trade networks. More precisely:

Definition 1

(Observed Weighted ITN) The observed weighted International Trade Network in a given year \(t\) is represented by a weighted-directed graph, where the nodes are the \(N(t)\) countries and link weights are fully characterized by the \(N(t) \times N(t)\) asymmetric matrix \(W(t)\), with entries \(w_{ij}(t)\), i.e. exports from country \(i\) to country \(j\).

Similarly, one can define the observed binary ITN, where links represent import-export partnerships, as:

Definition 2

(Observed Binary ITN) The observed binary International Trade Network in a given year \(t\) is represented by a binary-directed graph, where the nodes are the \(N(t)\) countries and binary links are fully characterized by the \(N(t) \times N(t)\) asymmetric adjacency matrix \(A(t)\), with entries \(a_{ij}(t)=1\) if and only if \(w_{ij}(t)>0\), i.e. exports from country \(i\) to country \(j\) are strictly positive.
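Definitions 1 and 2 can be illustrated with a minimal Python sketch. The export matrix below is made up for illustration, and the helper names (`adjacency_from_weights`, `density`) are hypothetical; the density is assumed to be the usual share of the \(N(N-1)\) possible directed links that are present.

```python
# Toy export matrix W: W[i][j] = exports from country i to country j.
# The numbers are invented for illustration only.
W = [
    [0.0, 5.0, 0.0],
    [2.0, 0.0, 1.5],
    [0.0, 0.0, 0.0],
]


def adjacency_from_weights(weights):
    """Return A with a_ij = 1 iff w_ij > 0 (Definition 2); self-loops are excluded."""
    n = len(weights)
    return [
        [1 if (i != j and weights[i][j] > 0) else 0 for j in range(n)]
        for i in range(n)
    ]


def density(adjacency):
    """Assumed definition: existing directed links over the N(N-1) possible ones."""
    n = len(adjacency)
    links = sum(sum(row) for row in adjacency)
    return links / (n * (n - 1))


A = adjacency_from_weights(W)
```

In this toy case three of the six possible directed links are present, so the density is 0.5, and \(A\) is asymmetric because country 2 exports to country 3 but not vice versa.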

This database has been studied from a binary/weighted network perspective in De Benedictis and Tajoli (2011) and Abbate et al. (2012). They show that international integration in trade has been increasing over time, but it is still far from being fully accomplished. Indeed, a strong heterogeneity in the profiles of across-country trade partnerships does emerge. This has important implications for both the role of regional trade agreements (i.e., the WTO) and the interplay between extensive and intensive margins of trade (Felbermayr and Kohler 2006). Furthermore, ITN properties are very sensitive to geographical distance, i.e. the correlation structure between network statistics may change when one considers links connecting countries separated by an increasing geographical distance.

In this paper, we take an alternative approach. We characterize the topological properties of the observed ITN and we compare them to the properties of GM-based estimates of the ITN structure, which we define in the next sub-sections.

2.2 Gravity-model specifications

The GM, independently proposed by Tinbergen (1962) and Pöyhönen (1963), is the workhorse model to explain bilateral trade flows among countries as a function of import and export market sizes (i.e., GDP) and trade-resistance factors, proxied by geographical distance. The GM derives its name from the functional form linking trade to size and distance, which resembles the expression for the attraction force between two bodies derived by Isaac Newton in classical mechanics. Thus, in analogy with the physics law, it is expected that trade flows increase with the product of some power of country sizes and decrease with some power of geographical distance.

This empirically-inspired law has been found to be consistent with a number of theoretical foundations (Anderson 1979; Bergstrand 1985; Deardorff 1998; Anderson and van Wincoop 2003). In other words, many possibly-conflicting micro-foundations can generate as their equilibrium outcome some gravity-like relation between trade, market sizes and trade-resistance terms.Footnote 5 For example, a gravity-like equation can be derived in trade specialization models, monopolistic-competition frameworks with intra-industry trade, or Heckscher-Ohlin models (see Fratianni 2009; De Benedictis and Taglioni 2011, for comprehensive surveys).

Regardless of the preferred micro-founded explanation, modern empirical interpretations of the gravity expression generalize the original idea by including in the formulation a list of additional explanatory variables, covering aspects related to geography, culture, and bilateral trade agreements, among others.Footnote 6 In Table 2 we report the list of explanatory variables that, following existing literature (see, e.g., Glick and Rose 2001; Rose and Spiegel 2002), we employ in our exercises. GM explanatory variables can typically be grouped into country-specific and link-specific ones. The former include, in addition to GDP, other country-size proxies like population and geographical area, as well as geographically-related aspects controlling for land-locking effects and continent membership. The latter instead include relational variables characterizing bilateral relationships, such as geographical contiguity, colonial ties, regional trade agreements, and commonalities in language, colonial history, religion, and currency. Together, these factors have been shown to successfully explain, in one way or another, international-trade flows in gravity-equation econometric exercises (van Bergeijk and Brakman 2010).

Table 2 Variables employed in the gravity-model estimation

The most general GM specification that we employ in what follows then reads:

$$\begin{aligned} w_{ij}(t)&= \alpha _0 Y_i(t)^{\alpha _1} Y_j(t)^{\alpha _2} d_{ij}^{\alpha _3} \left[ \prod _{k=1}^K X_{ik}(t)^{\beta _{1k}} X_{jk}(t)^{\beta _{2k}} \right]\nonumber \\&\times \exp { \left( \sum _{h=1}^H \theta _{h}D_{ijh}(t) + \sum _{l=1}^L (\delta _{1l}Z_{il}+\delta _{2l}Z_{jl}) \right) } \eta _{ij}(t), \end{aligned}$$
(1)

where \(t\) is the year \((t=1970,1975,\ldots ,2000);\, w_{ij}(t)\) are export flows from the observed weighted ITN; \(i,j=1,\ldots ,N(t),\, i \ne j;\, Y_h(t)\) is year-\(t\) GDP of country \(h=i,j\, (i=\text{ exporter};\, j=\text{ importer});\, d_{ij}\) is geographical distance; \(X_{h}(t),\, h=i,j\), are additional country-size effects (area and population); \(D_{ij}\) is a vector of bilateral-relationship variables (contiguity, common language, past and current colonial ties, common religion, common currency, a dummy to control if both countries share a generalized system of preferences, and a regional trade agreement flag); \(Z_{i}\) and \(Z_{j}\) are country-specific dummies (controlling for land-locking effects and continent membership); finally, \(\eta _{ij}(t)\) are the errors (whose mean conditional on the explanatory variables obeys \(E[\eta _{ij}(t)|\cdot ]=1\)).
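To make the functional form of Eq. (1) concrete, the following Python sketch evaluates its conditional mean (i.e., Eq. (1) with \(\eta _{ij}\) replaced by its mean of 1) for a single country pair. All coefficient values and inputs are hypothetical and chosen only for illustration; they are not the estimates reported in the paper, and the function name `gm_expected_flow` is likewise an assumption.

```python
import math

# Hypothetical coefficients, for illustration only (not estimated values).
COEF = {
    "alpha0": 1.0, "alpha1": 1.0, "alpha2": 1.0, "alpha3": -1.0,
    "beta1": [-0.1], "beta2": [-0.1],    # one extra size variable (K = 1)
    "theta": [0.5],                      # one bilateral dummy (H = 1)
    "delta1": [-0.3], "delta2": [-0.3],  # one country dummy (L = 1)
}


def gm_expected_flow(y_i, y_j, d_ij, x_i, x_j, dummies_ij, z_i, z_j, coef):
    """Conditional mean of w_ij implied by Eq. (1), with eta_ij set to its mean 1."""
    # Multiplicative part: GDPs, distance, and additional size variables.
    flow = (coef["alpha0"] * y_i ** coef["alpha1"]
            * y_j ** coef["alpha2"] * d_ij ** coef["alpha3"])
    for k, (xi, xj) in enumerate(zip(x_i, x_j)):
        flow *= xi ** coef["beta1"][k] * xj ** coef["beta2"][k]
    # Exponential part: bilateral dummies and country-specific dummies.
    linear = sum(th * d for th, d in zip(coef["theta"], dummies_ij))
    linear += sum(d1 * zi + d2 * zj
                  for d1, d2, zi, zj in zip(coef["delta1"], coef["delta2"], z_i, z_j))
    return flow * math.exp(linear)


near = gm_expected_flow(10.0, 20.0, 100.0, [2.0], [3.0], [0], [0], [1], COEF)
far = gm_expected_flow(10.0, 20.0, 200.0, [2.0], [3.0], [0], [0], [1], COEF)
```

With a distance elasticity of \(-1\), doubling the distance halves the expected flow, which is the gravity analogy in its simplest form.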

Two remarks are in order. First, note that expected trade flows in standard GM specifications are always positive, i.e. one assumes that any pair of countries does trade on average. In reality, however, zero trade flows are quite frequent in the data, either because of missing values or because the two countries are not trade partners (see Table 1). Such zero entries cannot be recovered from the deterministic, non-linear functional form employed to define the conditional mean of bilateral trade flows, and must therefore be accounted for using an appropriate probabilistic model. Indeed, log-linearizing Eq. (1) and applying a standard OLS fit means de facto excluding observed zero-trade flows, which may lead to strong estimation biases. We shall get back to this point in the next section.

Second, and more importantly, we employ a GM specification that slightly differs from that of Anderson and van Wincoop (2003), which is one of the most commonly used in GM exercises. Anderson and van Wincoop (2003) introduce multilateral resistance terms and importer-exporter fixed effects. Formally, that approach requires that the constant term of Eq. (1) be generalized to a set of importer and exporter dummies. One important implication is that country-size effects are captured by country dummies. This means that characteristics of exporters and importers cannot be generalized (Santos Silva and Tenreyro 2006). In any case, all our results are robust to Anderson and van Wincoop’s specification. We have therefore chosen to retain the traditional specification because of its more immediate empirical interpretation.

2.3 Estimation

Estimation of Eq. (1) is not easy. A straightforward approach consists in log-linearizing the GM specification and applying standard OLS techniques to estimate parameters and obtain predicted values. The existing empirical literature on GM has largely employed this approach (cf. for example Glick and Rose 2001; Rose and Spiegel 2002).

However, a series of more recent contributions highlighted the risk of biases in estimation induced by OLS applied to log-linear specifications. The main sources of bias come from the treatment of zero-valued flows (Santos Silva and Tenreyro 2006; Linders and de Groot 2006; Burger et al. 2009), non-linearity and heteroscedasticity (Santos Silva and Tenreyro 2006), and endogeneity and omitted terms (Baldwin and Taglioni 2006). The issue of zero-flow treatment is particularly relevant to our analysis. Indeed, log-linearizing the GM equation and applying OLS estimation implies using only non-zero trade flows in the estimation.

From a network perspective, log-linearizing Eq. (1) means that we are keeping fixed the observed binary structure (i.e. we are conditioning on adjacency matrices \(A(t)\)). This is a strong assumption if one wants to employ the GM to estimate the weighted ITN structure, as we are asking the model simply to estimate flows and not the presence/absence of links, i.e. the binary structure. More generally, one would like a model that is able simultaneously to predict both the presence of a link and its weight.

Given these well-known limitations of OLS, in this work we shall resort to count-data analysis (Long 1997) and fit to the data Poisson pseudo-maximum likelihood (PPML) models, either in their standard formulation (Santos Silva and Tenreyro 2006) or in zero-inflated specifications (Linders and de Groot 2006).

In a nutshell, PPML models allow one to estimate Eq. (1) in its original non-linear form, thus avoiding possible correlation between errors and regressors. In what follows, we will employ two estimation strategies as far as standard PPML estimation is concerned. In the first one, we will conservatively assume that the binary structure is given, i.e. we shall fit a GM via PPML using positive trade flows only (we shall refer to this case as the “restricted PPML”). This allows us to overcome the issues typically arising with log-linearized OLS estimation, while still being able to evaluate the performance of the GM when the binary structure is not estimated but is taken to be equal to the observed one. The second strategy performs a PPML estimation on both positive and zero flows (the plain or “unrestricted PPML” case).

PPML estimations use a Poisson distribution to model simultaneously the probability of a zero flow and of a positive (integer) flow. However, it has been noticed that, in the case of international trade, zero flows occur much more frequently than a plain Poisson model would predict (Burger et al. 2009), cf. also Table 1. This has led to the family of zero-inflated (ZI) models (Winkelmann 2008). The underlying idea is to model the presence of zeros and positive values as a two-stage process. In this way one treats the process governing the presence or absence of trade partnerships separately from link-weight determination. In the first stage, one estimates zero-flow probabilities using a standard logit model, employing a series of regressors that often coincide with those used in the standard GM formulation. In the second stage, conditional on having non-zero flows, one fits the magnitude of trade-flow values using either a Poisson (ZIP) or a negative-binomial (ZINB) distribution. Notice that in the second stage there is a non-zero probability of having a zero flow, as the process governing link-weight determination may attach a zero flow independently of what the first process has done.
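The two-stage logic can be made concrete with a small Python sketch of a zero-inflated Poisson probability mass function. The values of `psi` (first-stage zero-flow probability) and `mu` (second-stage Poisson mean) are hypothetical, and the sketch only illustrates the textbook ZIP distribution, not the estimation routine used in the paper.

```python
import math


def poisson_pmf(k, mu):
    """Plain Poisson probability of observing the integer k given mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def zip_pmf(k, psi, mu):
    """Zero-inflated Poisson: a zero from the first stage with probability psi,
    otherwise a Poisson(mu) draw, which may itself be zero."""
    base = (1.0 - psi) * poisson_pmf(k, mu)
    return psi + base if k == 0 else base


psi, mu = 0.4, 3.0      # hypothetical values, for illustration only
support = range(0, 60)  # the Poisson(3) mass beyond 60 is negligible
total = sum(zip_pmf(k, psi, mu) for k in support)
mean = sum(k * zip_pmf(k, psi, mu) for k in support)
p_zero = zip_pmf(0, psi, mu)
```

Here `p_zero` exceeds `psi` because zeros can arise in both stages, and `mean` equals `(1 - psi) * mu` up to numerical error.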

For robustness purposes, we have therefore compared results of the following estimation procedures: (i) restricted PPML (rPPML); (ii) plain or unrestricted PPML (uPPML); (iii) ZIP; and (iv) ZINB. Furthermore, in order to control for dynamic effects, we have estimated Eq. (1) using both a cross-section perspective (i.e., fitting a separate model for each of the 7 waves we end up with in our database) and an unbalanced panel-data approach (i.e., adding time dummies and estimating once and for all the entire data set). We have also controlled for country fixed effects as suggested in Baldwin and Taglioni (2006). Our results turn out to be very robust to all these alternatives. Consequently, in order to avoid redundancy, we report only results from three sets of models (rPPML, uPPML and ZIP), where a sequence of independent cross sections is estimated without country fixed effects. It must also be noticed that the second stage of the ZIP process coincides with the rPPML estimation, as one fits a Poisson model to non-zero flows only using a pseudo-maximum likelihood method.Footnote 7 By doing so, we are able to compare a setup where the binary structure of the ITN is kept fixed (rPPML) with two alternative setups (uPPML and ZIP) where instead one estimates the probability that a link is in place or not, together with the probability that the weight of a link attains any given value.

Table 3 presents estimation results for year 2000 (similar results hold also for the remaining years) using uPPML and the two stages of the ZIP model (the second one being equivalent to rPPML). Note that, by and large, both signs and orders of magnitude of estimated coefficients do not change with the estimation technique employed and have the expected signs. GDP elasticities tend to be larger than in other studies as we explicitly consider population and area as additional size effects (entering with a negative sign). This hints at a relevant effect played by per-capita GDP. Furthermore, variables such as contiguity, common language, and regional trade agreements enhance trade. In contrast, colony-related variables, common religion and common currency are less statistically significant.Footnote 8

Table 3 GM estimation. Year: 2000

Note also that in the (first-stage) logit estimation of the ZIP method, GDP negatively affects the probability of having unlinked countries, as expected. Conversely, distance and land-locking effects enhance the probability of missing links. The contiguity coefficient is instead positive: after controlling for geographical distance, sharing a border does not favor the emergence of bilateral trade. This result, however, does not hold robustly over all cross-sections: in some of them contiguity does not significantly affect the estimated probability.

Finally, all diagnostic statistics indicate that the estimated models are well-specified (Wooldridge 2001) and achieve a quite good (pseudo) \(R^2\). This is true over all the years, as Table 4 shows.

Table 4 Goodness of fit of the GM under different estimation techniques in four selected years

2.4 The predicted weighted ITN

As long as \(Y_i(t), \,d_{ij}\) and \(X_{ik}(t)\) are strictly positive for all \((i,j)\) and \(t\), one can rewrite Eq. (1)Footnote 9 as:

$$\begin{aligned} w_{ij}=\exp \{x_{ij}\cdot \gamma \}\eta _{ij}, \end{aligned}$$
(2)

where \(x_{ij}\) are logged country-specific and bilateral explanatory variables, and \(\gamma \) is the vector of all coefficients to estimate.
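For clarity, the exponent of Eq. (2) can be spelled out by taking logs of the deterministic part of Eq. (1) and dropping the time index. Under the notation of Eq. (1), and reading the dummy variables as entering in levels rather than in logs, this gives:

$$\begin{aligned} x_{ij}\cdot \gamma&= \ln \alpha _0 + \alpha _1 \ln Y_i + \alpha _2 \ln Y_j + \alpha _3 \ln d_{ij} + \sum _{k=1}^K \left( \beta _{1k}\ln X_{ik} + \beta _{2k}\ln X_{jk} \right) \nonumber \\&\quad + \sum _{h=1}^H \theta _{h}D_{ijh} + \sum _{l=1}^L \left( \delta _{1l}Z_{il}+\delta _{2l}Z_{jl} \right) , \end{aligned}$$

so that \(\ln \alpha _0\) plays the role of the constant and \(\gamma \) collects \(\ln \alpha _0\), the \(\alpha \)'s, \(\beta \)'s, \(\theta \)'s and \(\delta \)'s. The strict-positivity requirement on \(Y_i(t)\), \(d_{ij}\) and \(X_{ik}(t)\) is what makes the logarithms well defined.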

Given estimated coefficients, we use the probability distributions implied by the estimation procedure to compute GM-based predictions for the binary and weighted ITN. More formally:

Definition 3

(Predicted Weighted ITN) The predicted weighted International Trade Network, for each given cross-section \(t\), is represented by the asymmetric matrix \(\hat{W}^M\), whose generic entries \(\hat{w}_{ij}^M\) are independent random variables with a probability distribution implied by the corresponding estimation technique \(M\) employed, where \(M\in \{rPPML,uPPML,ZIP\}\).

Note that the random variables \(\hat{w}_{ij}^M\) are independent but not identically distributed: their parameters typically vary across pairs of countries because they are computed using the same set of estimated coefficients but with controls that are country and pair specific. Furthermore, since the predicted weighted ITN is a matrix of independent random entries, we sampled a sufficiently large number of times from the corresponding distributions in order to obtain a satisfactory approximation of our predictions.Footnote 10

For example, in the case of the rPPML, we use second-stage ZIP coefficient estimates to compute, for each non-zero link \((i,j)\) in the observed ITN, the mean of the associated Poisson distribution. We then sample from such Poisson distributions, independently across the existing links, the associated trade flow in levels. This allows us to build a large sample of predicted weighted ITN networks, all having in common the same binary network architecture (equal to the observed one).

We then repeat the exercise in the case of uPPML. Here we fit all positive and zero flows using a PPML estimation procedure and we end up with estimated parameters for Poisson distributions describing (independently across the links) the probability that a given link may have a weight \(w\ge 0\). This generates a large sample of predicted weighted ITN, each one having an underlying binary matrix that is possibly different from the observed one.

Finally, we build a sample of predicted weighted ITNs using a ZIP estimation technique. Here, we exploit the first-stage Logit estimation to first simulate the binary structure. This is done by independently drawing each link from a Bernoulli distribution, where the probability of a non-zero flow is the one implied by the first-stage Logit model (i.e., one minus the predicted zero-flow probability). In the second stage, conditionally on having drawn a link \((i,j)\), we employ PPML coefficient estimates and the related Poisson probability distributions to draw a trade flow.

As already mentioned, the implied “predicted binary ITN” changes with the method employed. If we use a rPPML estimation technique, it will coincide with the observed binary ITN in all simulated instances. Otherwise, it will be in general different from the observed one across different samples, being an instance of a set of independent Bernoulli random variables.
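The three sampling schemes can be summarized in a short Python sketch. The fitted means `MU`, zero-flow probabilities `PSI` and observed adjacency matrix `A_OBS` below are hypothetical toy values, and `draw_network` is an illustrative helper, not the authors' code. The Poisson sampler uses Knuth's multiplication method, which is only suitable for small means such as these; real trade flows would call for a different sampler.

```python
import math
import random


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate only for small toy means."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def draw_network(mu, rng, mode, observed_a=None, psi=None):
    """One simulated weighted ITN; mu[i][j] is the fitted Poisson mean of link (i, j)."""
    n = len(mu)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if mode == "rPPML" and observed_a[i][j] == 0:
                continue  # binary structure fixed at the observed one
            if mode == "ZIP" and rng.random() < psi[i][j]:
                continue  # first stage: zero flow with probability psi_ij
            w[i][j] = sample_poisson(mu[i][j], rng)
    return w


# Hypothetical fitted quantities for three countries (illustration only).
MU = [[0.0, 4.0, 1.0], [3.0, 0.0, 2.0], [0.5, 1.5, 0.0]]
PSI = [[1.0, 0.1, 0.6], [0.2, 1.0, 0.3], [0.8, 0.5, 1.0]]
A_OBS = [[0, 1, 0], [1, 0, 1], [0, 0, 0]]

rng = random.Random(0)
samples = {mode: [draw_network(MU, rng, mode, A_OBS, PSI) for _ in range(200)]
           for mode in ("rPPML", "uPPML", "ZIP")}
```

In the rPPML draws every link absent from `A_OBS` stays at zero, whereas the uPPML and ZIP draws generate their own binary structure. With toy means this small, a Poisson draw on an observed link can occasionally be zero, which is negligible for the much larger means implied by actual trade flows.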

Note also that, as far as PPML specifications are concerned, the expected value of trade flows is defined as the exponential of the linear prediction:

$$\begin{aligned} \hat{w}_{ij}^{PPML}=\exp \left\{ x_{ij}\cdot \hat{\gamma }^{PPML}\right\} , \end{aligned}$$
(3)

where \(\hat{\gamma }^{PPML}\) is the estimated value of \(\gamma \) in the Poisson model. In this case, the variance of predictions is equal to their expected value (either restricted or not). In the ZIP case, the expected bilateral flow (in levels) is instead defined as:

$$\begin{aligned} \hat{w}_{ij}^{ZIP}=(1-\hat{\psi }_{ij})\exp \left\{ x_{ij}\cdot \hat{\gamma }^{ZIP}\right\} =(1-\hat{\psi }_{ij}) \hat{\mu }_{ij}, \end{aligned}$$
(4)

where \(\psi _{ij}\) is the probability that a link \((i,j)\) is zero. The variance of the prediction is given by:

$$\begin{aligned} Var(\hat{w}_{ij}^{ZIP}|x_{ij})=\hat{\mu }_{ij}(1-\hat{\psi }_{ij})[1+\hat{\psi }_{ij}\cdot \hat{\mu }_{ij}]. \end{aligned}$$
(5)

In what follows, we will ask whether the predicted weighted ITN is characterized by topological properties that are similar to those of its observed weighted counterpart.Footnote 11 Section 3.2 discusses instead whether Logit estimations can well reproduce the topological properties of the binary observed ITN.

2.5 Network statistics and confidence intervals

We study the extent to which the architecture of the observed ITN over time can be explained by the GM employing a set of standard topological properties (i.e., network statistics), see Fagiolo et al. (2009) for a discussion. As Table 5 shows, we focus on three families of properties. First, total node-degree and total node-strength measure, for binary and weighted networks respectively, the number of node partners and total trade intensity. In a directed network, one can also distinguish between node in-degree/in-strength (i.e., number of markets a country imports from, and total imports) and node out-degree/out-strength (i.e., number of markets a country exports to, and total exports).
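As a minimal illustration of the first family of statistics, the Python sketch below computes in/out degrees and strengths from a toy export matrix (invented numbers). It follows the convention of Definition 1, where `W[i][j]` denotes exports from `i` to `j`; the function name is hypothetical.

```python
def in_out_degrees_and_strengths(weights):
    """Per-node in/out degree and strength, with weights[i][j] = exports from i to j."""
    n = len(weights)
    stats = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        stats.append({
            "out_degree": sum(1 for j in others if weights[i][j] > 0),  # export markets
            "in_degree": sum(1 for j in others if weights[j][i] > 0),   # import sources
            "out_strength": sum(weights[i][j] for j in others),         # total exports
            "in_strength": sum(weights[j][i] for j in others),          # total imports
        })
    return stats


W = [
    [0.0, 5.0, 0.0],
    [2.0, 0.0, 1.5],
    [0.0, 0.0, 0.0],
]
stats = in_out_degrees_and_strengths(W)
```

Summing out-strengths or in-strengths over all nodes gives the same number, namely total trade in the network.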

Table 5 Binary and weighted topological properties of the ITN

Second, total average nearest-neighbor degree (ANND) and strength (ANNS) compute, respectively, the average number of trade partners and total trade value of trade partners of a given node. This gives us an idea of how much a country is connected with other very well-connected countries. ANND and ANNS statistics can be disaggregated so as to account for both import/export partnerships of a country, and import/export partnerships of its partners. More precisely, one can compute four different measures of average nearest-neighbor degree/strength, obtained by coupling the two ways in which a node A can be a partner of a given target country B (importer or exporter) and the two ways in which the partners of A may be related to it (as exporters or importers). Finally, we consider clustering coefficients (CCs), see Fagiolo (2007) for a discussion. In the binary case, a node overall CC returns the likelihood that any two trade partners of that node are themselves partners. In the weighted case, these likelihoods are computed taking into account link weights to proxy how strong are the edges of the triangles that are formed in the neighborhood of a node. Again, in the directed case one can disaggregate total node CC according to the four different shapes that directed triangular motifs can exhibit.Footnote 12

In general, we are interested not only in how node average and standard deviation of the foregoing statistics change over time, but also in the way node statistics correlate, and how such correlation patterns evolve across the years. In particular, we focus on correlation between node degrees (resp., strengths) and ANND (resp., ANNS). This gives us information on the assortativity/disassortativity nature of the ITN. We are also interested in the correlation between ND/NS and clustering, to understand the extent to which more and better connected countries trade with partners that trade a lot between them, i.e. heavy triadic relations get formed.
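The degree-ANND correlation can be illustrated on a deliberately extreme example. The Python sketch below assumes an undirected (symmetrized) binary network and a Pearson correlation coefficient, both of which are simplifying assumptions rather than the exact definitions of Table 5; a star network is used because it is the textbook disassortative case.

```python
import math


def degrees(adj):
    """Node degrees of an undirected binary network given as a 0/1 matrix."""
    return [sum(row) for row in adj]


def annd(adj):
    """Average nearest-neighbor degree; every node is assumed to have degree > 0."""
    k = degrees(adj)
    n = len(adj)
    return [sum(adj[i][j] * k[j] for j in range(n)) / k[i] for i in range(n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Star network: node 0 is linked to every other node, and nothing else is linked.
n = 5
star = [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]
rho = pearson(degrees(star), annd(star))
```

The hub has many partners that each have a single partner, so `rho` equals \(-1\) up to rounding, the limiting case of a disassortative pattern.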

Given any statistic computed on the observed ITN, we aim at understanding whether the predicted weighted ITNs display statistical properties that are similar to those of their observed weighted counterparts. Therefore, for any given statistic \(\sigma \) (e.g., node averages or correlations), year, and estimation method, we simulate a large number of times the associated predicted weighted ITN, computing \(\sigma \) on each simulated instance.Footnote 13 Finally, we calculate the sample average of \(\sigma \) across simulations, and 95 % sample confidence intervals. If in a given year the value of \(\sigma \) for the observed ITN lies within its confidence intervals for the predicted ITN, we can safely conclude that the GM successfully replicates that particular feature of the topological structure of the ITN.
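The comparison procedure can be sketched in Python as follows. The text does not specify how the 95 % sample confidence intervals are built, so the sketch assumes an empirical percentile interval over the simulated values; both helper names are hypothetical.

```python
import math


def percentile_interval(values, level=0.95):
    """Central empirical interval of the simulated values (assumed construction)."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo_idx = int(math.floor(tail * (n - 1)))
    hi_idx = int(math.ceil((1.0 - tail) * (n - 1)))
    return ordered[lo_idx], ordered[hi_idx]


def gm_replicates(observed, simulated, level=0.95):
    """True if the observed statistic lies inside the simulated interval."""
    lo, hi = percentile_interval(simulated, level)
    return lo <= observed <= hi


# Stand-in for a statistic computed on 100 simulated networks (illustrative values).
simulated_sigma = list(range(1, 101))
sample_mean = sum(simulated_sigma) / len(simulated_sigma)
interval = percentile_interval(simulated_sigma)
```

For these stand-in values an observed statistic of 50 falls inside the interval, while 99 falls outside it and would count as a feature the GM fails to replicate.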

3 Results

This section asks whether the statistical properties of the predicted ITN are similar to those observed in the real-world ITN. We start with basic (non-directed) weighted statistics (total NS, ANNS and clustering). Next, we discuss results related to directed weighted measures (e.g., in- and out-strength). Finally, we focus on the binary ITN.

3.1 Weighted statistics

We begin by studying predicted population averages of total node strength:

$$\begin{aligned} \widehat{\overline{NS}}_{tot}^{M}=\frac{1}{N\cdot H}\sum _{h=1}^{H}\sum _i{\widehat{NS}_{i,tot}^{M}(h)}=\frac{1}{N\cdot H}\sum _{h=1}^{H}\sum _i\sum _{j}\hat{w}_{ij}^{M}, \end{aligned}$$
(6)

where \(N\) is the number of countries in the target cross section and \(H\) is the simulation sample size.

Note that \(NS_{i,tot}\) measures total country trade. Therefore its population average equals total world trade divided by the number of countries. Figure 1 reports predicted values against observed ones over the years. PPML-based methods match observed values almost perfectly, with very narrow prediction errors. The ZIP procedure instead slightly underestimates average NS, but still attains a quite satisfactory result. Overall, the good performance of the GM in this task is not surprising, as its very purpose is to predict bilateral trade flows, and NS are just linear combinations of them. Therefore one expects the GM to be well equipped to predict total world trade, as the errors of linear predictions should offset one another in the aggregate.
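To make the linearity argument concrete, the sketch below evaluates Eq. (6) on a hypothetical sample of \(H=2\) simulated weight matrices for \(N=2\) countries, and checks that the same number is obtained from the element-wise average matrix. The function names and the toy matrices are illustrative assumptions.

```python
def average_total_strength(simulated):
    """Eq. (6): sum all simulated link weights and divide by N * H."""
    H = len(simulated)
    N = len(simulated[0])
    total = sum(w[i][j] for w in simulated for i in range(N) for j in range(N))
    return total / (N * H)


def elementwise_mean(simulated):
    """Average weight matrix across the H simulated instances."""
    H = len(simulated)
    N = len(simulated[0])
    return [[sum(w[i][j] for w in simulated) / H for j in range(N)] for i in range(N)]


# Two hypothetical simulated instances (world trade of 6 and 10, respectively).
simulated = [[[0.0, 2.0], [4.0, 0.0]],
             [[0.0, 4.0], [6.0, 0.0]]]

avg_ns = average_total_strength(simulated)
# Because strength is linear in link weights, the statistic computed on the
# average matrix coincides with the average of the statistic.
avg_ns_from_mean = average_total_strength([elementwise_mean(simulated)])
```

This equality holds for linear statistics only; it generally breaks down for statistics such as ANNS and WCC, which combine link weights nonlinearly.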

Fig. 1
Observed versus GM-predicted average node total strength. uPPML: unrestricted PPML. rPPML (ZIP, 2nd stage): second stage of the ZIP estimation process, restricted to non-zero flows only. Logit (ZIP, 1st stage): logit model, first stage of the ZIP estimation process. 95 % confidence bands are displayed as error bars around predicted values

The picture changes substantially when we turn to higher-order statistics like ANNS and WCC,Footnote 14 which involve evaluating link weights that are two steps away from the origin node. Let us begin with average ANNS. As Fig. 2 indicates, rPPML is quite successful in replicating average total ANNS, whereas both uPPML and ZIP capture only the time trend and fail to predict the levels well. In particular, uPPML seems to miss the binary structure entirelyFootnote 15 and therefore strongly underestimates average ANNS levels. ZIP estimates of the binary structure, instead, seem to be relatively more accurate, but this is not enough for the method to replicate observed average values.

Fig. 2
Observed versus GM-predicted average total ANNS. uPPML: unrestricted PPML. rPPML (ZIP, 2nd stage): second stage of the ZIP estimation process, restricted to non-zero flows only. Logit (ZIP, 1st stage): logit model, first stage of the ZIP estimation process. 95 % confidence bands are displayed as error bars around predicted values

GM predictive ability worsens when we look at average WCCs (see Fig. 3). In this case, even the rPPML method persistently overestimates average weighted clustering, while both unrestricted PPML and ZIP still perform very badly. The reason why the GM behaves differently in predicting average ANNS and WCC lies in the way these statistics combine information about binary and weighted network structure. Recall from Table 5 that weighted-network statistics such as ANNS and WCC are in fact a mix of link weights and node degrees. Moreover, ANNS averages out neighbors’ strengths, and thus requires knowledge of dyadic relations only. Conversely, WCC coefficients employ information on triadic relationships, involving the knowledge of link-weight triplets. Therefore, small deviations coming from a poor estimation of dyadic link weights are amplified when entering the computation of clustering coefficients. Hence, even though the rPPML procedure keeps the binary structure as given, it cannot reproduce average WCC, as prediction errors tend to be magnified.
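The triadic nature of the WCC can be seen in a small sketch. The version below is an undirected weighted clustering coefficient in the spirit of the one discussed by Fagiolo (2007): weights are rescaled by the largest weight, cube roots are taken, and products over link-weight triplets are divided by the number of potential triangles. Since Table 5 is not reproduced in this section, the normalization and the function name `weighted_clustering` should be read as assumptions rather than as the paper's exact definition.

```python
def weighted_clustering(w):
    """Undirected weighted clustering coefficient of every node.

    Assumptions: w is symmetric, weights are rescaled to [0, 1] by the largest
    weight, and each triangle contributes the product of the cube roots of its
    three link weights.
    """
    n = len(w)
    w_max = max(max(row) for row in w)
    if w_max == 0:
        return [0.0] * n
    c = [[(w[i][j] / w_max) ** (1.0 / 3.0) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    degree = [sum(1 for j in range(n) if j != i and w[i][j] > 0) for i in range(n)]
    coefficients = []
    for i in range(n):
        k = degree[i]
        if k < 2:
            coefficients.append(0.0)  # no triangle can be centered on this node
            continue
        # Each triangle through i enters through a triplet of link weights.
        triplets = sum(c[i][j] * c[j][h] * c[h][i] for j in range(n) for h in range(n))
        coefficients.append(triplets / (k * (k - 1)))
    return coefficients


# A fully connected triangle with equal weights has clustering equal to one.
equal = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
wcc_equal = weighted_clustering(equal)

# Halving a single link weight lowers the coefficient of every node,
# including node 0, which is not an endpoint of the modified link.
perturbed = [[0, 2, 2], [2, 0, 1], [2, 1, 0]]
wcc_perturbed = weighted_clustering(perturbed)
```

Unlike ANNS, the coefficient of node 0 depends on the weight of the link between its two partners, so an error on any single dyad propagates to all triangles that contain it.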

Fig. 3
Observed versus GM-predicted average total WCC. uPPML: unrestricted PPML. rPPML (ZIP, 2nd stage): second stage of the ZIP estimation process, restricted to non-zero flows only. Logit (ZIP, 1st stage): logit model, first stage of the ZIP estimation process. 95 % confidence bands are displayed as error bars around predicted values

More generally, our results on average statistics indicate that the GM performs well only if one fixes the binary structure, and tries to estimate statistics that are linear in link weights. When either the binary structure must be estimated together with link weights or the statistics of interest involve higher-order interaction motifs—like triadic structures in the WCC—the ability of the model to replicate the ITN weighted structure decreases substantially.

To further explore GM performance in replicating ITN weighted topology, we perform two-sample Kolmogorov–Smirnov (K–S) tests to compare predicted versus observed node-statistic distributions. More precisely, for each of the three statistics of interest (total node strength, average nearest-neighbor strength and weighted clustering), we test the null hypothesis that predicted and observed statistics come from the same distribution, and compute test rejection frequencies at the 5 % level across all simulated instances. The results in Table 6 confirm the message coming from the figures above on population averages. When one fixes the binary structure, the GM is able to reproduce NS and ANNS distributions, while it misses the distribution of WCC. When the binary structure must be estimated (uPPML and ZIP), only NS distributions can be satisfactorily replicated.
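A rejection frequency of this kind can be sketched as follows. The example computes the two-sample K–S statistic directly and rejects the null when it exceeds the standard large-sample critical value; the use of this asymptotic approximation, the function names, and the toy samples are assumptions made for illustration, and the paper may rely on a different implementation of the test.

```python
import math


def ks_statistic(x, y):
    """Largest gap between the empirical distribution functions of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_reject(x, y, alpha=0.05):
    """Reject equality of distributions using the asymptotic critical value."""
    n, m = len(x), len(y)
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(x, y) > critical


def rejection_frequency(observed, simulated_samples, alpha=0.05):
    """Share of simulated instances for which the K-S test rejects the null."""
    rejections = sum(ks_reject(observed, sample, alpha) for sample in simulated_samples)
    return rejections / len(simulated_samples)


# Hypothetical node statistics: one simulated instance identical to the
# observed sample, one shifted far away from it.
observed = [float(v) for v in range(100)]
simulated_samples = [list(observed), [v + 1000.0 for v in observed]]
frequency = rejection_frequency(observed, simulated_samples)
```

A frequency close to zero means that the simulated distributions are rarely distinguishable from the observed one, whereas a frequency close to one indicates a systematic mismatch.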

Table 6 Rejection frequencies at the 5 % level for the two-sample Kolmogorov–Smirnov test statistics

Another fundamental set of stylized facts characterizing the evolution of the ITN concerns the way in which different network statistics correlate (Fagiolo et al. 2009). Figures 4 and 5 show observed versus predicted correlation patterns between, respectively, total ANNS and NS, and total WCC and NS. Note that rPPML is able to correctly predict the existing disassortativity emerging between total country trade and average trade of the partners of a node, but underestimates clustering-strength correlations. Conversely, uPPML strongly overestimates the magnitude of both correlation coefficients, because the severe mismatch between observed and predicted binary structures also impairs its ability to capture the correlation structure. Again, the ZIP procedure seems to perform considerably better than uPPML, even if expected disassortativity and WCC-NS levels are statistically different from observed ones.

Fig. 4
Observed versus GM-predicted correlation between total node strength and total ANNS. uPPML: unrestricted PPML. rPPML (ZIP, 2nd stage): second stage of the ZIP estimation process, restricted to non-zero flows only. Logit (ZIP, 1st stage): logit model, first stage of the ZIP estimation process. 95 % confidence bands are displayed as error bars around predicted values

Fig. 5
Observed versus GM-predicted correlation between total node strength and total weighted clustering coefficient (WCC). uPPML: unrestricted PPML. rPPML (ZIP, 2nd stage): second stage of the ZIP estimation process, restricted to non-zero flows only. Logit (ZIP, 1st stage): logit model, first stage of the ZIP estimation process. 95 % confidence bands are displayed as error bars around predicted values

Correlation results are in line with recent findings by Squartini et al. (2011a, b), who show that higher-order weighted properties in the ITN cannot be reproduced by any random model that takes as given the observed strength sequence (but does not control for the underlying binary structure). Here we show that a satisfactory replication of ITN properties can be achieved only if one fixes the binary structure and attributes link weights using a PPML-based GM estimation. As soon as the binary structure is badly reproduced, one also loses the possibility of correctly recovering weighted-network patterns, primarily because most weighted-network statistics are inherently dependent on the binary representation. Even then, satisfactory prediction outcomes are largely confined to linear transformations of link weights: weighted topological properties involving third-order statistics such as the WCC remain very difficult to predict even if one fixes the binary structure.

So far, we have studied the performance of GM predictions for weighted undirected statistics. In fact, total strength, ANNS and clustering all neglect the directed nature of trade flows and the ensuing asymmetries, as they do not discriminate between in- and out-links (i.e., import and export flows). To check whether the foregoing results also apply to directed weighted-network statistics, which instead take trade-flow directionality fully into account, we have studied predicted versus observed values of the population averages of such statistics and of their correlations. We have focused on in- and out-strength, and on the breakdown of ANNS and WCC into four directed statistics (see Table 5). In general, all results obtained above still hold. In particular, rPPML can easily reproduce all versions of average disaggregated ANNS, while it badly estimates the average of all directed clustering coefficients. Both uPPML and ZIP fail to capture average ANNS and WCC. All correlations related to disassortativityFootnote 16 are correctly predicted by rPPML, whereas both uPPML and ZIP always fail. All directed clustering-strength correlations are instead badly reproduced by every procedure, whether or not one fixes the binary structure. Once again, the ability to predict the binary (directed) structure of the ITN proves necessary, although not sufficient.

3.2 Binary statistics

Our weighted-network exercises show that a necessary condition for the GM to provide a satisfactory picture of ITN properties is that one restricts the estimation to strictly positive trade flows, i.e., that the observed binary structure is taken as given. The fact that binary trade links play a crucial role in explaining ITN weighted topology indicates that any GM aiming at endogenously estimating binary links must somehow take into account the discrete nature of the binary ITN and try to obtain a more accurate estimation of the exact location of the zeros in trade matrices.

But is the GM able to correctly predict the binary structure of the ITN? Our ZIP exercises indicate that, to some extent, a logit model captures the underlying binary structure better than a Poisson one. Therefore, in this section we ask whether the independent variables traditionally used in GM equations can predict whether a trade link exists, using a logit specification (i.e., the first stage of a ZIP estimation process).

More precisely, for any cross-section \(t\), we estimate:

$$\begin{aligned} Prob\{a_{ij}=1|x_{ij}\}=\frac{\exp \{x_{ij}\cdot \theta \}}{1+\exp \{x_{ij}\cdot \theta \}}=\Lambda (x_{ij};\theta ). \end{aligned}$$
(7)

Since Eq. (7) coincides with the functional form that we fit in the first stage of the ZIP estimation, we can employ first-stage estimates for a zero flow (\(\hat{\psi }_{ij}\)) from the ZIP model and build the predicted probability matrices \(\hat{\Xi }\), whose generic entry \(\hat{\xi }_{ij}=1-\hat{\psi }_{ij}\) represents the estimated probability of observing a directed link from country \(i\) to country \(j\) in that year.

As we did when building predicted weighted ITNs, we generate a sample of \(H\) independent adjacency matrices \(\hat{A}^h=\{\hat{a}_{ij}^h\}\), for \(h=1,\ldots ,H\), where in each sample \(\hat{a}_{ij}^h\) is drawn from a Bernoulli distribution with parameter \(\hat{\xi }_{ij}\), independently across all pairs \((i,j)\). More formally:

Definition 4

(Predicted Binary ITN) The predicted binary International Trade Network, for each given cross-section \(t\), is represented by the asymmetric binary matrix \(\hat{A}\), whose generic entries \(\hat{a}_{ij}\) are independent Bernoulli random variables with parameter \(\hat{\xi }_{ij}\).
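Definition 4 and Eq. (7) translate into a short simulation routine. The sketch below is illustrative only: the covariates (a constant and a distance-like variable), the coefficient vector `theta`, and all function names are hypothetical, whereas in the paper the probabilities \(\hat{\xi }_{ij}\) come from the first stage of the ZIP estimation.

```python
import math
import random


def logistic(z):
    """Lambda in Eq. (7): exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def link_probabilities(x, theta):
    """Matrix of link probabilities; x[i][j] is the covariate vector of pair (i, j)."""
    n = len(x)
    return [[0.0 if i == j else logistic(sum(a * b for a, b in zip(x[i][j], theta)))
             for j in range(n)] for i in range(n)]


def draw_adjacency(xi, rng):
    """One predicted binary ITN: independent Bernoulli draws with parameters xi[i][j]."""
    n = len(xi)
    return [[1 if i != j and rng.random() < xi[i][j] else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical covariates: a constant and a distance-like variable.
distance = [[0.0, 1.0, 4.0], [1.0, 0.0, 2.0], [4.0, 2.0, 0.0]]
x = [[[1.0, distance[i][j]] for j in range(3)] for i in range(3)]
theta = [1.0, -0.5]  # hypothetical coefficients, not estimates from the paper

xi = link_probabilities(x, theta)
rng = random.Random(42)
H = 2000
draws = [draw_adjacency(xi, rng) for _ in range(H)]

# Average density across draws versus the density implied by the probabilities.
simulated_density = sum(map(sum, (row for a in draws for row in a))) / (H * 6)
expected_density = sum(map(sum, xi)) / 6
```

Across many draws the simulated density settles near the density implied by the probabilities, although the position of individual links varies from draw to draw.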

In our exercises, we set as before \(H=10,000\). Our main results are reported in Figs. 6 and 7, where we plot observed binary statistics versus Bernoulli-Logit predicted ones (see Definition 4).Footnote 17

Fig. 6
Observed versus GM-predicted average statistics in the binary ITN. Logit estimation. Bernoulli: average statistics in the Bernoulli Predicted Binary ITN (see Definition 4). 95 % confidence bands are displayed as error bars around predicted Bernoulli values

Fig. 7
Observed versus GM-predicted correlation between statistics in the binary ITN. Logit estimation. Bernoulli: average statistics in the Bernoulli Predicted Binary ITN (see Definition 4). 95 % confidence bands are displayed as error bars around predicted Bernoulli values

To begin with, note that Bernoulli-Logit predictions can exactly replicate average total node degrees. This is not surprising: the predicted binary ITN preserves on average the observed density by construction, and average total node degree is simply proportional (by a factor \(N-1\)) to network density. The fact that a Logit estimation is on average able to predict observed density explains why a ZIP model, which employs the very same Logit specification in its first stage, predicts average total NS very well: that statistic is an average over all existing links and is not much affected by where these links are actually located. This is not true of ANNS and WCC, which are not perfectly reproduced by a ZIP model because they require more precise knowledge of where links are placed.
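The "by construction" step can be made explicit under two assumptions that are not stated in this section: that the logit in Eq. (7) is fitted by maximum likelihood, and that \(x_{ij}\) includes a constant term. The first-order condition for the intercept then forces the fitted probabilities to sum to the observed number of links \(L\), so that the expected number of links in the predicted binary ITN equals \(L\):

$$\begin{aligned} \sum _{i\ne j}\left( a_{ij}-\hat{\xi }_{ij}\right) =0 \quad \Longrightarrow \quad E\Big [\sum _{i\ne j}\hat{a}_{ij}\Big ]=\sum _{i\ne j}\hat{\xi }_{ij}=\sum _{i\ne j}a_{ij}=L. \end{aligned}$$

Writing density as \(\rho =L/[N(N-1)]\), the average number of links per node is \(L/N=(N-1)\rho \), which is the proportionality factor mentioned above.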

A similar problem arises in the binary ITN with Bernoulli-Logit predictions: they persistently underestimate observed average ANND and BCC. Again, this hints at an inherent inability of the GM to predict the presence of a link well (see middle panel of Fig. 6).

Things seem to improve somewhat when we move to the correlation structure. Bernoulli-Logit predictions capture binary disassortativity and clustering-degree correlation well, especially in the last part of the sample. Although on average observed point-correlations are rarely replicated, the inherent variability of this procedure allows one to conclude that there exists a sufficiently large number of simulations where predicted correlations are very similar to observed ones.

However, the relative success of the GM in replicating the correlation structure of the binary ITN should not necessarily be taken as a virtue of this model. Indeed, as Squartini et al. (2011a, b) have shown, both binary disassortativity and clustering-degree correlation can be easily replicated even by a null random model that preserves the observed (in/out) degree distribution and is otherwise fully random in the way links are created. Note that our exercises (not shown) indicate that the GM also attains a relatively poor performance in predicting the observed (total, in and out) degree distributions.Footnote 18 Therefore, from a purely predictive perspective, the GM can hardly be considered any better than other random null models that require much less information and attain a similar explanatory performance (albeit with an almost void economic interpretation).

4 Concluding remarks

In this paper, we have studied whether a gravity model (GM), the work-horse theoretical reference in international trade, can explain the statistical properties of the International-Trade Network.

Our exercises show that the GM does a decent job in replicating the weighted-network structure of the ITN only if one fixes its binary architecture equal to the observed one. More generally, the GM performs very badly when asked to predict the presence of a link, or the level of the trade flow it carries whenever the binary structure must be simultaneously estimated. Furthermore, even when the binary structure perfectly replicates the observed one, the GM is not able to explain higher-order statistics that, like clustering, require the knowledge of triadic link-weight topological patterns.

Our binary analysis also shows that the GM turns out to be a good model for estimating trade flows, but not one that can explain why a link in the ITN gets formed and persists over time. In other words, knowing country-specific variables (country GDP, etc.) and country bilateral interactions (bordering conditions, belonging to the same RTA, etc.) is not enough to predict the presence of a link. However, conditional on the information that a link exists, such variables predict well how much trade that link actually carries. From a binary perspective, the GM can reproduce well the overall density of the ITN (i.e., the number of trade relationships) and therefore the number of zeros in trade matrices, but not where the ones are expected to be located exactly (Eaton et al. 2012). Furthermore, the ability of the GM to replicate binary disassortativity and clustering-degree correlation is comparable to that of null network models that perfectly match such statistics by relying only on the knowledge of the degree distribution (Squartini et al. 2012).

Notice that these results are largely independent of which variables actually enter the gravity equation we fit to the data. In the foregoing exercises, we have used a standard specification where many of the most-employed GM variables enter the regression. We have also tried to augment the equation with other explanatory variables that turned out not to be statistically significant but can nevertheless improve the percentage of explained trade-flow variance, without observing any dramatic increase in the goodness of fit of ITN network statistics.

In order to better explain the topological properties of the ITN, many alternative strategies may be pursued. First, one may consider augmenting a GM specification with network-related variables (Ward and Ahlquist 2012). It may indeed be argued that if the standard economic variables entering the GM are not enough to explain link formation, perhaps this is because the presence of a link between any two countries might actually be explained by the very local structure of the network (e.g., degrees of the two countries, etc.). Of course this introduces some endogeneity into the problem, because the presence of a link in turn affects local network properties. By properly dealing with endogeneity issues in estimation, one can hope to better explain the binary structure of the ITN.

Second, one might borrow social-network statistical methodologies currently employed to model the evolution of directed graphs over time as continuous-time Markov processes (Snijders 2005). For example, one may envisage setups where each single node chooses its outgoing link (i.e. whether to export to another country or not) based on a myopic optimization of some objective function, where the latter may be the result of many firm-level decisions within the origin country.

Finally, one may explore international-trade models where the decision of a firm located in country A to export goods to country B, which possibly never imported products from A before, is rooted in a more detailed micro-foundation. This may require blending together two strands of literature, one on the role of heterogeneous firms in international trade (Melitz 2003; Bernard et al. 2007) and the other on models of trade network formation based on simple aggregate dynamics (Garlaschelli and Loffredo 2004; Bhattacharya et al. 2008; Riccaboni and Schiavo 2010).