1 Introduction

Different major classes of networks are routinely associated with their implied distributions of degrees, i.e., the number of links of the nodes they generate. The major prototypes are Erdös–Renyi and scale-free networks. The former are random networks characterized by a constant probability of existence of a link, which leads to a Binomial distribution of links that converges toward the Poisson distribution for large networks. Scale-free networks mark the opposite end of the spectrum in that they generate a very broad distribution of links via some kind of amplification mechanism (like preferential attachment of new nodes to those that already possess a large number of connections). As a result, the degree distribution emerging from such a generating mechanism is very heterogeneous, and its scale-free behavior corresponds to a power-law decay of the distribution of links over its entire range or at least in the upper tail region. Almost all of the related literature focuses on these two possibilities. However, the Poisson and power-law distributions certainly do not constitute an exhaustive list of candidate distributions for the number of links in a network setting. Indeed, classes of networks exist which focus on properties other than the degree distribution and for which no general results for the distribution of links are available. Examples are ‘small-world’ networks, which are defined by a small average distance between nodes (Watts and Strogatz 1998), or ‘core-periphery’ networks, which are defined by a dichotomic classification of nodes into a core group and its periphery (Borgatti and Everett 1999). Members of both classes may or may not also exhibit an (asymptotically) power-law-like distribution of links. The extent to which these different categorizations overlap or exclude each other seems to be unknown and has not received any attention so far. 
However, the existence of such alternative categorizations of classes of networks and their pertinent generating mechanisms makes it likely that for some empirical networks, other distributions than the Poisson and scale-free could better describe the data.

This should also apply to financial networks, for which the asserted scale-free behavior has already been disputed in certain cases (cf. Fricke and Lux 2015). Due to the dominance of the Erdös–Renyi and scale-free paradigms, theoretical modeling has typically made use only of these two classes of models (Nier et al. 2008; Haldane and May 2011; Anand et al. 2013; Krause and Giansante 2013). When the link structure of a theoretical model is generated in this way, any inference on the stability of the network and its susceptibility to contagion effects after shocks is determined to a large extent by the (known) properties of the pertinent class of models. Hence, the extent of contagious cascade effects might be underestimated or overestimated because of deviations of these theoretical benchmarks from the empirical structure. It, thus, appears worthwhile to expand the range of candidate distributions and generating mechanisms beyond these classical ones, as it appears likely that the distribution of links is often located somewhere between these extremes. A better and hopefully robust characterization of the degree distribution should, therefore, be valuable input to inform the mushrooming literature on network contagion studies of the banking sector.

Continuing the line of research initiated by Fricke and Lux (2015), this paper will look at some intermediate distributions from the large class of compound Poisson distributions (Karlis and Xekalaki 2005) that have been found appropriate for modeling discrete events in various fields but have seemingly not been applied to the discrete variables defined by the counts of the number of links within a network so far. We will focus here on the Poisson–Gamma and Poisson–Pareto distributions along with the original Poisson and discrete Pareto (aka power law or scale-free) distribution and will compare the performance of these four alternatives for three important data sets: one covering interbank credit connections and the other two capturing the network structure of bank-firm loans. As another novel feature within the financial network literature, we will also apply most of the mentioned distributions within a regression framework. In this way, we can identify the influence of certain characteristics of the nodes on their propensity to form links.

We will estimate these models for three large data sets of financial linkages due to loan contracts: interbank loans contracted via the electronic trading platform e-MID, and loans of financial institutions to non-financial firms in the Spanish and Japanese economies. All data sets are available over at least one decade. The e-MID data contain daily recordings of all interbank loans, while the other two data sets have yearly granularity. As it will turn out, heterogeneity is pervasive in all three data sets along various dimensions: There is both a change over time of the shape of the estimated distributions as well as a highly significant influence of whether banks/firms belong to some basic classes of agents that can be distinguished in the data. For the Japanese data set, we can also identify an influence of certain balance sheet statistics on the degrees of banks (for the other data sets, such covariates are not available). These exogenous effects are mostly very robust as they appear in a qualitatively similar way in all distributions under consideration. Irrespective of inclusion of exogenous effects or not, in almost all cases, the Negative Binomial exhibits the best fit and dominates all alternatives at any standard level of significance.

The rest of the paper is structured as follows: Sect. 2 introduces the various distributions under investigation and their use as regression models. Section 3 describes our data, and Sect. 4 provides the empirical results. Section 5 concludes.

2 Statistical models

Since degree distributions are by definition distributions of discrete variables, the present paper confines itself to comparing the performance of discrete distributions. The simplest benchmark is the Poisson distribution given by:

$$\begin{aligned} P(x)=\frac{e^{-\lambda }\lambda ^x}{x!}, \end{aligned}$$
(1)

where x is the number of links (the degree of a node) and \(\lambda \) the unique parameter of this distribution function. We note that empirical degree distributions are typically truncated at zero, simply because pertinent data are only collected for entities that are at least minimally connected to the network under investigation. Hence, in such applications we would have to use the truncated Poisson distribution which is given by:

$$\begin{aligned} P_{T}(x)=\frac{e^{-\lambda }\lambda ^x}{x!(1-e^{-\lambda })}, \end{aligned}$$
(2)

where the additional term in the denominator adjusts for the ‘missing’ zero of the empirical data (note that \(P(0)=e^{-\lambda }\) in the original Poisson distribution).
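
As a minimal sketch of Eqs. (1) and (2), the following Python functions evaluate both probability mass functions and estimate \(\lambda \) for zero-truncated data. The function names are hypothetical, and the estimator relies on the standard first-order condition of the truncated Poisson likelihood (sample mean equal to \(\lambda /(1-e^{-\lambda })\)), which is solved here by simple bisection rather than by whatever routine was used for the estimates reported in this paper.

```python
import math

def poisson_pmf(x, lam):
    """Eq. (1): Poisson probability of observing x links (lam > 0)."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def truncated_poisson_pmf(x, lam):
    """Eq. (2): Poisson conditioned on x >= 1."""
    if x < 1:
        return 0.0
    # -expm1(-lam) equals 1 - exp(-lam) but stays accurate for small lam
    return poisson_pmf(x, lam) / (-math.expm1(-lam))

def truncated_poisson_mean(lam):
    """Mean of the zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / (-math.expm1(-lam))

def fit_truncated_poisson(degrees, iterations=200):
    """ML estimate of lam: solve mean(degrees) = lam / (1 - exp(-lam))."""
    m = sum(degrees) / len(degrees)
    if m <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive estimate")
    lo, hi = 0.0, m  # the truncated mean exceeds lam, so the root lies below m
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_poisson_mean(mid) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `fit_truncated_poisson([1, 2, 3, 4])` returns the value of \(\lambda \) whose truncated mean equals the sample mean of 2.5, which is smaller than 2.5 because the missing zeros pull the untruncated mean down.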

The Poisson distribution approximates the exact Binomial distribution of degrees in Erdös–Renyi networks with a high degree of accuracy if the networks are not too small. Since all our applications would be based on at least three-digit numbers of nodes, the Poisson estimates should be virtually identical to estimates for a Binomial distribution.

The power law characterizing scale-free networks is usually described and estimated in its continuous version, i.e., \(p(x)\sim x^{-\alpha }\). However, this of course neglects the discrete nature of the data. The discrete counterpart of the continuous Pareto distribution is also known as the Zipf or Zeta distribution, and it is given by the probability mass function:

$$\begin{aligned} P_{\alpha }(x)=\frac{x^{-\alpha }}{\zeta (\alpha )}, \end{aligned}$$
(3)

where \(\zeta (\alpha )\) is the zeta function \(\zeta (s)=\sum _{n=1}^\infty \frac{1}{n^s}\). No adjustment for the lack of zeros is needed in this case as the support of the discrete Pareto covers only positive integers. Besides the elementary Poisson and the discrete Pareto, the most frequently encountered classes of discrete distribution functions are compound Poisson distributions. Two of these are used in this paper: The first is the Negative Binomial (NBD) which results if the parameter \(\lambda \) of the original Poisson distribution (1) is drawn from a Gamma distribution. Note that this amounts to drawing the realizations from a family of Poisson distributions with heterogeneous mean values and hence can be seen as a reflection of heterogeneity of the statistical features of the nodes in a network. We adopt here the following functional form of the Negative Binomial:

$$\begin{aligned} N(x)=\frac{\varGamma (\theta + x) \tau ^\theta (1-\tau )^x}{\varGamma (1+x) \varGamma (\theta )} \quad {\hbox {with}} \quad \tau = \frac{\theta }{\theta + \lambda } \end{aligned}$$
(4)

with \(\varGamma (.)\) the gamma function, \(\varGamma (n)=(n-1)!\), and \(\theta \) and \(\lambda \) the two parameters for the shape of the distribution. Alternative functional forms can be found in Greene (2008). The one of Eq. (4) is preferred in the present context as it can be easily related to the Poisson distribution, since the mean value is in both cases identical to the pertinent parameter \(\lambda \) and the Negative Binomial converges to the Poisson for \(\theta \rightarrow \infty \). The Negative Binomial has become hugely popular in many applications featuring discrete data as it is able to capture the widespread phenomenon of overdispersion, i.e., the variance exceeding the mean. Namely, while it is well known that the variance of the Poisson distribution is \(Var_{P}(x)=\lambda \), for the Negative Binomial we obtain \(Var_{N}(x)=\lambda (1+\frac{\lambda }{\theta })>\lambda \). For applications without zero counts, we also need to adjust the Negative Binomial in an appropriate way to obtain its truncated version:

$$\begin{aligned} N_{T}(x)=\frac{\varGamma (\theta + x) \tau ^\theta (1-\tau )^x}{\varGamma (1+x) \varGamma (\theta ) (1-\tau ^\theta )}. \end{aligned}$$
(5)
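
The properties of the Negative Binomial stated above can be checked numerically. The following sketch (hypothetical function names, Python standard library only) evaluates Eq. (4) and its zero-truncated version \(N(x)/(1-N(0))\) with \(N(0)=\tau ^\theta \), working in logarithms to avoid overflow of the gamma function for large degrees.

```python
import math

def nb_pmf(x, lam, theta):
    """Eq. (4): Negative Binomial with mean lam and parameter theta."""
    tau = theta / (theta + lam)
    log_p = (math.lgamma(theta + x) - math.lgamma(x + 1) - math.lgamma(theta)
             + theta * math.log(tau) + x * math.log1p(-tau))
    return math.exp(log_p)

def nb_truncated_pmf(x, lam, theta):
    """Zero-truncated version: N(x) / (1 - N(0)) with N(0) = tau**theta."""
    if x < 1:
        return 0.0
    tau = theta / (theta + lam)
    return nb_pmf(x, lam, theta) / (1.0 - tau ** theta)

def mean_and_variance(pmf, upper):
    """Moments of a pmf on 0..upper (upper must cover the relevant mass)."""
    mean = sum(x * pmf(x) for x in range(upper + 1))
    second = sum(x * x * pmf(x) for x in range(upper + 1))
    return mean, second - mean ** 2
```

With \(\lambda =3\) and \(\theta =2\), `mean_and_variance(lambda x: nb_pmf(x, 3.0, 2.0), 400)` reproduces the mean \(\lambda \) and the variance \(\lambda (1+\lambda /\theta )\), and for very large \(\theta \) the values of `nb_pmf` approach the Poisson probabilities of Eq. (1).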

The Negative Binomial enjoys an almost legendary reputation in marketing as the most versatile tool for fitting purchase frequencies of consumer goods. This literature has been initiated by Ehrenberg (1959) and surveyed by Schmittlein et al. (1985).

The last candidate to be considered in this paper is the Pareto–Poisson mixture. This compound model had been studied before in the actuarial literature (cf. Albrecht 1984) and has been proposed by Lux (2016) as a model for the degree distribution of credit networks. The justification for this functional form was the plausible observation that the number of credit links of both banks and non-financial firms is increasing with their balance sheet size (de Masi and Gallegati 2012; de Masi et al. 2011). Taking the size of the underlying entity as a latent variable in a compound Poisson model and taking into account that firm size distributions are close to a power or Pareto law (see Footnote 1) leads to a formalization in which the shape parameter of the Poisson distribution is drawn from a Pareto law:

$$\begin{aligned} PP(x)=\int _{{\underline{\lambda }}}^\infty \frac{e^{-\lambda } \lambda ^x}{x!} \alpha \frac{{\underline{\lambda }}^\alpha }{\lambda ^{\alpha + 1}} \mathrm{d}\lambda \end{aligned}$$
(6)

which defines a family of distributions with two parameters, \(\alpha \) and \({\underline{\lambda }}\). A closed-form solution for the integral in Eq. (6) is not available, so that the probability mass function can only be solved via numerical integration. In Eq. (6), \(\alpha \) is the usual shape parameter of the Pareto distribution (note that since the latent variable ‘firm size’ is a continuous variable, we can adopt here the standard Pareto law), and \({\underline{\lambda }} > 0\) is a lower boundary for the latent variable which is necessary to guarantee convergence of the integral. Again, we need the zero-truncated counterpart of Eq. (6) which formally we obtain by setting:

$$\begin{aligned} PP_T (x)=\frac{PP(x)}{1-PP(0)} \end{aligned}$$
(7)

which again is obtained by numerical integration. It is worthwhile to add that most applications (e.g., in marketing) use the Poisson and Negative Binomial as regression models (cf. Hilbe 2007), i.e., apply them for modeling the dependency of variables obeying such distributions on exogenous variables. While network data have, to the best of my knowledge, only been described via unconditional distributions so far, a regression perspective would be most informative if additional information on the characteristics of the nodes were available. The Poisson and Negative Binomial models can be embedded into a regression framework by setting:

$$\begin{aligned} \lambda _i = exp(\mu + \mathbf{y }_{i}' \beta ) \end{aligned}$$
(8)

where \(\mathbf{y }_i\) is a vector of covariates and \(i=1,\ldots ,N\) indexes the nodes of the network. This adds node-specific heterogeneity even in the Poisson model and, in the case of the Negative Binomial, could be interpreted as a combination of both observable and unobservable heterogeneity, the latter being represented by the Gamma mixing distribution.
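
To illustrate how Eq. (8) enters estimation, the sketch below writes down the log-likelihood of the zero-truncated Negative Binomial with node-specific \(\lambda _i\). The function name, the ordering of the parameter vector and the use of \(\log \theta \) as a free parameter (to keep \(\theta \) positive) are assumptions made for illustration; any numerical optimizer could then be used to maximize this function.

```python
import math

def truncated_nb_loglik(params, degrees, covariates):
    """Log-likelihood of the zero-truncated Negative Binomial with
    lam_i = exp(mu + y_i' beta) as in Eq. (8).

    params = (mu, beta_1, ..., beta_k, log_theta)  # hypothetical layout
    degrees = observed degrees x_i >= 1
    covariates = one list of k covariate values per node
    """
    mu, beta, theta = params[0], params[1:-1], math.exp(params[-1])
    total = 0.0
    for x, y in zip(degrees, covariates):
        lam = math.exp(mu + sum(b * v for b, v in zip(beta, y)))
        tau = theta / (theta + lam)
        total += (math.lgamma(theta + x) - math.lgamma(x + 1)
                  - math.lgamma(theta)
                  + theta * math.log(tau) + x * math.log1p(-tau)
                  - math.log1p(-tau ** theta))  # zero-truncation term
    return total
```

Setting all entries of `beta` to zero recovers the unconditional model, so a likelihood ratio test of the covariates only requires evaluating the maximized function with and without them.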

I am not aware of any previous use of the Poisson–Pareto model within a regression framework. Nevertheless, this family can also easily be cast into such a format. It can be shown that the mean of Eq. (6) is \(E[x]=\frac{\alpha }{\alpha - 1} {\underline{\lambda }}\) and so it seems most natural to allow exogenous effects to enter via \({\underline{\lambda }}\):

$$\begin{aligned} {\underline{\lambda }}=\hbox {exp}(\mu + \mathbf{y }_{i}' \beta ) \end{aligned}$$
(9)

While Eq. (9) is motivated by the Poisson regression framework, it also allows inference on the influence of exogenous factors if the mean actually does not exist, i.e., if \(\alpha \le 1\) holds.
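
The expression for the mean follows from the law of iterated expectations: conditional on \(\lambda \) the mean of the Poisson is \(\lambda \), and the mean of a Pareto variable with lower bound \({\underline{\lambda }}\) is \(\frac{\alpha }{\alpha - 1} {\underline{\lambda }}\) for \(\alpha > 1\). The following sketch evaluates Eqs. (6) and (7) numerically and can be used to check this. It is only one possible implementation: the substitution \(u=\log \lambda \), the use of Simpson's rule, the finite upper cutoff and all function names are assumptions, not the integration routine used for the results of this paper.

```python
import math

def poisson_pareto_pmf(x, alpha, lam_min, steps=2000):
    """Eq. (6) by Simpson's rule after substituting u = log(lambda).

    steps must be even. The infinite upper limit is replaced by a finite
    cutoff beyond which the Poisson kernel exp(-lambda) is negligible.
    """
    centre = max(float(x), lam_min)
    upper = centre + 12.0 * math.sqrt(centre + 1.0) + 40.0
    a, b = math.log(lam_min), math.log(upper)
    log_const = (math.log(alpha) + alpha * math.log(lam_min)
                 - math.lgamma(x + 1))

    def integrand(u):
        # log of: Poisson(x | e^u) * Pareto density at e^u * Jacobian e^u
        return math.exp(log_const - math.exp(u) + (x - alpha) * u)

    h = (b - a) / steps
    total = integrand(a) + integrand(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3.0

def poisson_pareto_truncated_pmf(x, alpha, lam_min):
    """Eq. (7): the Poisson-Pareto mixture conditioned on x >= 1."""
    if x < 1:
        return 0.0
    p0 = poisson_pareto_pmf(0, alpha, lam_min)
    return poisson_pareto_pmf(x, alpha, lam_min) / (1.0 - p0)
```

For \(\alpha =3\) and \({\underline{\lambda }}=1\), summing \(x\,PP(x)\) over the first few hundred degrees gives a value close to \(\frac{\alpha }{\alpha -1}{\underline{\lambda }}=1.5\). For \(\alpha \le 1\) the same sum keeps growing with the summation limit, in line with the non-existence of the mean.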

In contrast, no straightforward way suggests itself to add a regression framework to the discrete Pareto distribution, and so we just apply this alternative in its unconditional format. Since not too much knowledge is available in our data set on the characteristics of individual nodes, the regression framework model is used to allow for fixed effects of different years, as well as different categories of actors the nodes belong to and so we can investigate whether this categorization is of relevance for the number of their links. In the case of the Japanese data set, we are able to add non-categorical covariates as these data come with balance sheet information besides the identities of borrowers and lenders.

It appears worthwhile to note that the statistical fitting of degree distributions is just one level of analysis in network research. Another, equally important approach would consist in modeling not the degrees of the actors, but the particular structure of links within the network. In such an analysis, the effects of actor-specific and dyadic characteristics as well as various network effects themselves (such as reciprocity, transitivity or closure of subsets) would be investigated. The method of choice in recent literature for such analyses is the so-called exponential random graph model (ERG) that basically captures all candidate factors of influence in an exponential function determining the linking probabilities between each pair of actors (cf. Lusher et al. 2013). While our fitting of degree distributions cannot shed any light on endogenous factors such as reciprocity in the formation of a specific network, it can provide information on important covariates that should be included when estimating an ERG model for the same data. It is plausible that any significant explanatory variables for the degrees of the actors should exert their influence via a higher or lower probability of link formation of these actors under certain circumstances, and so it would be surprising not to find these variables also entering significantly in an ERG model. The same applies to behavioral analyses of network formation on the basis of longitudinal data and actor-based models of link formation over time (cf. Finger and Lux 2017 for such an analysis for the interbank market that yields results from a different perspective that are broadly in harmony with those reported below).

3 The data

We consider three large data sets of credit links: The first covers all transactions in the interbank money market conducted within the electronic trading platform e-MID over the years 1999–2014.

The second data set is a comprehensive database of credit extended from banks to non-financial firms in Spain which has been extracted from the SABI (Sistema de Análisis de Balances Ibéricos) archive based on the public commercial registry in Spain. This complete list of bank connections of all publicly registered companies is available for each year from 1997 to 2008 comprising more than 500,000 links between individual banks and their borrowers. Our third data set is a similarly large record of credit links between banks and non-financial firms in Japan collected by Nikkei Media Marketing, Inc., from financial statements of the firms included. These data are available for us over the period from 1979 through 2011. The Japanese data set also includes a variety of balance sheet items of which we construct some key financial statistics to be included among the covariates of Eqs. (8) and (9) .

All three sources have been used in other studies before: The SABI database has been used by Illueca et al. (2014) who study the effects of the regional expansion of Spanish saving banks during the real-estate boom of the years after the introduction of the Euro. The Japanese data have been investigated from a network perspective by Marotta et al. (2015). The e-MID data feature prominently in quite a number of contributions to financial network theory (e.g., de Masi et al. 2006; Fricke and Lux 2015) as it is the only commercially available data set in this area.

Table 1 Basic statistics of network data

Table 1 provides some basic information on the networks defined by the pooled data of our three samples. The “Appendix” contains histograms of the seven degree distributions that we attempt to model with the statistical distributions presented in the preceding section. First, the e-MID interbank market loans give rise to a unipartite network, i.e., there is only one type of actor (banks), connected via the provision of credit. We, therefore, have only one degree distribution. To be precise, the underlying data here use the set union of the degree distributions extracted from the 64 available quarters 1/1999 to 4/2014. Following Finger et al. (2013), we use such a large level of time aggregation, since at the high-frequency end (e.g., for daily data), the resulting networks would be very sparse. Presumably, over a short time horizon, only a small fraction of all existing links (credit lines) get activated and, thus, high-frequency data would provide us with a very small sample of what we want to measure: namely, existing contacts between banks that could be used to obtain a loan in the money market if the need arises. This view is also supported by the finding that many network statistics (such as density and reciprocity) are very volatile at the high-frequency end and become more stable at around the monthly to quarterly level of aggregation. Hence, we define a link between two banks to exist if they have traded with each other at least once within a quarter and merge the 64 distributions of degrees obtained in this way into a single one.
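
The aggregation step just described can be sketched as follows. The record layout `(year, quarter, bank_a, bank_b)` and the function name are hypothetical, and links are treated as undirected here, which is an assumption suggested by the definition of a link as two banks having traded with each other at least once.

```python
from collections import defaultdict

def pooled_quarterly_degrees(trades):
    """Pool the degrees of all quarterly networks into one list.

    trades: iterable of (year, quarter, bank_a, bank_b) records, one per
    transaction (hypothetical layout). A link exists in a quarter if the two
    banks traded at least once in that quarter, regardless of direction.
    """
    links = defaultdict(set)  # (year, quarter) -> set of unordered bank pairs
    for year, quarter, a, b in trades:
        if a != b:
            links[(year, quarter)].add(frozenset((a, b)))
    pooled = []
    for period in sorted(links):
        degree = defaultdict(int)
        for pair in links[period]:
            for bank in pair:
                degree[bank] += 1
        # one observation per bank and quarter in which the bank was active
        pooled.extend(degree.values())
    return pooled
```

Because only banks with at least one link in a quarter enter the pooled list, the resulting sample contains no zeros, which is the reason for using the zero-truncated distributions of Sect. 2.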

The resulting degree distribution is very broad (cf.  Fig. 1 in “Appendix”): The number of counterparts in the interbank market assumes values from a minimum of one to a maximum of 273 which is close to the maximum number of active banks in any quarter. The slow decay of the histogram could be indicative of a power law, but the visual impression from such histograms is usually not reliable enough to discriminate between a power law and other strongly skewed alternatives. At the same time, the pronounced heterogeneity of the degree distribution could be an artifact resulting from heterogeneity of covariates rather than a particular shape of the distribution. These questions will be addressed in the next section.

The Spanish and Japanese data on bank loans to non-financial firms allow us to study different types of degree distributions: First, the original bipartite network provides us with the degree distribution of banks, i.e., the number of borrowers they extend credit to within a certain span of time (one year in our analysis). Besides this straightforward concept, another type of degree distribution can be obtained from the so-called one-mode projection of the original data set. This is the projection of the original adjacency matrix of links, say A, of the bipartite data onto a symmetric matrix for banks only that identifies whether two banks have at least one borrower in common, or do not have any overlap within their group of borrowers. This matrix, say B, is obtained as \(B=A^{T}A\).
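
A small sketch of the one-mode projection is given below. It assumes that the rows of A index firms and the columns index banks, so that \(A^{T}A\) is a bank-by-bank matrix; this orientation and the function names are assumptions made for illustration.

```python
def one_mode_projection(A):
    """B = A^T A for a firms-by-banks 0/1 matrix A given as a list of rows."""
    n_banks = len(A[0])
    B = [[0] * n_banks for _ in range(n_banks)]
    for row in A:  # one firm at a time
        lenders = [j for j, a in enumerate(row) if a]
        for j in lenders:
            for k in lenders:
                B[j][k] += 1  # banks j and k share this firm as a borrower
    return B

def co_lending_degrees(B):
    """Number of other banks sharing at least one borrower with each bank."""
    return [sum(1 for k, b in enumerate(row) if k != j and b > 0)
            for j, row in enumerate(B)]
```

The off-diagonal entries of B count the borrowers two banks have in common, while the diagonal entries reproduce each bank's degree in the bipartite network; the degree of the projection only uses whether an off-diagonal entry is positive.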

The resulting degree distribution provides information on joint lending, i.e., the number of other banks with an overlapping population of client firms. As it turns out, the latter degree distribution for Spanish banks is characterized by statistics very similar to those of the degree distribution of the e-MID interbank network, and the visual appearance of their histograms is also relatively similar (cf. Figs. 1, 3). In contrast, the degree distribution of banks from the original bipartite network is much broader (stretching out to a maximum of about 48,500 links of the most active bank), and the pertinent histogram (Fig. 2) also points to a higher degree of heterogeneity. This is confirmed by its degree of overdispersion, which is much higher for the histogram of Fig. 2, with a value of 13,915, than for those of Fig. 1 (\(\sim 32\)) and Fig. 3 (\(\sim 40\)). For the sake of completeness, we also include the degree distribution of firms from the bipartite network in our analysis. In contrast to that of banks, this one has a very narrow range with a maximum degree of 10 and is actually characterized by underdispersion rather than overdispersion (the variance being smaller than the mean). It, thus, appears questionable whether the right-skewed distributions would add much explanatory power to a baseline Poisson model.

As concerns the Japanese data set, Fig. 5 shows again a broad, right-skewed degree distribution of banks in the bipartite network. With an overdispersion of 445, this one appears somewhat less extreme than its Spanish counterpart. The degree distribution for the one-mode projection of banks is the only one that clearly deviates from a well-behaved distribution function with decaying probabilities for higher degrees. Quite in contrast, this histogram (Fig. 6) has the smallest probabilities on the left-hand side and higher occupation numbers on the right. Its degree of overdispersion is about 7.5 and, therefore, much lower than those of the other degree distributions for banks. Joint lending of many banks to the same company seems, thus, to be much more common in Japan than in Spain. This is also confirmed by the broader degree distribution of Japanese firms as compared to its Spanish counterpart (Fig. 7). This one again obeys the expected form of a degree distribution.

Figures 8 through 12 provide log–log plots of the inverse cumulative distribution functions of all the bank-related networks of our study. As is well known, data drawn from a Pareto distribution will be characterized by a linear shape of their density or cumulative distribution in a log–log plot. While estimation of the Pareto index via a regression in log coordinates might not be the most efficient way to identify this parameter, such plots provide a useful indication of whether the data are close to a Pareto law at all. Clauset et al. (2009) showed that the cumulative densities are more reliable as a diagnostic tool and, hence, we have chosen the latter to shed light on the distributional characteristics of our data. In all the plots of Figs. 8 through 12, we have used 100 equidistant bins for the representation of the inverse cumulative distributions. The impression from all these graphical displays is very uniform: In no case is there any indication of a linear shape of the cumulative distribution in the tail or in any intermediate region and, hence, the simple Pareto law should be a misspecified model for our data. In general, the figures suggest that in all cases the tail region is thinner than expected under a Pareto law, which could be due to either a different overall functional form of the degree distribution or heterogeneity of the nodes.
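
The diagnostic behind such plots can be sketched in a few lines. The code below takes the inverse cumulative distribution to mean \(P(X \ge x)\) and evaluates it at every distinct observed degree rather than in 100 equidistant bins; both choices, as well as the function names, are assumptions made to keep the example short.

```python
import math
from collections import Counter

def empirical_ccdf(degrees):
    """Return sorted (x, P(X >= x)) pairs for every distinct observed degree."""
    n = len(degrees)
    counts = Counter(degrees)
    remaining = n
    points = []
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points

def loglog_slopes(points):
    """Local slopes of log P(X >= x) against log x between neighbouring points."""
    logs = [(math.log(x), math.log(p)) for x, p in points]
    return [(p1 - p0) / (x1 - x0)
            for (x0, p0), (x1, p1) in zip(logs, logs[1:])]
```

Under a Pareto law the local slopes would be roughly constant in the tail, whereas slopes that become steadily more negative for larger degrees indicate the kind of curvature described above.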

Similar conclusions can be drawn on the basis of rank-size plots (e.g., Gabaix and Ibragimov 2011), which under a Pareto law would again turn out to have a linear shape in a logarithmic representation, but in our case always show a curvature characteristic of an exponential rather than Pareto-type decline (these alternative graphical representations are available upon request). This evidence against the Pareto distribution provides the motivation for the estimation of alternative distributional forms which, as an added benefit, can also be applied as regression models and, hence, allow the assessment of the explanatory power of important exogenous variables for the variation of the degrees of banks across types and over time. The next section will report parameter estimates of these alternative models along with the results of explicit tests of goodness of fit of these alternatives against the Pareto distribution and of different specifications against the others.

4 Empirical results

We now move on to the results of estimating various discrete models for the degree distributions computed for these data sets.

4.1 Interbank Loans from e-MID Platform

We first turn to the interbank credit data from the e-MID trading platform. These data have been scrutinized relatively intensively in various previous papers. Among those, de Masi et al. (2006) reported power-law exponents between 2 and 3 for the distribution of degrees. Fricke and Lux (2015) questioned this result, showing that the histograms of the degree distribution hardly resemble a power law. Their results are also confirmed by the obvious nonlinear shape of the cumulative density of these data depicted in Fig. 8. Fitting a variety of both continuous and discrete distributions, Fricke and Lux (2015) find that the power law is dominated by many other distributions in terms of proximity to the empirical distribution (evaluated via the Kolmogorov–Smirnov statistic). Which distribution gets closest to the data varies with the level of time aggregation and across subsamples of the data.

Here we complement this analysis in various ways: First, we use tests based upon likelihood comparisons. Second, we use a larger sample for comparison, namely all banks that have been operating in the money market within the e-MID platform (while Fricke and Lux have confined their analysis to Italian banks which constitute the majority of e-MID users). Third, we do not only estimate the parameters of unconditional distributions, but also apply the Poisson, Negative Binomial and Poisson–Pareto mixture as regression models which also allows us a certain assessment of the value added of including exogenous variables to explain the distributions of degrees. Since the implementation of a regression framework is less straightforward in the case of the discrete Pareto distribution, we estimate only the one shape parameter of this family.

As explained in the previous section, we have aggregated our data into quarterly networks and we have pooled the degree distribution of the 64 quarters from 1999 through 2014 in our statistical analysis. The latter step could be considered problematic as our time span of 16 years covers very different periods: an expansive phase after the launch of this market in which transaction volume and number of market participants had been increasing sharply, the reduction of activity after the outbreak of the financial turmoil in 2007–2008 and the subsequent operation of the exchange at a reduced level of turnover. Since this exchange is operated by a company based in Milan, Italy, it has always been predominantly used by Italian banks. However, the fraction of non-Italian banks increased sharply prior to the crisis and collapsed again during the aftermath of the financial turmoil. Finger et al. (2013) and Fricke and Lux (2015) observed that both Italian and non-Italian banks have been mostly trading with counterparts from their own group, so that the network consisted of two largely distinct clusters. Given this outline of the history of trading within the e-MID electronic market, we would expect both time heterogeneity of the distribution of degrees and an influence of the geographic location of the banks using this system. Because of this clustering, Fricke and Lux (2015) neglected the non-Italian participants and focused their analysis on the majority of market participants operating under Italian law. In our regression framework, we can allow for differences by including country-specific fixed effects. Since, except for Italy, countries are hardly ever represented by more than a handful of banks, we restrict ourselves to using a dummy for non-Italian origin. In order to account for temporal variation, we additionally include 15 yearly dummies (\(\beta _{2000}\) to \(\beta _{2014}\)) for the years 2000 to 2014.
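
The covariate vector implied by this specification can be written down explicitly. In the sketch below, the function name and the ordering of the entries are hypothetical; the baseline case with all dummies equal to zero is an Italian bank observed in 1999.

```python
def dummy_vector(year, is_italian, first_year=1999, last_year=2014):
    """Covariates for one bank-quarter observation.

    Layout (an assumption): [non-Italian dummy, d_2000, ..., d_2014].
    The first year has no dummy and therefore serves as the baseline.
    """
    if not first_year <= year <= last_year:
        raise ValueError("year outside the sample period")
    yearly = [1 if year == y else 0
              for y in range(first_year + 1, last_year + 1)]
    return [0 if is_italian else 1] + yearly
```

A vector built in this way can serve as \(\mathbf{y }_i\) in Eqs. (8) and (9), with all quarters of a given year sharing the same yearly dummy.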

Table 2 Pooled Italian Interbank Credit, 1999–2014, time dummies and dummies for Italian/non-Italian origin

Table 2 exhibits the results of the estimation of the distributions presented in Sect. 2 for this data set together with the factors entering as determinants of their mean. We find the best fit for the Negative Binomial, followed by the Poisson–Pareto mixture and the discrete Pareto distribution (without exogenous factors). Here and in almost all other applications, the Poisson distribution provides a markedly worse fit than the other alternatives. The yearly dummies and the dummy for non-Italian banks are all highly significant according to their t-statistics. The coefficients are very close to each other for both the Poisson and Negative Binomial distributions and behave qualitatively similarly under the Poisson–Pareto mixture. Essentially, the coefficients depict an almost monotonic decline of the mean degree which is first caused by mergers and acquisitions and the resulting reduction of the number of market participants and later by the strong decline of interbank trading during the financial crisis. Coefficients are, in fact, almost identical for the years 2002–2007, and the years 2009–2014, respectively, so that one can recognize the well-known phases in the development of this market. The dummy for non-Italian banks is almost exactly \(-1\) for all three models. This effect has to be seen in relation to the other parameter estimates. For instance, for the Negative Binomial distribution, this would imply an expected degree of 72.60 for Italian banks in 1999 (from \(\beta _0=4.287\)), while non-Italian banks would be expected to only have 25.61 links on average. This expectation does not exist in the case of the Poisson–Pareto as the estimated tail index of the Pareto is \(\alpha =0.80\). In contrast, the discrete Pareto distribution would indicate existence of the first moment. It seems remarkable that despite the high level of overdispersion, the fat-tailed Poisson–Pareto and discrete Pareto distributions are both inferior to the Negative Binomial.

As also indicated in Table 2, likelihood ratio tests clearly reject restricted models without all dummies for all distributions that have been used in these regressions. Table 3 shows results of a number of additional tests: The Poisson, which is nested in the Negative Binomial, is rejected at all traditional levels of significance. Further, a sequence of Vuong tests (Vuong 1989) shows that the Negative Binomial significantly outperforms the Poisson–Pareto and discrete Pareto, and the same is obtained for the Poisson–Pareto against the discrete Pareto. When adjusting the Vuong test for the difference in estimated parameters (18 in the case of the Negative Binomial and Poisson–Pareto against only one for the simple discrete Pareto; see Footnote 2), only the advantage of the Negative Binomial remains, while the parsimonious discrete Pareto would be preferred under this criterion to the Poisson–Pareto with exogenous factors. We have finally compared the Negative Binomial and Poisson–Pareto without exogenous factors to the discrete Pareto (keeping only parameters \(\theta \) and \(\mu \) or \(\alpha \) and \(\mu \), respectively) and find the Negative Binomial and the Poisson–Pareto to appear still superior to the discrete Pareto. Since here the first two alternatives still enjoy the advantage of one more parameter, the adjusted version of the Vuong test can also be applied, which leaves the pattern of dominance unchanged.
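
For readers unfamiliar with the Vuong test, the following sketch computes the statistic from the pointwise log-likelihood contributions of two competing models. The function names are hypothetical, and the two corrections offered for the difference in the number of parameters (an AIC-type and a Schwarz-type penalty, both discussed by Vuong 1989) are assumptions: the text above does not specify which adjustment underlies the reported results.

```python
import math

def vuong_statistic(loglik_a, loglik_b, k_a=0, k_b=0, criterion="none"):
    """Vuong z-statistic for two non-nested models; positive values favour A.

    loglik_a, loglik_b: log-likelihood contribution of each observation
    k_a, k_b: number of estimated parameters of models A and B
    criterion: "none", "aic" (penalty k_a - k_b) or
               "bic" (penalty 0.5 * (k_a - k_b) * log(n))
    """
    n = len(loglik_a)
    if n != len(loglik_b):
        raise ValueError("both models need one contribution per observation")
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    lr = sum(diffs)
    mean = lr / n
    omega2 = sum((d - mean) ** 2 for d in diffs) / n  # variance of differences
    if criterion == "aic":
        lr -= k_a - k_b
    elif criterion == "bic":
        lr -= 0.5 * (k_a - k_b) * math.log(n)
    elif criterion != "none":
        raise ValueError("unknown criterion")
    return lr / math.sqrt(n * omega2)

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, used as one-sided p-value."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Under the null hypothesis that both models are equally close to the data, the statistic is asymptotically standard normal, so large positive values favour model A and large negative values favour model B.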

Table 3 Specification tests for e-MID data

4.2 Spanish bank-firm credit network

We now move on to the analysis of the degree distributions extracted from the bank-firm credit network for the Spanish economy over the years 1997–2008. Since this is a bipartite network, it allows us to investigate degree distributions from different perspectives: Tables 4 and 5 depict the results for the degree distribution of banks within the bipartite network, while Tables 6 and 7 exhibit the results obtained for banks from the so-called one-mode projection of the original data set. Tables 8 and 9 present results for the degree distributions of firms from the original bipartite adjacency matrix. The one-mode projection for firms is less interesting as the number of joint lenders assumes very small values throughout.
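The three perspectives can be illustrated on a small example. The sketch below assumes that a bank's ‘co-lending’ degree in the one-mode projection counts the other banks with which it shares at least one borrower; this definition, the function names and the toy matrix are assumptions made for illustration and are not taken from the data.

```python
def bank_degrees(adj):
    """Degrees of banks (rows) in a bipartite bank-by-firm 0/1 adjacency matrix."""
    return [sum(row) for row in adj]

def firm_degrees(adj):
    """Degrees of firms (columns), i.e. the number of lenders per firm."""
    return [sum(col) for col in zip(*adj)]

def co_lending_degrees(adj):
    """One-mode projection onto banks: for each bank, the number of other
    banks sharing at least one borrower with it (assumed definition)."""
    borrowers = [{j for j, x in enumerate(row) if x} for row in adj]
    return [
        sum(1 for k, other in enumerate(borrowers) if k != i and mine & other)
        for i, mine in enumerate(borrowers)
    ]

# Hypothetical toy network: 3 banks (rows) lending to 4 firms (columns).
toy = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
toy_bank_deg = bank_degrees(toy)        # bipartite degrees of banks
toy_firm_deg = firm_degrees(toy)        # bipartite degrees of firms
toy_co_lending = co_lending_degrees(toy)  # banks' degrees in the projection
print(toy_bank_deg, toy_firm_deg, toy_co_lending)
```

Only the first two banks share a borrower, so they are linked in the projection, while the third bank remains isolated despite having a positive bipartite degree.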

Table 4 Spanish Bank-Firm Credit, 1997–2008, degrees of banks with dummies for years and type of banks
Table 5 Specification tests for fitted degree distributions of banks in Spanish credit network
Table 6 Spanish Bank-Firm Credit, 1997–2008, degrees of banks from one-mode projection
Table 7 Specification tests for the degree distributions of banks ‘co-lending’ degree obtained from the one-mode projection of the bipartite Spanish firm network
Table 8 Pooled Spanish Bank-Firm Credit, 1997–2008, degrees of firms
Table 9 Specification tests for fitted degree distributions of firms in Spanish credit network

Here, we use the original yearly records as basic input and merge them into one data set, allowing, however, both for differences in the mean in each year and for differences in the mean due to the category of banks in our data set. These categories are commercial banks, savings banks and credit cooperatives. Credit cooperatives, in particular, are typically very small, local institutions which only provide credit to very few borrowers. Not accounting for their different behaviors would certainly introduce an element of misspecification into any statistical model of network links. We take commercial banks as the default case and introduce dummies with coefficients \(\gamma _{sb}\) and \(\gamma _{cc}\) for savings banks and credit cooperatives, respectively.

Starting with the degree distribution of credit relationships of banks to non-financial firms, we have already noted that this distribution is characterized by an even larger degree of overdispersion (\(\sim \) 13,900) than the interbank network. All dummies except for the first two years are significant under the Poisson and Negative Binomial, and the pertinent coefficients are again very similar for these two distributions. This data set shows an increase in activity over time, which squares with the deregulation of the Spanish banking sector and particularly the regional expansion of activity of savings banks. The dummies for bank categories show a slightly negative effect for savings banks that is significant only under the Poisson model and a much stronger negative effect for credit cooperatives that is significant in both the Poisson and Negative Binomial models. To get a feeling for the relevance of the coefficients, note that the average degree of commercial banks in the first year, 1997, would have been 427 under the Negative Binomial, while the dummy coefficient of \(-3.29\) reduces this number to only \(\sim 16\) for credit cooperatives. Results for the coefficients of exogenous variables differ under the Poisson–Pareto model. In particular, while a roughly positive time trend is found, the coefficients for the two categories are positive rather than negative, which, particularly for credit cooperatives, contradicts the basic features of the data. Our conjecture is that, due to the pronounced fat-tailedness of the estimated mixture distribution (\(\alpha =0.33\)), the non-stationarity of the resulting model makes identification of exogenous effects very hard, as with such a tail index the realizations of the process would be expected to show immense variation anyway.
While again the likelihood ratio tests indicate that the dummies are jointly significant for all candidate distributions, the improvement in the fit of the Poisson–Pareto obtained by inclusion of covariates appears quite small in absolute terms compared to the other cases. This underscores that many of these dummies do not contribute much in this particular case.

The discrete Pareto, in contrast, would again indicate finiteness of the mean, and its estimated shape parameter \(\alpha =1.21\) is very close to the one obtained in Table 2 for the degree distribution of interbank credit data. While the Vuong test in Table 5 indicates superiority of the Negative Binomial when the likelihood values are compared without adjustment, adjustment for the number of parameters turns the comparison upside down in favor of the discrete Pareto. This changes, however, if the dummies are discarded and only the two shape parameters of the Negative Binomial are used. Hence, the two-parameter baseline Negative Binomial provides a better fit than the discrete Pareto, so that the result for the adjusted Vuong test might indicate an excessive adjustment in the case with 15 parameters (or that the parameters for the covariates contribute relatively little in this case). The same applies to the comparison between Poisson–Pareto and discrete Pareto. The outcome of the adjusted Vuong test between the Negative Binomial or Poisson–Pareto and the discrete Pareto might also seem awkward, as it suggests neglecting known heterogeneity within our data set. This shows that strong heterogeneity within a data set could, in principle, lead to a spurious fit of a homogeneous power-law distribution.

Table 6 turns to the results for banks’ ‘co-lending’ degrees from the one-mode projection. One surprising finding here is that the time dummies are almost never significant under any of the three regression models. Hence, despite the structural changes of the banking sector and its credit relationships to non-financial firms in this period, banks do not seem to have become generally more connected via joint exposures to the same borrowers. In contrast, the dummies for bank categories are both significant and have identical signs under all three regression models: While savings banks are more connected with other banks via joint borrowers, cooperatives are much less connected. The former finding squares with the observation that savings banks often came in as additional providers of credit to certain firms during the time of their regional expansion (cf. Illueca et al. 2014). Again, we find the Negative Binomial to perform best, followed by the Poisson–Pareto, the one-parameter discrete Pareto and finally the Poisson regression. The first ranks are relatively close so that the choice between Negative Binomial and discrete Pareto depends on whether one uses the adjusted version of the Vuong test or not (cf. Table 7). Using only two parameters, the decision is always in favor of the Negative Binomial (which seems plausible given that most of the regression parameters are not significant). Similarly, the Poisson–Pareto dominates the discrete Pareto under the non-adjusted Vuong test, but this result changes under adjustment for the number of parameters. The same applies to the two-parameter Poisson–Pareto without exogenous factors. These patterns are also preserved if we include the dummies for the type of financial institution (indicated by the addition “4 params” in Table 7).
As for parameter estimates, it is interesting to note that the tail indices of both the Poisson–Pareto and discrete Pareto are extremely close to their counterparts in Table 2 (while differing between the two models).

Tables 8 and 9 exhibit the results for the degree distribution of firms from the original adjacency matrices of the bank-firm credit network. One notes that these data are characterized by a variance smaller than the mean, i.e., underdispersion. Hence, there would not necessarily be a reason to turn to fat-tailed alternatives to the Poisson distribution. Still, we find the Negative Binomial and Poisson–Pareto fitting the data significantly better than the Poisson. Here we have only used time dummies as the bank categories obviously cannot be brought in directly (one could, however, use them to test whether the type of creditor banks would make a difference). In all three regression models, time dummies show a monotonic decrease, i.e., firms have decreased their average number of creditors (from 1.65 in 1997 to 1.16 in 2008 according to the results of the Poisson model). At first view, this seems a surprising result as the geographic expansion of savings banks has often led to additional lenders coming in for single firms. One reason for the overall negative trend could be that the general increase in the number of registered firms over this period of strong growth of the Spanish economy has brought many new firms into the database that initially started out with a single lender and, thus, had a dampening effect on the average. In the absence of overdispersion, in fact, the present models basically capture time variation of a narrow distribution of relatively small entries. Note that the Poisson beats the Negative Binomial in this case if no regression parameters are included (second line of Table 9), but falls behind the other alternatives when covariates are added. From these test results, it appears that both the Negative Binomial and Poisson–Pareto models provide more flexibility in including effects of time-varying covariates.
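The distinction between under- and overdispersion can be checked with a simple index. The sketch below assumes that dispersion is measured by the variance-to-mean ratio, which equals one for a Poisson distribution; the measure used earlier in the paper may be defined differently. Both degree samples are hypothetical toy data and `dispersion_index` is an illustrative name.

```python
from statistics import fmean, pvariance

def dispersion_index(degrees):
    """Variance-to-mean ratio: 1 for Poisson, < 1 underdispersed, > 1 overdispersed."""
    mean = fmean(degrees)
    if mean == 0:
        raise ValueError("mean degree is zero; index undefined")
    return pvariance(degrees) / mean

# Hypothetical firm-like sample: most firms have a single lender.
firm_like = [1, 1, 1, 2, 1, 2, 1, 1]
# Hypothetical bank-like sample: one very large lender among small ones.
bank_like = [1, 2, 3, 400]

firm_index = dispersion_index(firm_like)
bank_index = dispersion_index(bank_like)
print(firm_index, bank_index)
```

A narrow sample concentrated on one or two lenders yields an index well below one, whereas a single very large node is enough to push the index far above one.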

4.3 Japanese bank-firm credit network

Results from our third data set, the network of bank-firm credit in the Japanese economy, are provided in Tables 10 through 15. Since this record covers a span of more than thirty years, we have abstained from using annual dummies. Instead, we distinguish between three historical episodes as potential candidates for fixed effects: the time up to the climax of the Japanese bubble, the more stagnant period afterward and the recent crisis period. Hence, we impose dummies for the years 1990–2007 and 2008–2011, respectively. In addition, similarly to the Spanish data set, we can distinguish between different categories of financial institutions. We take large private banks (labeled ‘city banks’ in our data set) as the default category and define as a second category that of regional banks (those designated explicitly as only regionally active banks in the data set as well as those identified as Shinkin banks, which are regionally operating credit cooperatives). As a third category, we combine insurance companies active in the lending market with so-called long-term credit banks. Although the two are identified as different classes in our database, both of them should be more active as long-term lenders pursuing business models different from those of ‘city banks’ (see footnote 3).

Table 10 Pooled Japanese Bank-Firm Credit, 1979–2011, degrees of banks. Time dummies are for the years 1990–2007 and 2008–2011

Since the Japanese data come with official balance sheet information at the end of the fiscal year (31 March), we can also add node-specific financial information. In the analysis of banks’ degree distributions, the following statistics have been used: the ratio of deposits over total assets, the ratio of equity over total assets, and the ratio of net income over total assets. For all three covariates, we could argue that a positive effect should be expected: Banks with a higher deposit base should be able to extend more credit, and those with a higher equity base should be more attractive as lenders. Similarly, higher net income should provide more scope for additional lending activity. There is unfortunately a mismatch between the reporting of credit links over calendar years and the Japanese fiscal year, which ends after the first quarter of the calendar year. Since the fiscal-year report covers three quarters of the previous calendar year, we have matched the balance sheet data with the lending activity of the previous year. In contrast to the Spanish data set, we also have more detailed information on firms (including balance sheet information), which we use here only for a binary classification: In the Japanese case, our selection of non-financial firms as recipients of loans covers only those companies that are listed on an official exchange. While the range of companies covered in this way was relatively constant from 1979 through 1995, the establishment of the new market and its index JASDAQ has greatly expanded the scope of the database as of 1996. It is questionable whether the firms operating in the new market share the structure of loan relationships of the old industries, and so it appears sensible to distinguish between both groups. To do so, we impose a dummy for firms listed in JASDAQ as well as in its later replacement called Hercules. Again, we apply the same chain of model estimations and specification tests as for the other data.
Tables 10 and 11 provide results for the degrees of banks in the original bipartite networks, Tables 12 and 13 those for the one-mode projection for banks’ co-lending relationships and Tables 14 and 15 those for the degrees of firms.
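The construction of the fixed effects and the alignment of balance sheets with lending years can be summarized in a short sketch. The period boundaries follow the table captions (1990–2007 and 2008–2011, with 1979–1989 as benchmark); the function names and the listing labels passed to `new_market_dummy` are hypothetical and do not reflect the actual coding of the database.

```python
def period_dummies(year):
    """Return (d_1990_2007, d_2008_2011); the benchmark 1979-1989 gives (0, 0)."""
    return (int(1990 <= year <= 2007), int(2008 <= year <= 2011))

def lending_year_for_balance_sheet(fiscal_year_end):
    """Calendar year of lending matched to a balance sheet dated 31 March.

    A fiscal year ending on 31 March of year t covers three quarters of
    year t - 1, so it is paired with the credit links of year t - 1.
    """
    return fiscal_year_end - 1

def new_market_dummy(listing):
    """1 for firms listed in the new market segments (labels are assumed)."""
    return int(listing in {"JASDAQ", "Hercules"})

examples = [period_dummies(y) for y in (1985, 1995, 2009)]
matched_year = lending_year_for_balance_sheet(2000)
print(examples, matched_year, new_market_dummy("JASDAQ"))
```

For instance, a balance sheet dated 31 March 2000 would be paired with the credit links recorded for 1999.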

Table 11 Specification tests for the fitted degree distributions of banks from the Japanese credit network
Table 12 Pooled Japanese Bank-Firm Credit, 1979–2011, degrees of banks from one-mode projection. Time dummies are for the years 1990–2007 and 2008–2011
Table 13 Specification tests for banks’ ‘co-lending’ degrees obtained from the one-mode projection of the bipartite Japanese credit network
Table 14 Pooled Japanese Bank-Firm Credit, 1979–2011, degrees of firms. Time dummies are for the years 1990–2007 and 2008–2011
Table 15 Specification tests for the fitted degree distributions of firms in the Japanese credit network

Starting with the degrees of banks in the bipartite network, we find somewhat different results for the fixed effects across models: The Poisson model indicates a slight increase in links in 1990 to 2007 and a smaller positive effect (against the benchmark of 1979 to 1989) thereafter. The Negative Binomial has no significant time dummies, and the Poisson–Pareto only diagnoses a significantly positive effect in the first period. Regionally active banks have distinctly fewer links, leading to a significantly negative dummy for this category throughout. The dummy for insurance companies/long-term credit banks also indicates that they have somewhat smaller degrees than city banks. For the balance sheet statistics, the deposit-to-asset ratio enters with a significantly negative coefficient in all models, which indicates that the most active banks are not those that rely heavily on deposits for funding. In contrast, the equity-over-asset ratio positively predicts the number of credit links, while the models are split in terms of the sign of the net-income-to-assets ratio. However, a glance at the likelihood ratio tests indicates that the balance sheet variables contribute only a small part to the overall explanatory power: The huge statistics that we get when testing for joint significance of all covariates are reduced to comparatively small (though significant) numbers when testing only for the significance of the three balance sheet variables.

In terms of shape parameters, we find that the Poisson–Pareto again has a very small tail index \(\alpha \) close to 0.6, which would indicate nonexistence of the theoretical mean, while the discrete Pareto again yields a value of about 1.2 (close to the pertinent results for the Spanish credit network). As we see in Table 11, the dominating model is again the NBD model, which is preferred over all others independently of whether dummies are included or not. The Poisson–Pareto is also always preferred over the discrete Pareto irrespective of whether dummies are included and whether the Vuong test is adjusted for the different number of parameters.
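The link between a tail index below one and the missing mean can be made explicit. The following is a sketch under the assumption of a continuous Pareto mixing density with scale \(\lambda _{\min }\) for the Poisson intensity \(\lambda \) (the symbol \(\lambda _{\min }\) is introduced here for illustration, and the parametrization of Sect. 2 may differ). Since the mean of a Poisson mixture equals the mean of the mixing distribution, the expected degree \(k\) satisfies

\[
E[k] = E\big[E[k \mid \lambda ]\big] = E[\lambda ] = \int _{\lambda _{\min }}^{\infty } \lambda \, \alpha \lambda _{\min }^{\alpha } \lambda ^{-\alpha -1}\, d\lambda =
\begin{cases}
\dfrac{\alpha \lambda _{\min }}{\alpha -1}, & \alpha > 1,\\[2ex]
\infty , & \alpha \le 1.
\end{cases}
\]

With \(\alpha \) close to 0.6, the integral diverges, so no finite expected degree exists under this specification. A tail index of about 1.2, as estimated for the discrete Pareto, lies on the convergent side, in line with the finite mean noted for that model.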

In the one-mode projection of the banking sector (Table 12), we find a clearly significant negative effect for the period 2008–2011, and a smaller, but also significant tendency of reduced co-lending in 1990 to 2007 compared to the years before. The regional banks are found to be less connected than city banks in all specifications, and a similarly significant effect holds for the third category of insurance companies and long-term credit banks. While all these effects are uniformly found for all models, for the balance sheet variables only the deposit ratio and the net income ratio are significant across all models: The former again has a negative effect, while the latter comes with a positive coefficient. Equity is only significant in the Poisson and Poisson–Pareto models. As in the bipartite network, the joint contribution of the balance sheet effects is significant, albeit much smaller than the contribution of the remaining covariates. We thus find supportive evidence only for a positive effect of the equity base and net income on lending activity, while the negative relationship with banks’ deposit base appears counterintuitive and calls for further detailed analyses. The ranking of models is the by now ‘usual’ one with NBD dominating the Poisson–Pareto and discrete Pareto and all three of them clearly outperforming the simple Poisson model. Specification tests in Table 13 show that the differences in likelihoods are reflected in a preference for the ‘better’ performing Negative Binomial model according to the Vuong test at all traditional significance levels. The tail index of the discrete Pareto is again remarkably close to its counterpart in the one-mode projection of the Spanish credit network, but the tail shape parameter for the Poisson–Pareto comes out much higher.

Finally, Tables 14 and 15 provide results for the degree distribution of non-financial firms receiving loans from the Japanese banking sector. We first note that the mean degree of 9.47 is much higher than the pertinent value for the Spanish firms. The reason is likely that the restriction to publicly listed entities in the Japanese data set leads to a selection of relatively large firms. In contrast, the Spanish data set is based on the Spanish commercial register and, thus, provides a much broader, nearly comprehensive sample of the population of firms operating in the Spanish economy. In contrast to the Spanish case, the degree distribution is also characterized by overdispersion, justifying the estimation of our various fat-tailed alternatives to the elementary Poisson distribution. In all models, the dummies for the stagnation and crisis periods as well as the one for firms listed in the new market are significantly negative. Hence, the number of credit links per firm decreased first after the burst of the Japanese bubble and even more so after the onset of the worldwide financial crisis. For instance, under the estimated parameters for the Negative Binomial, the average number of links has decreased from 13.3 in the first period via 9.5 during the stagnation to 6.5 in the post-crisis years. At the same time, an average firm from the new market that started during the second period would have received credit simultaneously from 4.1 banks.

In the specification tests, we see a somewhat unusual outcome as the Poisson–Pareto dominates the Negative Binomial. While the margin is small, the difference is significant at all traditional significance levels. However, the estimated parameters for the Poisson–Pareto are also unusual as with an estimated \(\alpha = 1.915\), its shape parameter is much higher than the estimates for the banks’ degree distribution in the bipartite network. While Spanish firm degrees do not exhibit overdispersion, in the case of Japanese firms, the tail shape estimate of about 2 still indicates some degree of tail fatness which among our selection of models is best captured by the Poisson–Pareto. The discrete Pareto has a more typical parameter estimate (\(\alpha = 1.403\)) but is dominated under all perspectives by both the Poisson–Pareto and the Negative Binomial irrespective of whether fixed effects are included or not.

5 Conclusion

Our analysis has demonstrated that heterogeneity is pervasive in the degree distributions extracted from credit networks of different origin. Almost all time dummies and fixed effects for different categories of actors that we have included in our estimations turned out to be highly significant. Hence, we can safely conclude that the structure of network formation in the markets under consideration has changed over time and that different types of actors behave in different ways in these markets. It would therefore be misleading to model the degree distribution of a financial network with any specific unconditional distribution without taking into account the heterogeneity of the data. When accounting for such known sources of heterogeneity, we find in five cases (banks’ degrees in the interbank market as well as in the Spanish and Japanese loan markets and their co-lending degrees in the same markets) a clear dominance of the Negative Binomial model. The same applies to firms’ degree distribution in the Spanish loan market whereas the Japanese firm degree distribution provided the only case of a dominance of the Poisson–Pareto distribution (which in all other cases was inferior to the Negative Binomial).

If we neglect heterogeneity, we often find the Negative Binomial and the Poisson–Pareto in the vicinity of the one-parameter discrete Pareto distribution. What is more, the discrete Pareto yields estimates across all our seven empirical degree distributions that lie within a very narrow range (all between 1.2 and 1.4, with the sole exception of the Spanish firm degrees). The apparently good fit of power laws that had been reported in previous publications might, then, be an artifact of lumping together different categories of nodes. If a sample contains different classes of agents with different orders of magnitude of links, it might erroneously lead to the impression of a very fat-tailed unconditional distribution. As shown in Sect. 3, even the inspection of the cumulative distributions in a log–log plot already provides evidence against a Pareto law characterizing the degree distributions of the data under scrutiny. The absence of scaling behavior in the tails or any intermediate region of the cumulative distribution underscores that any previously reported evidence for power laws in financial degree distributions could be a spurious outcome caused by neglected heterogeneity in the data. Taking the heterogeneity documented in this paper into account in network studies of contagious defaults should also improve our assessment of the risk of systemic disturbances in loan markets.
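How lumping together different classes of nodes inflates the dispersion of the pooled sample can be illustrated with the law of total variance. The sketch below takes the two group means reported in Sect. 4.2 for commercial banks and credit cooperatives (427 and roughly 16). Equal group shares and a Poisson distribution within each group are purely hypothetical assumptions chosen for illustration, and `mixture_mean_and_variance` is an illustrative name.

```python
def mixture_mean_and_variance(weights, means):
    """Mean and variance of a pooled sample of Poisson groups.

    Law of total variance: Var = E[Var within groups] + Var[group means].
    For Poisson groups the within-group variance equals the group mean.
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    mean = sum(w * m for w, m in zip(weights, means))
    within = mean  # E[Var | group] = E[group mean] = pooled mean
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within + between

# Group means from Sect. 4.2; the 50/50 split is a hypothetical assumption.
pooled_mean, pooled_var = mixture_mean_and_variance([0.5, 0.5], [427.0, 16.0])
pooled_dispersion = pooled_var / pooled_mean
print(pooled_mean, pooled_var, pooled_dispersion)
```

Although each group is equidispersed by construction, the pooled sample has a variance far above its mean. An unconditional distribution fitted to the pooled data must absorb this dispersion in its tail, which is the mechanism behind the spurious fat tails discussed above.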