Introduction

The study of topological properties of complex systems such as economic networks has become very popular. This growing interest may be observed not only in economics and finance but also in natural sciences or medicine.

Our main aim in this paper is to investigate the similarities among world stock market indices. Each index is represented as a node in a network whose edge weights are derived from pairwise cross-correlations. The complete network, however, contains too much information to be interpreted easily, so we need to filter it, i.e. to remove the weakest links. As the filtering algorithm, we utilize the minimum spanning tree (MST) approach. After constructing the MST, we determine the stability of its links. Finally, we determine which node is central in the MST network; in other words, we try to answer the question of which stock market index plays the role of a hub in the network.

In this paper, we utilize the methodology provided by Mantegna (1999) for the construction of the MST. The MST construction is then accompanied by link stability analysis described in Tumminello et al. (2005) in more detail.

A milestone in the research of topological properties of stock markets is the early work of Mantegna (1999), which analyses the hierarchical structure of stocks traded on the US stock exchanges. Mantegna analyses the portfolios of stocks used to compute the S&P 500 and DJIA indices. He finds that the investigated US stocks cluster according to the industry sector they belong to.

Bonanno et al. (2000) study the links between different world economies by analysing the time series of sets of stock market indices. Bonanno et al. (2001) find that the pairwise cross-correlations between stock returns vary with the time horizon.

Although the vast majority of studied stock markets are American or British, there are also papers focusing on other markets. For instance, Sandoval (2012) studies topological properties of the Brazilian stock market, and Situngkir and Surya (2005) investigate what different lengths of the MST imply. Situngkir and Surya (2005) claim that the total length of the MST varies over time. They also find that a greater length of the MST indicates stock market stabilization after a preceding monetary crisis.

Tabak et al. (2010) investigate topological properties of commodities markets. They find that commodities form clusters according to the sector they belong to. Methodology similar to the one used in Mantegna (1999) is also often applied to the study of volatility of stock returns, e.g. Micciche et al. (2003).

The paper is structured as follows. Data and Methodology briefly reviews the methodology utilized in the paper and describes the data, the subsequent data transformations and the statistical tests. Results and Discussion presents the results of our analysis. Finally, Conclusion summarizes the paper.

Data and Methodology

We analyse the price evolution of the 21 most important global stock market indices. The analysed indices, identified by their ticker symbols, are presented in Table 1. All data were downloaded from Yahoo Finance.

Table 1 Analysed world stock market indices

Our data span the period from January 2007 to February 2017, altogether 2405 daily observations.

Let \( P_i(t) \) denote the adjusted closing price of the i-th index at time t, t = 1, …, T, where T corresponds to the last observed time period. Logarithmic daily returns are then expressed as follows:

$$ {r}_i(t)\equiv \ln {P}_i(t)-\ln {P}_i\left(t-1\right),\kern0.75em t=2,3,\dots, T. $$
(1)

Since our subsequent analysis requires stationary data, we work not with the highly nonstationary adjusted closing prices but with the logarithmic returns defined in Eq. 1, as is common in the literature.
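The transformation from prices to returns can be sketched as follows. This is a minimal illustration only, assuming logarithmic returns are computed as differences of log prices; the function name `log_returns` and the price values are hypothetical and are not taken from the analysed data.

```python
import math


def log_returns(prices):
    """Return r(t) = ln P(t) - ln P(t-1) for t = 2, ..., T."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical adjusted closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
returns = log_returns(prices)

# A series of T prices yields T - 1 returns.
print(len(prices), len(returns))
```

A convenient property of logarithmic returns is that they add up over time: the sum of the daily returns equals the logarithmic return over the whole period.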

The similarities in a group of stock market indices are commonly expressed using pairwise cross-correlations. The pairwise cross-correlation coefficient \( {\rho}_{ij} \) for a pair of indices i and j with logarithmic returns \( {r}_i(t) \) and \( {r}_j(t) \), respectively, t = 2, …, T, can be calculated as follows:

$$ {\rho}_{ij}=\frac{\left\langle {r}_i(t){r}_j(t)\right\rangle -\left\langle {r}_i(t)\right\rangle \left\langle {r}_j(t)\right\rangle }{\sqrt{\left\langle {r}_i^2(t)-{\left\langle {r}_i(t)\right\rangle}^2\right\rangle \left\langle {r}_j^2(t)-{\left\langle {r}_j(t)\right\rangle}^2\right\rangle }} $$
(2)

where the symbol 〈∙〉 stands for the average over the studied period T. The correlation coefficients range between −1 (perfect anti-correlation) and 1 (perfect correlation). When \( {\rho}_{ij}=0 \), the pair of indices is uncorrelated. Since we consider n stock market indices, we finally obtain a square correlation matrix \( C={\left({\rho}_{ij}\right)}_{i,j=1,\dots, n} \) of size n.

Using a nonlinear transformation proposed by Mantegna (1999), the cross-correlation coefficients are then transformed into distance coefficients \( {d}_{ij} \) as follows:

$$ {d}_{ij}=\sqrt{2\left(1-{\rho}_{ij}\right)} $$
(3)

The distance coefficients range between 0 (perfect correlation) and 2 (perfect anti-correlation). When \( {d}_{ij}=\sqrt{2} \), the pair of indices (i, j) is uncorrelated. We thus arrive at a square symmetric distance matrix \( D={\left({d}_{ij}\right)}_{i,j=1,\dots, n} \) with zeros on the diagonal.
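The two steps above, correlation (Eq. 2) followed by the distance transformation (Eq. 3), can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code used for the paper: the helper names (`mean`, `correlation`, `distance`, `distance_matrix`) and the three short return series are hypothetical, and the clamp inside `distance` is only a guard against rounding error.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def correlation(x, y):
    """Cross-correlation coefficient written as in Eq. 2 (sample averages)."""
    mx, my = mean(x), mean(y)
    cov = mean([a * b for a, b in zip(x, y)]) - mx * my
    var_x = mean([a * a for a in x]) - mx * mx
    var_y = mean([b * b for b in y]) - my * my
    return cov / math.sqrt(var_x * var_y)


def distance(rho):
    """Distance of Eq. 3; max() guards against rho marginally above 1."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))


def distance_matrix(series):
    """Symmetric matrix of pairwise distances with zeros on the diagonal."""
    n = len(series)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = distance(correlation(series[i], series[j]))
    return d


# Hypothetical return series, for illustration only.
a = [0.01, -0.02, 0.03, 0.00]
b = [0.02, -0.04, 0.06, 0.00]    # perfectly correlated with a
c = [-0.01, 0.02, -0.03, 0.00]   # perfectly anti-correlated with a
D = distance_matrix([a, b, c])
```

In this toy example the distance between `a` and `b` is approximately 0 and the distance between `a` and `c` is approximately 2, matching the two extremes of the range stated above.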

Now we are able to construct a graph G = (V, E), where V is a set of stock market indices and E is a set of edges between these vertices. The weights of the edges are already provided in the distance matrix D. The graph G, however, contains too much information for an easy interpretation. There are \( \frac{n\left(n-1\right)}{2} \) unique edges between pairs of indices in G.
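As a worked instance of this count, for the n = 21 indices analysed here the complete graph G has

$$ \frac{n\left(n-1\right)}{2}=\frac{21\cdot 20}{2}=210 $$

unique edges, whereas any spanning tree on the same 21 nodes retains only n − 1 = 20 of them.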

To address this problem, we utilize Kruskal’s algorithm for finding the MST of G. The main idea of the MST procedure is that it filters the edges between pairs of indices such that only the most important ones remain.

The algorithm proceeds as follows. First, it finds the edge with the minimum weight and marks it. It then repeatedly selects, among all unmarked edges that do not form a loop with the already marked edges, the one with the minimum weight and marks it. The algorithm stops when the marked edges form a spanning tree, i.e. when the set of marked edges contains n − 1 elements. The MST produced by Kruskal’s algorithm need not be unique: when several edges have equal weights, a number k of different MSTs may exist. The sum of the edge weights in the MST, however, is minimal and therefore the same for all of them, or

$$ \sum \limits_{e\in {MST}_1}w(e)=\sum \limits_{e\in {MST}_2}w(e)=\cdots =\sum \limits_{e\in {MST}_k}w(e) $$
(4)

where \( w:{E}_i\to \left[0,2\right] \) is a weight function that assigns a weight (distance) to every edge e from the set of edges \( {E}_i\subset E \) corresponding to \( {MST}_i \), i = 1, 2, …, k.
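The procedure described above can be sketched as follows. This is a minimal illustration of Kruskal’s algorithm with a union-find structure for loop detection, not the implementation used in the paper; the function name `kruskal_mst` and the 4 × 4 distance matrix are hypothetical.

```python
def kruskal_mst(n, dist):
    """Return the MST edges (i, j, weight) of a complete graph on n nodes."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # All n(n-1)/2 unique edges, sorted by increasing weight.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))

    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # the edge does not close a loop
            parent[ri] = rj       # merge the two components
            tree.append((i, j, w))
            if len(tree) == n - 1:
                break             # spanning tree is complete
    return tree


# Hypothetical distance matrix for four nodes, for illustration only.
D = [
    [0.0, 0.4, 1.1, 1.3],
    [0.4, 0.0, 0.9, 1.2],
    [1.1, 0.9, 0.0, 0.5],
    [1.3, 1.2, 0.5, 0.0],
]
mst = kruskal_mst(4, D)
total = sum(w for _, _, w in mst)
```

For this toy matrix the tree keeps the links 0–1, 2–3 and 1–2, i.e. 3 of the 6 possible edges. Sorting the edges dominates the running time, which is negligible for a network of a few dozen nodes.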

After the MST procedure is completed, we arrive at a connected graph that contains only n − 1 unique edges between pairs of indices and does not contain any loop. We can now analyse centrality measures. The centrality of a particular node reflects the amount of influence this node has in the whole network.

We assess a node’s centrality according to the following four criteria: (1) degree, (2) closeness, (3) betweenness and (4) eigenvector centrality. The formal definitions of these measures can be found in Sandoval (2012); for our purposes, informal definitions suffice. The degree of a node is the number of neighbours this node has in the network. The higher the degree, the more influence the node has in the network. The closeness of a node is based on the average distance from it to all other nodes in the network; it is usually defined as the reciprocal of that average, so that a node lying close to all other nodes scores high. The higher the closeness, the more influence the node has in the network. Betweenness centrality answers the question of whether a particular node must be involved in communication between any other two nodes, in other words whether the node is transitory between other pairs of nodes in the network. If it is, the node is important. Eigenvector centrality refines degree centrality by taking into account not just the number of neighbours but also their own influence in the network.
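Three of these measures can be illustrated on a small tree. The sketch below rests on several assumptions that are not taken from the paper: distances are counted in hops (unweighted), closeness is the reciprocal of the average hop distance, and betweenness is the raw number of node pairs whose unique tree path passes through the node. The node labels A to E and all function names are hypothetical. Eigenvector centrality is omitted for brevity.

```python
from collections import defaultdict, deque


def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: hop counts from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree(adj):
    return {u: len(nbrs) for u, nbrs in adj.items()}


def closeness(adj):
    """Reciprocal of the average hop distance to all other nodes."""
    n = len(adj)
    return {u: (n - 1) / sum(hop_distances(adj, u).values()) for u in adj}


def branch_size(adj, removed, start):
    """Number of nodes reachable from start once `removed` is deleted."""
    seen = {removed, start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) - 1


def tree_betweenness(adj):
    """Pairs of other nodes whose unique tree path passes through each node."""
    n = len(adj)
    all_pairs = (n - 1) * (n - 2) // 2
    result = {}
    for v in adj:
        # Pairs lying inside one branch of v never need to pass through v.
        same_branch = 0
        for nb in adj[v]:
            size = branch_size(adj, v, nb)
            same_branch += size * (size - 1) // 2
        result[v] = all_pairs - same_branch
    return result


# Hypothetical tree, for illustration only.
adj = adjacency([("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")])
deg = degree(adj)
clo = closeness(adj)
btw = tree_betweenness(adj)
```

In this toy tree node A has the highest degree, closeness and betweenness, while the leaves B, C and E have zero betweenness because no path between other nodes passes through them.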

The simplicity of the MST methodology comes at the cost of a serious drawback. We cannot always be sure whether a particular edge in the constructed MST is present because of its relevance for the network or only by coincidence. This edge stability problem can be analysed using the bootstrap technique described in Tumminello et al. (2007).

In this method we first construct the original MST. Then we resample (bootstrap) the original time series with replacement such that the length of the series is preserved. Some observations may thus be repeated, and some may not be present at all in the bootstrapped sample. Next, we construct the MST from the bootstrapped series and record its links. This procedure is repeated 1000 times. The stability of a particular link in the original MST is then the ratio of the number of bootstrapped MSTs in which the link occurs to the number of bootstrap samples, i.e. 1000.
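A compact sketch of this bootstrap procedure is given below. It is an illustration under stated assumptions rather than the original implementation: time indices are resampled jointly for all series (so that the same dates are drawn for every index), the random seed is fixed for reproducibility, and the function names and the synthetic Gaussian series are hypothetical.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mst_links(series):
    """Set of MST links (i, j), i < j, via Kruskal on d = sqrt(2(1 - rho))."""
    n = len(series)
    edges = sorted(
        (math.sqrt(max(0.0, 2.0 * (1.0 - correlation(series[i], series[j])))), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    links = set()
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            links.add((i, j))
    return links


def link_stability(series, n_boot=1000, seed=0):
    """Share of bootstrapped MSTs containing each link of the original MST."""
    rng = random.Random(seed)
    length = len(series[0])
    original = mst_links(series)
    counts = {link: 0 for link in original}
    for _ in range(n_boot):
        # Draw time indices with replacement; reuse them for every series.
        idx = [rng.randrange(length) for _ in range(length)]
        sample = [[s[t] for t in idx] for s in series]
        for link in mst_links(sample) & original:
            counts[link] += 1
    return {link: c / n_boot for link, c in counts.items()}


# Synthetic data, for illustration only: series 0 and 1 share a common factor.
rng = random.Random(1)
common = [rng.gauss(0, 1) for _ in range(200)]
series = [
    [c + 0.1 * rng.gauss(0, 1) for c in common],
    [c + 0.1 * rng.gauss(0, 1) for c in common],
    [rng.gauss(0, 1) for _ in range(200)],
    [rng.gauss(0, 1) for _ in range(200)],
]
stability = link_stability(series, n_boot=200)
```

In this synthetic example the link between the two strongly correlated series appears in every bootstrapped MST, whereas the links to the two independent series are expected to be less stable, which mirrors the interpretation of stable and unstable links used in the paper.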

Results and Discussion

In this section we present and comment on the results of the MST for the studied network of stock market indices. In Fig. 1 we illustrate the MST for stock market indices together with stability of the links.

Fig. 1 MST of stock market indices network (Source: Author’s computations)

We detected three main clusters of stock market indices:

  1. Asia/Pacific

  2. Europe

  3. USA + Americas

Therefore, we may argue that the stock market indices cluster according to geographical location, which is the main finding of this paper. We also found that the large Asia/Pacific cluster can be further divided into two smaller subclusters. One of them consists of N225, AORD, TWII and KS11, i.e. the Japanese, Australian, Taiwanese and South Korean indices, respectively. We may argue that this subcluster is composed of indices of more “western-style-oriented” countries, although the definition of a “western-style” country is not very rigorous.

Regarding the stability of the edges, several findings are worth noting. First, the stabilities of the links among stock market indices within all three clusters are very high, often as high as 100%. This again indicates strong interdependence between indices situated in similar geographical locations. Second, the stability of the transitory edges between the three main clusters, i.e. HSI – ATX and GDAXI – GSPC, is very poor. Thus we may argue that the relations and similarity between indices from different geographical areas/clusters are not that high. Based on these findings, we may argue that globalization or integration is very high at the regional level (in terms of continents), but that there is still room for further stock market integration between individual continents (with the exception of USA + Americas). This might suggest a possibility for international diversification when investing in stocks, i.e. the similarity between stock market indices from different clusters/regions has not reached the level of similarity observable between indices within a particular cluster/region.

In order to determine the most influential nodes in the network (MST), we utilized the centrality measures described in Data and Methodology. The results are reported in Table 2 in the Appendix. Based on degree centrality, there is one central stock market index in every cluster (geographic area): for Asia/Pacific it is HSI (Hang Seng Index), for Europe it is FCHI (CAC 40), and for USA + Americas it is GSPC (S&P 500). We can also observe the relative importance of KS11 (KOSPI Composite Index) in the Asia/Pacific cluster, which again relates to the possible existence of two main subclusters in the overall Asia/Pacific region.

Using betweenness centrality, we can draw several conclusions. The two most transitory indices in the network are FCHI and HSI, which further supports the hypothesis that these two indices are the most important in the network. The relative importance of ATX (Austrian Traded Index), GDAXI (DAX 30) and GSPC is also apparent from Table 2. However, the importance of KS11 as a means of communication between other indices is not supported by betweenness centrality. Therefore, we can think of KS11 only as a local centre, not a global one.

Finally, eigenvector centrality further supports the local dominance of ATX in the Europe cluster (apart from FCHI). HSI can be thought of as the index with the largest number of highly influential neighbours in the network. The results for closeness centrality further indicate the dominant role of the FCHI and HSI indices in the network.

To sum up the centrality measures analysis, we found that

  1. Every cluster/region has its own hub: FCHI (Europe), HSI (Asia/Pacific) and GSPC (USA + Americas).

  2. FCHI and HSI can be thought of as hubs for the entire network, i.e. these are the most influential indices in the network (world).

  3. American indices (mainly GSPC) do not play such an overwhelming role as is usually believed (based on our analysis).

  4. European indices (ATX, FCHI, GDAXI) are transitory indices, i.e. they are very open and needed in the communication between other global stock market indices.

  5. Indices from the Americas region are situated in the periphery of the network without much communication and influence of global importance.

Conclusion

We analysed the relationships between world stock market indices in the period 2007–2017 using the MST approach. We detected three large clusters of indices based on the geographical location of the underlying indices. We found evidence of strong stock market integration at the level of continents, while noting that there is still room for further integration at the global level. This might suggest a possibility for international diversification when investing in stocks. The clustering into three large clusters may not be based solely on geographic location but also on culture and investment style in these locations. Whether this is the case, however, our analysis cannot answer.

We detected strong interdependence between the US and Americas stock market indices, but we argued that the global importance of the Americas indices is low.

Using the centrality measures analysis, we argued that every region/cluster has its own hub: HSI for Asia/Pacific, FCHI for Europe and GSPC for USA + Americas. We also argued that, based on our analysis, the most important stock market indices on the global scale are HSI and FCHI.

The added value of this paper for the investment community is straightforward. Since we are able to describe both visually and quantitatively the hierarchical structure of the stock market indices network, we are also able to detect the indices situated in the periphery of this network. Following the results of Pozzi et al. (2013), we may thus claim that a well-diversified portfolio that effectively reduces investment risk can be built from such indices.

There are many possibilities for further research. The most important is the study of MST dynamics, which requires observing how the MST changes across different time windows. This direction is motivated by our assumption that the MSTs would differ if we analysed only the precrisis, crisis or postcrisis period. Such research could suggest whether any particular index set out on a different trajectory of closing price evolution owing to the postcrisis stock market consolidation.