1 Introduction

1.1 Background

In today’s era of economic volatility, staying abreast of market trends is critical for investors seeking to maximize returns and minimize risk. Progress in computing tools, greater data availability, and new statistical approaches have transformed how researchers analyze and interpret market dynamics [1]. Evaluating stock market volatility is a critical component of financial analysis, and conventional measures such as standard deviation and variance have been used extensively for this purpose. Although these measures are based on price data and are effective to a degree, they face significant problems. One limitation is the assumption of constant volatility, which disregards the non-linear nature of financial markets. Furthermore, traditional approaches may not fully consider the interdependence among stocks or market factors. They also fail to account for market events or external shocks that significantly affect volatility. Overcoming these challenges is essential for a more accurate understanding of stock market volatility [2].

Representing the stock market as a network, with each stock as a node and their connections forming the edges, statistical network analysis provides a unique perspective on how the market behaves and shifts over time [3]. This methodology enables researchers to identify obscure patterns, interdependencies, and systemic risk that may not be easily detectable with the traditional time-series techniques [4]. Additionally, network analysis facilitates the identification of crucial market players and clusters of stocks with comparable performance. The quantification of stock linkages and interactions enables researchers to comprehend how information disseminates throughout the market. This knowledge is essential for risk management, asset pricing, portfolio diversification, and market projection. As financial markets grow in complexity and interconnectivity, statistical network analysis plays a critical role in providing a thorough evaluation of stock market volatility [5].

The application of network theory offers a strategy to examine and visualize the interconnections and dependencies among various entities [6]. In the realm of the stock market, this approach facilitates the comprehension of the interactions and influences between individual stocks [7]. By employing nodes to represent stocks and edges to signify their associations, this methodology enables the quantification of the direction and potency of these connections. Applying this strategy yields insights into stock clustering, movement, and mutual impacts, providing researchers with a comprehensive grasp of market dynamics. Scrutinizing stock market attributes, including centrality, clustering coefficients, and community structure, using network theory, allows for a deeper understanding of how the market is arranged and functions [8]. Volatility, which measures the extent and frequency of price fluctuations, encompasses not only the risk linked to individual stocks but also their reciprocal impacts. The measurement of volatility enables investors and analysts to thoroughly assess the magnitude of risk and make informed decisions. To accurately assess and manage market volatility, it is crucial to adopt measurement techniques that are proficient at capturing these interconnections [9, 10].

As financial markets continue to evolve and become more complex, there is an increasing demand for methods that assess stock market volatility precisely and overcome the limitations of traditional measures. In this study, a novel measurement technique that promises to address these shortcomings, partial mutual information-based distance (PMID), is employed. This approach combines concepts from information theory and network analysis to quantify the interdependence between stocks, considering both direct and indirect relationships. By incorporating partial mutual information, which measures the unique information shared between two stocks after accounting for information from other stocks, PMID offers a more nuanced and precise assessment of volatility [11].

This study is organized as follows. The Introduction provides background on the subject, the Motivation section sets out the driving factors of the study, and the Literature Review section assesses prior work. The Methodology section outlines the data transformation process, the distance metric, and network construction. The Results section describes the data splitting and examines network characteristics at the node, cluster, and global levels. The Discussion section explores the implications of the findings and introduces resilience analyses using Markov chain and MCC analysis. The paper closes with a Conclusion section that summarizes the key implications.

1.2 Motivation

This study was prompted by a research gap in the use of variance and standard deviation for measuring market volatility. These measures have limitations that may impede their ability to capture market dynamics effectively. One key limitation is that they assume volatility remains constant over time, overlooking the fact that volatility tends to cluster in markets [12]. This oversimplification ignores volatility clustering and fails to account for sudden bursts or periods of calm in market activity. Additionally, traditional measures of volatility focus primarily on the spread of price returns, disregarding the interdependencies and relationships between assets that can affect volatility. They are therefore not well suited to capturing the interplay of factors and relationships between assets that contribute to risk. Consequently, relying on traditional measures can result in an incomplete understanding of market volatility, which emphasizes the need for more nuanced approaches.

The second reason for conducting this study is the growing importance of research on the Chinese economy, given its global significance. As an economic powerhouse, China’s market dynamics and trends have a substantial impact on the international financial environment. It is therefore imperative to analyze the Chinese stock market thoroughly, particularly during unprecedented events such as the global COVID-19 pandemic. A comprehensive network analysis of the Chinese stock market in this setting can shed light on the intricate dynamics of financial systems and reveal the market’s resilience in the face of adversity. It also presents an opportunity to dissect the interplay of stock prices, investor sentiment, and market volatility within the context of a pandemic [13]. By examining the connections and dependencies among sectors, industries, and individual stocks, this study provides insights into the market’s response to volatility-induced fluctuations. Examining the network’s topology and evolution gives a deeper understanding of emergent patterns and trends.

1.3 Literature review

1.3.1 Interpreting financial markets through network analysis

In recent years, there has been significant interest in applying complex network analysis to financial data [14]. In particular, scholars have concentrated on examining financial market interconnections through correlation-based network methodologies. This focus stems from the recognition that the stock market exerts a direct or indirect influence on various financial arenas [15]. By building a network of associations among economic variables, such as financial assets or currency exchange rates, investigators can gain insight into the underlying structure and behavior of the market. This approach helps pinpoint pivotal nodes with substantial sway over market dynamics and uncover trends and associations that support the development of trading strategies and risk management [16]. For instance, by evaluating the linkages between financial variables, scholars can identify critical nodes that wield significant influence over overall market dynamics [17, 18].

Several essential principles warrant attention when employing statistical network analysis in the context of financial markets. First, establishing the financial network entails specifying the individual variables or entities present in the market and identifying the connections linking them [19]. These associations may originate from a multitude of factors, including correlations, synchronized movements, or causal relationships. Second, a critical aspect involves determining the strengths of the ties connecting variables. These strengths are measured through linkage coefficients, which capture the robustness and direction of associations [20]. Third, it is imperative to scrutinize the structural layout of the financial network to gain insight into the overall arrangement and interconnectedness of the market [21]. Through a comprehensive examination of the network’s structure, analysts can pinpoint influential nodes that exert a substantial impact on the overall dynamics of the marketplace [22]. Furthermore, complex network theory facilitates the exploration of statistical features and tendencies within financial domains. For example, scholars have employed complex network analysis to explore the correlation patterns among multiple assets in both the equity and bond sectors [23, 24].

Several studies employ network-based approaches to scrutinize various aspects of financial markets. Tsankov examines financial markets using network-based methods, covering techniques that include network measures, filters, and spectral analysis. The main focus is on applying these methods to understand complex patterns in financial data; they help identify relationships, assess risk, manage diversity, make policy decisions, analyze crises, and improve forecasting [25]. Chun-Xiao Nie introduces a new method to detect important changes in financial correlations. The approach uses mathematical distances and influence strength, and it successfully identifies major shifts in correlations, such as those seen during the 2008 financial crisis and the disruptions of 2020 [26].

Tao You and colleagues study the Shanghai Stock Exchange using network analysis and unique measures based on information. They challenge the idea that non-Western markets are riskier and show that the Chinese market has its own stability. Using mutual information-based measurements, they reveal insights into the complex connections of the Shanghai stock market [27]. Chu and Nadarajah apply network analysis to the UK stock market, following an approach used for the US market. Their research examines how connected nodes are and fits specific mathematical models to the data. This work deepens our understanding of how financial networks are structured in the UK [28]. Hatami and others suggest an innovative way to study financial markets. They combine networks of correlations with population analysis. Applied to stock data from 2000 to 2004, this method highlights behavior patterns and groups of related entities [29].

Cutting-edge methodologies in the realm of statistical network analysis have been harnessed to enrich our comprehension of the intricate dynamics in financial markets [30]. One frequently employed technique is the minimum spanning tree, originating from the realm of physics, which provides a straightforward yet resilient avenue for examining the topological arrangement and statistical attributes characterizing financial markets [31]. At the core of the minimum spanning tree approach lies the creation of a network that represents financial variables and their interconnected relationships. The strength of these connections is quantified through linkage coefficients, which determine the weights of the edges. These methodologies empower researchers to discern the most prominent and influential nodes within the network, offering invaluable insights into the fundamental framework and behaviors exhibited by financial markets [32, 33].

Several papers delve into exploring the complex systems of financial markets, underscoring the significance of higher order interactions. One notable study by Scagliarini et al. investigates information flows in cryptocurrency markets through the analysis of a cryptocurrency trading network. Utilizing Granger causality, the research examines both pairwise and high-order statistical dependencies in the logarithmic US dollar price returns of the network, offering insights into its stability, influential nodes, and the impact of major events. The study highlights the substantial role of stable coins in high-order dependencies, shaping the intricate dynamical landscape of the cryptocurrency network [34]. In another study, Musciotto et al. introduce an analytic approach for filtering hypergraphs in real-world systems, with an emphasis on higher order interactions beyond dyads. Through the identification of over-expressed hyperlinks, this method discerns informative connections from noise, presenting a fresh perspective on understanding statistically validated hypergraphs. The combination of these papers provides a comprehensive outlook on information dynamics in financial systems, stressing the significance of incorporating both pairwise and high-order analyses to unravel complex network behaviors and interactions [35].

1.3.2 Complex network analysis in Chinese financial markets

Complex network analysis has become a valuable tool for gaining insights into the behavior of financial markets in China, revealing intriguing properties, such as small-world characteristics and shedding light on the dynamic interactions within the Chinese financial landscape. For instance, the Chinese financial market exhibits a small-world property, where there is a high degree of clustering among vertices with a short average path length between them [36]. Pan investigates the intersection of global financial networks and regional development through a case study of Linyi, a prefectural-level city in China. This study showcases how regional economies, as exemplified by Linyi, have become integrated into the global financial network through the international listing of leading regional companies. By engaging with international business service firms and listing on foreign stock exchanges, these firms create worldwide channels for both capital and knowledge flow. This work highlights the strategic opportunities for regional development by leveraging global financing avenues, and demonstrates how regions can thrive within a globalized financial ecosystem [37].

Tu introduces an innovative approach to the construction of financial networks in the Chinese stock market based on co-integration, departing from the conventional correlation-based methods. This method offers a novel perspective on the underlying relationships among stocks, capturing connections that go beyond mere correlations. The study employs various techniques to filter information within a complex network, resulting in a pruned network structure. This approach enhances our understanding of the mechanisms driving financial markets and provides insights into the dynamics of the Chinese stock market’s complex network [38].

Qiu et al. explored the dynamic behavior of financial networks using static and dynamic thresholds, based on data from both the American and Chinese stock markets. This study uncovers how dynamic thresholds influence network behavior by mitigating large fluctuations resulting from cross-correlations of individual stock prices. This research provides insights into the evolving topological structure of financial networks, revealing long-range correlations and degree distributions [39].

Another study by Huang et al. employed complex network analysis to identify influential nodes in the Chinese A-share market. Over 100 stock market networks were constructed using various tests and methods, covering 847 stocks from January 2006 to June 2019. Notably, during financial crises, network metrics such as clustering coefficient and global efficiency surged and then dropped. Around 66.98% of networks displayed scale-free properties. Influential nodes were primarily large-cap companies, with the intriguing observation that the top three influential stocks were high-priced hundred shares, favored by Chinese investors [40]. In addition, complex network models have been employed to analyze the behavior of the Chinese equity market, while power-law models have been used to construct networks that reflect the stock market behavior. These network models help to capture the relationships between different stocks, identify clusters or communities of stocks that exhibit similar behavior, and analyze the overall topology of the market [41]. In the next section, we discuss the process of constructing the stock market network.

2 Methodology

2.1 Transformation and symbolization

To initiate the analysis, we transform the closing price and traded volume data of the stocks under examination. The transformation computes the ratios of consecutive closing prices and of consecutive traded volumes and takes their natural logarithm. This yields the logarithmic return, a financial metric used to gauge changes in asset prices over time, and the corresponding logarithmic change in traded volume:

$$\begin{aligned} r_{t,i}=\ln \frac{p_{t,i}}{p_{t-1,i}}, \end{aligned}$$
(1)

where \(p_{t,i}\) represents the stock price of company i at time t, and

$$\begin{aligned} v_{t,i}=\ln \frac{vol_{t,i}}{vol_{t-1,i}}, \end{aligned}$$
(2)

where \(vol_{t,i}\) is the traded volume of the stock of company i at time t.
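As a minimal sketch of Eqs. 1 and 2, the same log-ratio transformation can be applied to a price series and to a volume series. The helper name `log_changes` and the sample numbers are hypothetical and serve only as an illustration; they are not taken from the study's data.

```python
import math


def log_changes(series):
    """Return ln(x_t / x_{t-1}) for every consecutive pair (Eqs. 1 and 2)."""
    return [math.log(curr / prev) for prev, curr in zip(series, series[1:])]


# Hypothetical closing prices and traded volumes for one stock.
prices = [10.0, 10.5, 10.2, 10.8]
volumes = [1000.0, 1200.0, 900.0, 1500.0]

returns = log_changes(prices)          # r_{t,i}
volume_changes = log_changes(volumes)  # v_{t,i}
```

A series of n observations yields n - 1 log changes, because the first observation has no predecessor.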

In this study, we use a symbolization method based on time-series analysis, which divides the state space of the stock data into distinct segments. Specifically, the state space \(\Omega \) within \({\mathbb {R}}^2\), encompassing the values of log-returns and log traded volumes, is transformed into \({\mathcal {S}} = \{1, 2, 3\} \subset {\mathbb {N}}\). Each element of \(\Omega \) is mapped to an element of \({\mathcal {S}}\), which creates a new sequence of numerical values belonging to \({\mathcal {S}}\). This process simplifies and categorizes the data for subsequent analysis. The partition plays a crucial role in our methodology because it enables a more nuanced comprehension of the underlying patterns in financial time-series. The partitioning scheme is not arbitrary: it is chosen for its ability to capture market dynamics, since it determines the resulting symbolic sequences. To apply the symbolization approach, the initial step is to calculate the 3-quantiles for each company. This involves dividing the sorted distribution of logarithmic returns into three equally sized groups, each containing one-third of the dataset.

To be more precise, for company i, \(r_{iT_1}\) denotes the first 3-quantile and \(r_{iT_2}\) the second 3-quantile of its log-returns \(r_{it}\) (written \(r_{t,i}\) in Eq. 1):

$$\begin{aligned} S_{it}={\left\{ \begin{array}{ll} 1 &{} \text {if } r_{it}<r_{iT_1} \\ 2 &{} \text {if } r_{iT_1} \le r_{it}\le r_{iT_2} \\ 3 &{} \text {if } r_{it}>r_{iT_2}. \end{array}\right. } \end{aligned}$$
(3)

In Eq. 3, the three values denote the log-return states of each stock. The first value, 1, signifies a day of declining share prices. The second value, 2, denotes a day with relatively stable share prices. The third value, 3, indicates a day of rising share prices.
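The mapping in Eq. 3 can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the tercile cut points are taken from Python's `statistics.quantiles` with its default interpolation, which may differ from the quantile convention used in the study.

```python
import statistics


def symbolize(values, low, high):
    """Map each value to 1, 2 or 3 using two thresholds, as in Eq. 3."""
    symbols = []
    for x in values:
        if x < low:
            symbols.append(1)      # below the first 3-quantile
        elif x <= high:
            symbols.append(2)      # between the two 3-quantiles
        else:
            symbols.append(3)      # above the second 3-quantile
    return symbols


def tercile_symbols(values):
    """Symbolize a series against its own 3-quantiles."""
    low, high = statistics.quantiles(values, n=3)  # two cut points
    return symbolize(values, low, high)


# Hypothetical log-returns of a single stock.
returns = [-0.03, -0.01, 0.0, 0.004, 0.01, 0.02, -0.02, 0.015, 0.001]
symbols = tercile_symbols(returns)
```

Because the thresholds are computed per company, each stock spends roughly one-third of its days in each state regardless of how volatile it is in absolute terms.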

To transform the traded volume data for a pair of companies, we follow this procedure:

  1. Calculate the daily average return of the traded volume, \(\overline{v_{ij,t}}\), for each pair of companies i and j at time t.

  2. Compute the 3-quantiles of the daily average traded volume return for each pair. This step divides the sorted distribution of the averaged traded volume returns into three equal sections, each containing one-third of the data.

  3. Define threshold values for the series of daily traded volume return averages, denoted as \(v_{T_1}\) and \(v_{T_2}\), representing the first and second 3-quantiles for companies i and j.

  4. Symbolize the return of traded volume using the following scheme:

    $$\begin{aligned} V_{ij,t}={\left\{ \begin{array}{ll} 1 &{} \text {if } \overline{v_{ij,t}}<v_{T_1} \\ 2 &{} \text {if } v_{T_1} \le \overline{v_{ij,t}} \le v_{T_2} \\ 3 &{} \text {if } \overline{v_{ij,t}}>v_{T_2}. \end{array}\right. } \end{aligned}$$
    (4)

In Eq. 4, the three values, 1, 2, and 3, represent the volume states associated with each pair of companies. The first value, 1, indicates a low traded volume return, the second value, 2, a moderate traded volume return, and the third value, 3, a high traded volume return for the pair.
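The four steps leading to Eq. 4 can be sketched for a single pair of companies. Two points are assumptions rather than statements from the text: the pairwise average is taken to be the arithmetic mean of the two volume log-change series, and the 3-quantiles come from `statistics.quantiles` with its default settings. The function name and the sample values are hypothetical.

```python
import statistics


def pair_volume_symbols(v_i, v_j):
    """Symbolize the averaged volume log-changes of one pair (Eq. 4)."""
    # Step 1 (assumption): arithmetic mean of the two series, day by day.
    avg = [(a + b) / 2 for a, b in zip(v_i, v_j)]
    # Steps 2 and 3: the two 3-quantiles act as thresholds v_{T_1}, v_{T_2}.
    low, high = statistics.quantiles(avg, n=3)
    # Step 4: map every day to state 1, 2 or 3.
    return [1 if x < low else 2 if x <= high else 3 for x in avg]


# Hypothetical volume log-changes for companies i and j.
v_i = [0.10, -0.20, 0.05, 0.30, -0.10, 0.00]
v_j = [0.00, -0.10, 0.15, 0.10, -0.30, 0.20]
volume_symbols = pair_volume_symbols(v_i, v_j)
```

Unlike the return symbols, this sequence is defined per pair, so a market with N stocks produces N(N - 1)/2 volume sequences.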

2.2 Partial mutual information-based distance (PMID)

In information theory, the concept of partial mutual information distance (PMID) is employed to quantify the statistical dependence between two variables while considering the impact of another variable. PMID measures the extent to which knowledge of one variable reduces uncertainty about another, while accounting for the impact of a third variable.

In analyzing stock prices, we take the variables X and Y to be the returns of two companies, Company A and Company B, and define a variable Z that represents the average traded volume return of the two companies. Given this, we can define the PMID between the returns X and Y relative to volume Z as follows:

$$\begin{aligned} \text {PMID}(X; Y | Z) = \text {H}(X | Z) + \text {H}(Y | Z) - \text {H}(X, Y | Z), \end{aligned}$$
(5)

where

  • \(\text {H}(X | Z)\) represents the conditional entropy of the stock price return of company A given the average traded volume return of companies A and B.

  • \(\text {H}(Y | Z)\) represents the conditional entropy of the stock price return of company B given the average traded volume return of companies A and B.

  • \(\text {H}(X, Y | Z)\) represents the conditional joint entropy of the stock price returns of companies A and B given the average traded volume return of companies A and B.

Essentially, this equation quantifies the interdependence of two companies’ stock returns while accounting for the impact of traded volume. It measures the degree to which the two return series move together once the information carried by traded volume has been taken into account. PMID thus expresses the extent of information shared between X and Y given Z: it is the amount by which their conditional joint entropy falls short of the sum of their individual conditional entropies. This helps to reveal hidden patterns in the complex financial system.

In this article, we utilize the Schürmann–Grassberger estimator, a Bayesian parametric method used to evaluate Shannon’s entropy for practical purposes. This estimator specifically depends on the Dirichlet probability distribution.
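Eq. 5 can be sketched on symbolized series by writing each conditional entropy as a difference of joint entropies, H(A | Z) = H(A, Z) - H(Z). The entropy routine below is an assumption about how the estimator is applied: it uses a Dirichlet prior with pseudo-count 1/m per bin, where m is the number of possible bins, which is the form commonly associated with the Schürmann–Grassberger estimator. The function names and the synthetic series are hypothetical.

```python
import math
from collections import Counter


def sg_entropy(samples, n_bins):
    """Dirichlet-smoothed entropy in nats with pseudo-count 1 / n_bins."""
    a = 1.0 / n_bins
    total = len(samples) + 1.0          # n + n_bins * a
    counts = Counter(samples)
    h = 0.0
    for c in counts.values():
        p = (c + a) / total
        h -= p * math.log(p)
    # Bins that never occur still receive the prior mass a / total.
    p0 = a / total
    h -= (n_bins - len(counts)) * p0 * math.log(p0)
    return h


def pmid(x, y, z, k=3):
    """Eq. 5 for symbol sequences whose entries take k possible values."""
    h_z = sg_entropy(list(z), k)
    h_xz = sg_entropy(list(zip(x, z)), k * k)
    h_yz = sg_entropy(list(zip(y, z)), k * k)
    h_xyz = sg_entropy(list(zip(x, y, z)), k ** 3)
    # H(X|Z) + H(Y|Z) - H(X,Y|Z)
    return (h_xz - h_z) + (h_yz - h_z) - (h_xyz - h_z)


# Synthetic symbols: y_same copies x, y_indep is close to independent of x.
x = [i % 3 + 1 for i in range(60)]
y_same = list(x)
y_indep = [(i // 3) % 3 + 1 for i in range(60)]
z = [1] * 60

dependent = pmid(x, y_same, z)
independent = pmid(x, y_indep, z)
```

In this toy example the value for the identical pair is close to ln 3, the entropy of a uniform three-state variable, while the value for the nearly independent pair is close to zero.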

2.3 Network construction

In this study, the financial stock market network is constructed through a procedure designed to uncover intricate relationships among financial assets within a stock market. The procedure uses historical daily price and traded volume data, a predefined threshold, and network analysis techniques to create a visual and actionable representation of asset interactions. The algorithm first calculates the log-returns and log volume changes of each financial asset from the historical data. It then initializes an empty graph to hold the asset relationships. For each pair of assets, a similarity measure between their log-returns is calculated using PMID. If this value exceeds the predefined threshold, nodes representing the two assets are added to the network and an edge weighted by the value is established between them. The result is a network that visually represents the interconnectedness of financial assets. To minimize extraneous information, this study employs the Minimum Spanning Tree (MST). To compute the weights of the edges, 10,000 MSTs are generated using the partial mutual information-based distance. Subsequently, the frequency ratio of each edge across the generated trees is determined and assigned as the edge thickness.
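The edge-frequency idea can be sketched as follows. The text does not state how the 10,000 trees differ from one another, so this sketch assumes they come from bootstrap resampling of trading days; that choice, the function names, the small `n_trees` value, and the stand-in `mismatch_rate` distance (used only to keep the example short, in place of the PMID-based distance) are all assumptions.

```python
import random
from collections import Counter
from itertools import combinations


def kruskal_mst(nodes, dist):
    """Return the MST edges for a dict dist[(u, v)] -> distance."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    tree = []
    for (u, v), _ in sorted(dist.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:                        # edge joins two components
            parent[ru] = rv
            tree.append((u, v))
    return tree


def edge_frequencies(symbols, distance, n_trees=200, seed=0):
    """Share of resampled MSTs in which each edge appears."""
    rng = random.Random(seed)
    nodes = sorted(symbols)
    n_days = len(symbols[nodes[0]])
    counts = Counter()
    for _ in range(n_trees):
        # Assumption: resample trading days with replacement.
        days = [rng.randrange(n_days) for _ in range(n_days)]
        sample = {s: [symbols[s][t] for t in days] for s in nodes}
        dist = {(u, v): distance(sample[u], sample[v])
                for u, v in combinations(nodes, 2)}
        counts.update(kruskal_mst(nodes, dist))
    return {edge: c / n_trees for edge, c in counts.items()}


def mismatch_rate(a, b):
    """Stand-in distance for illustration only (not PMID)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Four hypothetical stocks: A and B move together, as do C and D.
base, shifted = [1, 2, 3] * 4, [2, 3, 1] * 4
symbols = {"A": base, "B": list(base), "C": shifted, "D": list(shifted)}
freq = edge_frequencies(symbols, mismatch_rate, n_trees=50)
```

An edge with a frequency near 1 appears in almost every resampled tree, which matches the reading of edge thickness as the reliability of a linkage.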

Once the network is constructed, it is analyzed through several network characteristics. Centrality measures are used to identify assets with significant influence, while community detection algorithms group similar assets and so help identify distinct market segments. Network connectivity patterns are characterized by clustering coefficients. Together, these measures help identify groups of assets with similar behavior, determine key assets, and give a deeper comprehension of market trends and possible risk factors. Essentially, this approach is a potent instrument for uncovering the complex network of interconnections in financial markets and provides knowledge for informed decision-making. Algorithm 1 outlines the entire process of network construction.

Algorithm 1
figure b

Financial stock market network construction

3 Results

3.1 Data splitting

This research employs a dataset that consists of the closing prices of the SSE 50 index, an important stock market index in China. The index comprises 50 companies listed on the Shanghai Stock Exchange (SSE). Our analysis focuses on the period from June 1 2019 to December 30 2020. By studying these data, we can gain insights into how these companies’ stock prices fluctuated during this time-frame. Figure 1 displays the SSE 50 index over this period. It shows the daily patterns of the index, including the closing, highest, and lowest prices. The visualization uses a candlestick plot to highlight a period of volatility during this time-frame. The plot illustrates how the market’s volatility changed and depicts the fluctuations in prices during this phase. By analyzing the SSE 50 index in detail, we can gain insights into the performance of China’s stock market as a whole and understand how market dynamics change during important periods.

The dataset of the SSE 50 index is divided into three non-overlapping time periods to analyze the market dynamics comprehensively. The first period, which spans from June 1 2019 to January 1 2020, acts as a reference point for understanding how the market performed before disruptions occurred. The second period covers January 2 2020 to June 30 2020 and represents a time of volatility with significant fluctuations in returns; this split was chosen because of the volatility observed during this time-frame. Finally, the third period runs from July 1 2020 to December 30 2020, which allows for an analysis of the market’s recovery and stabilization after the high volatility. By segmenting the data in this way, we can identify how the volatility affected the SSE 50 index and gain insights into how market volatility evolved over time and its subsequent effects.
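The three sub-samples can be expressed as date ranges. The sketch below uses the boundaries stated above; the labels "stable", "volatile", and "follow-up" follow the names used for the periods in the network analysis, while the function names and the sample rows are hypothetical.

```python
from datetime import date

# Period boundaries as stated in the text (both ends inclusive).
PERIODS = {
    "stable": (date(2019, 6, 1), date(2020, 1, 1)),
    "volatile": (date(2020, 1, 2), date(2020, 6, 30)),
    "follow-up": (date(2020, 7, 1), date(2020, 12, 30)),
}


def period_of(day):
    """Return the period label for a date, or None if outside the sample."""
    for name, (start, end) in PERIODS.items():
        if start <= day <= end:
            return name
    return None


def split_by_period(rows):
    """Group (date, value) rows into the three sub-samples."""
    groups = {name: [] for name in PERIODS}
    for day, value in rows:
        name = period_of(day)
        if name is not None:
            groups[name].append((day, value))
    return groups


# Hypothetical observations, one per period plus one outside the sample.
rows = [
    (date(2019, 5, 31), 0.0),
    (date(2019, 9, 2), 1.0),
    (date(2020, 3, 2), 2.0),
    (date(2020, 10, 9), 3.0),
]
groups = split_by_period(rows)
```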

Fig. 1
figure 1

The time-series of SSE 50 from 2019-06-01 to 2020-12-30 with close, high, and low prices, and a candlestick plot highlighting the volatile period

Figure 2 shows the variability of a randomly chosen set of stocks: Shanghai Pudong Development Bank (banking, SSE: 600000), China Petroleum & Chemical Corporation (oil and gas, SSE: 600028), CITIC Securities (financial services, SSE: 600030), and Sany Heavy Industries (industry, SSE: 600031), from January 2019 to December 2020. It gives a picture of how these stocks’ prices changed over this period. This visual representation provides insights into how the selected stocks performed and helps us understand the market dynamics that influenced their price movements.

Fig. 2
figure 2

Fluctuations in the return of close prices for the selected stocks SSE: 600000, SSE: 600028, SSE: 600030, and SSE: 600031 over the time-period of January 2019–December 2020

3.2 Topological structure of constructed networks

The analysis of the Minimum Spanning Tree (MST) within the stock network reveals distinct patterns in the central nodes of the MST during the stable, volatile, and follow-up periods, shedding light on the shifting priorities of the Chinese economy. During the stable period, the MST’s main nodes are centered around diverse sectors, representing different aspects of the Chinese economy. The presence of AECC Aviation Power (SSE: 600893) signifies a focus on aviation and power generation, highlighting the importance of aviation-related industries and energy supply. WuXi AppTec (SSE: 603259) is a key player in the pharmaceutical sector, reflecting the thriving pharmaceutical industry. China Tourism Group Duty Free Corporation (SSE: 601888) is associated with the tourism sector, indicating a thriving travel and tourism industry. China Petroleum & Chemical Corporation (Sinopec, SSE: 600028), from the energy sector, holds pivotal importance in the oil and gas industry and plays a crucial role in fuel supply. Figure 3 depicts this period.

Fig. 3
figure 3

MST of stable period; thickness of edges shows the reliability of linkage

During the volatile period, the central hubs of the MST shift to sectors that enhance the resilience of the economy. CITIC Securities, a pillar of financial services, maintains its prominence, reflecting the priority given to the stability of the financial sector in volatile times. LONGi Green Energy Technology (SSE: 601012) emerges as a key player in the renewable energy sector, emphasizing the importance of sustainable energy sources in uncertain times. Zhejiang Huayou Cobalt (SSE: 603799) emphasizes the importance of resource management and sustainability in the mining industry, securing key resources for recovery. At the same time, China Communications Services Corporation Limited, a pioneer in telecommunications, emphasizes the critical role of digital connectivity in maintaining economic continuity. Hangzhou Hikvision Digital Technology Co. Ltd., representing technology, emphasizes innovation and adaptation as driving forces in managing volatility. In addition, Shanghai Construction Group Co. Ltd., a prominent player in the construction sector, signifies the central role of infrastructure development in driving economic recovery. This deliberate emphasis on the banking, renewable energy, resource management, telecommunications, and construction sectors demonstrates a practical and all-encompassing approach to overcoming economic obstacles and cultivating resilience. Figure 4 illustrates this time period.

Fig. 4
figure 4

MST of volatile period; thickness of edges shows the reliability of linkage

Following the volatile period, the primary nodes within the MST are once again focusing on sectors that are essential to economic recovery. CITIC Securities, a major player in financial services, emphasizes the resilience of the financial sector and its continued importance. Industrial and Commercial Bank of China (ICBC), a heavyweight in the banking sector, reiterates the continued importance of this sector in the follow-on scenario. Meanwhile, Will Semiconductor, SSE: 603501, plays a crucial role in advancing the technology sector’s contribution to driving recovery and innovation. Wingtech, SSE: 600745, is an excellent example of the technology sector’s continued contribution to economic rejuvenation. Foshan Haitian Flavouring & Food Co., SSE: 603288, represents domestic consumption within the Food & Beverage sector, which is a critical component of economic growth. In addition, the inclusion of China Communications Services Corporation Limited, representing telecommunications, highlights the ongoing digital transformation and its importance in maintaining connectivity during the recovery. This balanced focus on the banking, technology, food, and telecom sectors demonstrates a comprehensive approach to the post-volatility economic recovery. Figure 5 illustrates this time period.

Fig. 5 MST of follow-up period; thickness of edges shows the reliability of linkage

Fig. 6 Plot and fit of the power-law distribution for three periods

When comparing these three distinct periods, a clear pattern emerges, illustrating how the primary nodes within the MST adapt their focus in response to the prevailing economic conditions. In the stable era, the MST’s central hubs exhibited notable diversity across various sectors, with energy, tourism, and infrastructure featuring prominently. Prominent firms, such as AECC Aviation Power, WuXi AppTec, China Tourism Group Duty Free Corporation, and China Petroleum & Chemical Corporation (Sinopec), reflect a diverse economy with a firm focus on aviation, power production, pharmaceuticals, tourism, and energy sectors.

Nevertheless, amid the volatile period, the MST redirected its attention to industries that could ensure stability, adaptability, and resilient resurgence. Financial services, technology, and construction were prominent, with firms such as CITIC Securities, LONGi Green Energy Technology, and Zhejiang Huayou Cobalt playing major parts in promoting stability, sustainable energy sources, resource management, and digital connectivity as crucial factors in coping with difficult times. In the aftermath of the volatility, a fresh group of sectors arose as crucial pillars for sustainable expansion and recuperation. Banking, technology, food, and telecommunications were the main areas of focus, with companies such as Industrial and Commercial Bank of China (ICBC), Will Semiconductor, Wingtech, and Foshan Haitian Flavouring & Food Co. contributing to a comprehensive approach toward economic revitalisation. This analysis emphasizes the economy’s ability to adapt and remain resilient by prioritizing different sectors according to prevailing circumstances, which supports overall stability and sustainable growth through various economic phases.

3.3 Power-law distribution analysis

Power law in networks is a mathematical distribution exhibiting a few nodes, or entities, with significantly more connections than the majority. In the context of financial network analysis, this means that a small number of financial entities have a disproportionate influence or connectivity within the network, while the majority have fewer connections. The power law implies that a small number of dominant financial entities can significantly affect the stability and operation of the financial network. Therefore, comprehending and surveilling these influential nodes become vital for risk assessment and management. Analyzing power-law distributions can provide insight into market trends, transmission mechanisms, and the overall behavior of the financial system. It also aids in deciphering network dynamics and formulating effective regulatory strategies to maintain a stable and secure financial ecosystem.

Boginski and colleagues [42] provided evidence indicating that the stock market exhibits characteristics consistent with a power-law distribution, while Minimum Spanning Trees (MST) display a scale-free structure. Specifically, the distribution of node degrees conforms to a power law, \(p_k = c\,k^{-\alpha }\), where \(p_k\) signifies the fraction of nodes with degree k, \(\alpha \) represents the scaling parameter, and c denotes a constant. Applying the maximum-likelihood estimator to each time-frame gives \(\alpha _{\text {stable}} = 1.5\), \(\alpha _{\text {volatile}} = 1.45\), and \(\alpha _{\text {follow-up}} = 1.48\). As illustrated in Fig. 6, the degree distribution follows an approximately linear relationship with node degree on logarithmic axes, which is consistent with the presence of a power law.
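As a rough illustration of the maximum-likelihood step, the sketch below estimates \(\alpha \) from a list of node degrees with a commonly used discrete approximation, \(\hat{\alpha } \approx 1 + n\,[\sum _i \ln (k_i/(k_{\min }-1/2))]^{-1}\). The choice of this approximation, the cutoff `k_min`, and the toy degree sequence are assumptions made for illustration only; the estimator variant actually used for the reported values is not specified here, and the approximation is known to be coarse for small `k_min`.

```python
import math

def powerlaw_alpha_mle(degrees, k_min=1):
    """Approximate discrete MLE of the power-law exponent for degrees >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    # Discrete approximation: alpha ~ 1 + n / sum(ln(k / (k_min - 0.5)))
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum

# Hypothetical degree sequence of a 10-node tree (not data from this study).
toy_degrees = [1, 1, 1, 1, 1, 1, 2, 2, 3, 5]
alpha_hat = powerlaw_alpha_mle(toy_degrees, k_min=1)
```

In practice the estimate is sensitive to `k_min`, so it is usually reported together with the cutoff that was used.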

Table 1 Node degree frequency for three periods of stable, volatile, and follow-up

3.4 Network characteristics analysis

The analysis of network characteristics spans three distinct levels: node level, cluster level, and global level. At the node level, individual elements within the network are scrutinized for their attributes, such as degree centrality, representing the number of connections a node has. This level offers insights into the influence and importance of specific nodes in terms of their connections. Moving to the cluster level, groups of nodes that exhibit higher interconnectedness are studied. These clusters reveal substructures within the network, aiding in the identification of cohesive units. Finally, the global level encapsulates the overall network properties, reflecting the efficiency of information flow and the network’s extent. Additionally, the presence of hubs, nodes with exceptionally high degrees, significantly impacts network resilience. By systematically analyzing these characteristics at multiple levels, a comprehensive understanding of the network’s structure, function, and potential vulnerabilities can be found.

3.4.1 Node level

The degree of a node refers to the number of edges incident to that node. In the context of an undirected graph, the degree of a node is simply the count of its adjacent nodes. Mathematically, if we denote the degree of a node v as deg(v), and the set of its adjacent nodes as N(v), then \(deg(v) = |N(v)|\), where |N(v)| denotes the cardinality of the set N(v). Node degree quantifies the edges linked to a company within the stock price network. Highly interconnected companies exhibit higher degrees, suggesting centrality in the network. Notably, the lowest node degree in a connected network is 1. In a minimum spanning tree (MST) on n nodes, the maximum possible degree is \(n - 1\), which occurs when one node is linked to all others; in a complete network, every node has degree \(n - 1\). The average degree in a tree, with a fixed number of edges \((n -1)\), is mathematically determined as \(2(n-1)/n = 2-2/n\).
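The degree definition and the average-degree identity for trees can be checked on a small example. The following is a minimal sketch on a hypothetical five-node star; the node labels and edge list are illustrative and are not taken from the networks studied here.

```python
from collections import defaultdict

def node_degrees(n, edges):
    """Return deg(v) = |N(v)| for every node 0..n-1 of an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return [len(neighbors[v]) for v in range(n)]

# Hypothetical 5-node tree: a star centered on node 0.
n = 5
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
deg = node_degrees(n, star_edges)

# For any tree the average degree is 2(n - 1)/n = 2 - 2/n.
average_degree = sum(deg) / n
```

The star is the extreme case in which one node attains the maximum degree \(n-1\) while all other nodes are leaves of degree 1.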

The analysis of Table 1 shows trends in the distribution of node degrees over the three time-periods. All three periods contain nodes of degree 1, that is, nodes with a single link. It is also notable that before and during the volatile period, the highest node degrees are considerably larger, at 20 and 23, suggesting a greater range of connections and potential complexity. In the follow-up period, however, no nodes reach such high degrees, and a moderate degree of 3 becomes more prominent, indicating a consolidation of the network structure after the volatility. These patterns reflect changing market conditions and dynamics, with higher node degrees before and during volatility possibly indicating an interconnected market with increased trading activity and shifts in investor sentiment. The period after the volatility, with fewer high-degree nodes, may indicate a more stable market environment with reduced volatility and potential adjustments in market relationships.

Node strength is associated with the sum of weights of the links connected to a node. In mathematical terms, for a node i, the node strength, \(S_i\), is defined as the sum of the weights, \(\delta _{ij}\), of the links connected to that node as follows:

$$\begin{aligned} S_i= \sum _j \delta _{ij}, \end{aligned}$$
(6)

where \(S_i\) is the node strength of node i, j indexes the neighboring nodes connected to node i, and \(\delta _{ij}\) represents the distance of the link between nodes i and j, which is constructed by Eq. 5.

This measure provides an indication of the overall influence or importance of a node within the network based on the strength of its connections. For a given company i, node strength is thus the sum of the distances, established as per Eq. 5, on the links connected to it. Node strengths for each node fluctuate across the different periods. Notably, the node strength values tend to decrease over the transition from the stable to the follow-up period. As depicted in Fig. 7, in the stable period, node strengths range from approximately 0.12 to 2.78, reflecting a diverse distribution of connections and influence among nodes. During the volatile period, node strengths span a broader range of values, from around 0.09 to 3.32, indicating potential volatility and shifts in network dynamics.

Fig. 7 Node strength plot for the three periods of stable, volatile, and follow-up

During the follow-up period, the node strengths are relatively lower compared to the volatile period, with values ranging from approximately 0.08 to 1.52, suggesting a potential consolidation or recalibration of network interactions. The observed changes in node strength values over these periods may imply shifts in the importance or influence of certain nodes within the network. Specifically, nodes with higher node strengths could represent more central and influential entities in the network. The decrease in node strengths from the volatile period to the follow-up period may reflect changes in market dynamics, investor behaviors, or shifts in the overall connectivity structure.
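The node strength of Eq. 6 amounts to accumulating each link weight on both of its endpoints. A minimal sketch follows, assuming the tree is available as a list of `(i, j, delta_ij)` triples; the distances shown are hypothetical placeholders rather than values computed via Eq. 5.

```python
def node_strengths(n, weighted_edges):
    """S_i = sum_j delta_ij over the links incident to node i (Eq. 6)."""
    strength = [0.0] * n
    for i, j, delta in weighted_edges:
        # An undirected link contributes its weight to both endpoints.
        strength[i] += delta
        strength[j] += delta
    return strength

# Hypothetical distances on a 4-node tree (illustrative values only).
toy_edges = [(0, 1, 0.30), (1, 2, 0.45), (1, 3, 0.25)]
S = node_strengths(4, toy_edges)
```

Here node 1, which carries three links, accumulates the largest strength, while the leaves simply inherit the weight of their single link.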

The node eigenvalue measure pertains to the components of the principal eigenvector computed from the adjacency matrix of the network. It serves as an indicator of the importance of a node within the network and offers insights into how each node contributes to the structure and connectivity of the network. Mathematically, the eigenvalues of the adjacency matrix A are the solutions \(\lambda \) of the characteristic equation \(\det (A-\lambda I) = 0\), where I is the identity matrix; the principal eigenvector is the one associated with the largest eigenvalue. As shown in Fig. 8, prior to the volatile period, the values spanned from 0.0016 to 0.2749, signifying varying degrees of node importance during that period. When the volatility occurred, they ranged from 0.0002 to 0.2792, indicating changes in network influence at that time. Following the volatile period, they ranged from 0.0086 to 0.4639, which suggests a rearrangement of node importance within the network structure. Nodes possessing higher values tend to hold more central positions and exert greater influence on overall network behavior.
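To make the principal-eigenvector idea concrete, the sketch below computes the eigenvector associated with the largest eigenvalue of a symmetric adjacency matrix. It assumes an unweighted, undirected, connected network and normalizes the vector to unit Euclidean length; the normalization used for Fig. 8 is not stated here, so the scaling is an assumption.

```python
import numpy as np

def eigenvector_centrality(adjacency):
    """Components of the principal eigenvector of a symmetric adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    principal = np.abs(eigenvectors[:, -1])  # fix the arbitrary overall sign
    return principal / np.linalg.norm(principal)

# Hypothetical 3-node path 0 - 1 - 2 (illustrative only).
path_adjacency = [[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]]
scores = eigenvector_centrality(path_adjacency)
```

For this path the middle node receives the largest component, reflecting that it is adjacent to both other nodes.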

Fig. 8 Eigenvalue plot for the three periods of stable, volatile, and follow-up

3.4.2 Cluster level

In a network, the closeness centrality score acts as an indicator that reflects how central a firm is. This score indicates that a firm’s proximity to all other entities in the network is directly linked to its level of centrality. In other words, the more central a company becomes within the network, the closer it is to all other companies. This emphasizes the significance of units in terms of their ability to communicate and transact efficiently with a range of other units, thus highlighting their crucial role in overall connectivity and communication dynamics within the network. Fig. 9 shows that closeness centrality scores vary across companies and time-periods. Notably, during the volatile period, many firms show higher closeness centrality scores, indicating their increased connectivity and proximity to other firms within the network. This could be attributed to heightened communication and interaction among these firms during times of instability.

Additionally, the follow-up period reveals a mixed pattern, with certain firms maintaining their high closeness centrality scores, while others experience a decline. This diversity may reflect changes in the dynamics of the network as it recovers from the volatile period. Interestingly, a few firms consistently maintain high closeness centrality scores across different time-periods, indicating their ongoing importance in fostering connectivity within the network. Therefore, analyzing closeness centrality scores provides insights into how firms contribute to connectivity and communication within the network at the cluster level. The observed variations in these scores over time offer information about shifting dynamics during stable, volatile, and follow-up periods, describing how central firms influence the structure and functioning of the financial system.
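Since no formula for closeness centrality is given above, the sketch below adopts the standard normalized definition, \((n-1)\) divided by the sum of shortest-path distances from a node to all others, with distances counted in hops. Both the definition and the use of unweighted hop counts are assumptions; a weighted variant based on the Eq. 5 distances would replace the breadth-first search with a weighted shortest-path routine.

```python
from collections import deque

def closeness_centrality(n, edges):
    """Normalized closeness (n - 1) / sum of hop distances, for a connected graph."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    scores = []
    for source in range(n):
        # Breadth-first search gives hop distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores.append((n - 1) / total if total > 0 else 0.0)
    return scores

# Hypothetical 4-node path 0 - 1 - 2 - 3 (illustrative only).
closeness = closeness_centrality(4, [(0, 1), (1, 2), (2, 3)])
```

The interior nodes of the path score higher than its endpoints, which matches the intuition that central firms are closer to all others.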

Fig. 9 Closeness centrality plot for the three periods of stable, volatile, and follow-up

The number of closed walks of length m starting and ending on node i in the network is given by the local spectral moments \(\delta _m(i)\), which are defined as the \(i^{th}\) diagonal entry of the mth power of the adjacency matrix, A, i.e., \(\delta _m(i)=(A^m)_{ii}\).

Then, we define subgraph centrality as follows:

$$\begin{aligned} sub(i)= \sum _{m=0}^{\infty }\frac{\delta _m(i)}{m!}. \end{aligned}$$
(7)

Figure 10 represents the subgraph centrality scores for different companies across three periods: stable, volatile, and follow-up periods. Subgraph centrality quantifies the number of closed walks of varying lengths that start and end on a given node within the network. These scores are determined using the local spectral moments, which are calculated based on the powers of the adjacency matrix.
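Because the series in Eq. 7 is the Taylor expansion of the matrix exponential, subgraph centrality equals the diagonal of \(e^A\). The sketch below evaluates it through the eigendecomposition of a symmetric adjacency matrix; the two-node example network is hypothetical and serves only as a check.

```python
import numpy as np

def subgraph_centrality(adjacency):
    """sub(i) = sum_m (A^m)_ii / m! = (e^A)_ii for a symmetric adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    # (e^A)_ii = sum_j v_ij^2 * exp(lambda_j)
    return (eigenvectors ** 2) @ np.exp(eigenvalues)

# Hypothetical network with a single link between two nodes.
sub = subgraph_centrality([[0, 1],
                           [1, 0]])
```

For a single link, only even-length closed walks exist, so each node scores \(\cosh (1)\).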

After evaluating the three time-periods, it becomes clear that the subgraph centrality scores of the different companies show fluctuations. Specifically, some companies displayed significant increases in subgraph centrality during the volatile period, indicating increased participation in closed paths and connectivity within the network. This phenomenon could be attributed to their active involvement and influence within the financial system during turbulent times. During the follow-up period, different patterns in the subgraph centrality scores were evident. While some companies maintain or slightly increase their centrality, implying a persistent role in maintaining connections within the network, a few companies experience significant fluctuations in their subgraph centrality scores. This variation may indicate adaptations in their impact and involvement as the financial system recuperates from the turbulent phase and adjusts to new dynamics.

Additionally, some companies show significantly higher subgraph centrality scores than others, indicating their crucial role in maintaining closed walks and connectivity in the network. These companies may be viewed as cluster entities that significantly contribute to the overall structure and functioning of the financial system. Thus, subgraph centrality scores provide insights into the importance of firms in terms of their involvement in closed paths and connectivity within the financial network. The observed variations during stable, volatile, and follow-up periods highlight the network’s changing dynamics. The presence of highly centralized companies suggests their critical function in shaping the overall structure of the financial system and their potential influence on its stability and resilience.

Fig. 10 Subgraph centrality plot for the three periods of stable, volatile, and follow-up

The betweenness centrality of a company, denoted as b(i) for company i, is defined as

$$\begin{aligned} b(i) = \sum \limits _{i \ne j \ne k} \frac{\zeta _{jk}(i)}{\zeta _{jk}}, \end{aligned}$$
(8)

where \(\zeta _{jk}(i)\) is the number of shortest paths from j to k that pass through i and \(\zeta _{jk}\) is the total number of shortest paths between companies j and k. This centrality metric captures a company’s function as an intermediary connecting pairs of other companies within a stock market network.
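In a tree such as an MST, exactly one path joins any two nodes, so each ratio in Eq. 8 is either 0 or 1 and b(i) reduces to the number of pairs whose path crosses i. The sketch below counts this directly. Counting each unordered pair once is an assumption about the summation convention, and the brute-force approach is chosen for clarity rather than speed.

```python
from collections import deque

def tree_betweenness(n, edges):
    """Number of unordered pairs (j, k) whose unique tree path passes through i."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    b = [0] * n
    for j in range(n):
        # BFS from j records each node's predecessor on the unique path from j.
        parent = {j: None}
        queue = deque([j])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in parent:
                    parent[w] = u
                    queue.append(w)
        for k in range(j + 1, n):
            # Walk back from k toward j, crediting every intermediate node.
            node = parent.get(k)
            while node is not None and node != j:
                b[node] += 1
                node = parent[node]
    return b

# Hypothetical trees (illustrative only): a 5-node star and a 4-node path.
star_b = tree_betweenness(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
path_b = tree_betweenness(4, [(0, 1), (1, 2), (2, 3)])
```

Leaves always score zero, which is why only a minority of nodes in a tree carry substantial betweenness.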

Fig. 11 Betweenness centrality plot for the three periods of stable, volatile, and follow-up

Figure 11 displays the betweenness centrality scores for companies during the three time-periods. Betweenness centrality is a metric used to identify companies that act as bridges between pairs of companies in the stock market network. This measurement evaluates how many paths go through a company, which accentuates its significance in enabling connections and communication among entities in the network. After analyzing the MSTs, it is evident that the majority of companies do not possess high betweenness centrality scores in any of the three periods. This implies that these companies do not serve as significant intermediaries or bridges between pairs of companies within the stock market network. Consequently, these companies are likely to have limited impact on the shaping of information flow, transactions, or interactions among entities within the network.

However, there are some exceptions with higher betweenness centrality scores, particularly during and after the volatile period. These particular firms act as intermediaries that facilitate communication and interactions among entities. Their betweenness centrality values indicate their role in maintaining connectivity and ensuring the proper functioning of the stock market network during turbulent times. This indicates that these companies can adjust to the shifting dynamics and challenges of the environment by assuming a crucial role in maintaining connections between other entities.

Moreover, a few standout companies have especially high betweenness centrality scores during the follow-up period. These companies serve as bridges facilitating interactions and the flow of information among a range of other pairs of companies. Their impact on the stability and functioning of the network is particularly evident during this time.

3.4.3 Global level

The assessment of communication between a pair of companies within a stock price network conventionally revolves around identifying the shortest path that connects these companies. Nevertheless, it is important to note that global communicability transcends the consideration of solely the shortest paths facilitating communication between nodes p and q. It encompasses a broader spectrum of pathways, encompassing all possible trajectories that enable the transfer of information or entities from one company to another.

Let \(N_{pq}^{(k)}\) denote the number of shortest paths between nodes p and q, which have length k, and \(W_{pq}^{(s)}\) the number of walks of length \(s>k\) connecting nodes p and q. We propose to define the following quantity:

$$\begin{aligned} C_{pq}=\frac{1}{k!}N_{pq}^{(k)}+\sum _{s>k}\frac{1}{s!}W_{pq}^{(s)}. \end{aligned}$$
(9)

By leveraging the relationship between the powers of the adjacency matrix and the count of walks within the network, we derive communicability score as follows:

$$\begin{aligned} C_{pq}= \sum _{s=0}^\infty \frac{(A^s)_{pq}}{s!}=(e^A)_{pq}. \end{aligned}$$
(10)
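The communicability between two companies is the corresponding off-diagonal entry of the matrix exponential \(e^A\). A minimal sketch follows, assuming a symmetric, unweighted adjacency matrix and using its eigendecomposition; the two-node example is hypothetical.

```python
import numpy as np

def communicability(adjacency):
    """Return the matrix e^A, whose (p, q) entry is the communicability C_pq."""
    A = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    # e^A = V diag(exp(lambda)) V^T for a symmetric matrix A.
    return (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T

# Hypothetical network with a single link between nodes 0 and 1.
C = communicability([[0, 1],
                     [1, 0]])
```

For a single link, walks between the two nodes have odd length only, so \(C_{01} = \sinh (1)\), while the diagonal entries equal \(\cosh (1)\).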

The assortativity coefficient evaluates a company’s inclination to associate with other companies having comparable or dissimilar degrees. This metric is determined by computing the average degree \(A_{nn}(s)\) of a company’s neighbors when the company itself has a degree of s. To compute this measure, we initially determine the average degree \(A_{nn}(p)\) of the neighboring companies of a company p with degree \(s_p\)

$$\begin{aligned} A_{nn}(p)=\frac{1}{s_p}\sum _{q \in N(p)} s_q. \end{aligned}$$
(11)

Here, q represents a neighboring node of company p. Subsequently, we calculate the average once more, this time considering all companies p that share the same degree s

$$\begin{aligned} A_{nn}(s)=\frac{\sum _{p:s_p=s} A_{nn}(p)}{n_s}, \end{aligned}$$
(12)

where \(n_s\) represents the count of companies possessing a degree of s.
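The two averaging steps of Eqs. 11 and 12 can be sketched as follows. The code takes the denominator of Eq. 11 to be the number of neighbors of p, which is one reading of the notation and should be treated as an assumption; the star network is a hypothetical example.

```python
from collections import defaultdict

def average_neighbor_degree(n, edges):
    """Return (per-node averages as in Eq. 11, per-degree averages as in Eq. 12)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    degree = [len(a) for a in adj]
    # Eq. 11: mean degree of the neighbors q of each node p.
    per_node = [
        sum(degree[q] for q in adj[p]) / degree[p] if degree[p] else 0.0
        for p in range(n)
    ]
    # Eq. 12: average the per-node values over all nodes sharing degree s.
    grouped = defaultdict(list)
    for p in range(n):
        grouped[degree[p]].append(per_node[p])
    per_degree = {s: sum(vals) / len(vals) for s, vals in grouped.items()}
    return per_node, per_degree

# Hypothetical 4-node star centered on node 0 (illustrative only).
per_node, per_degree = average_neighbor_degree(4, [(0, 1), (0, 2), (0, 3)])
```

In the star, the high-degree center is linked only to leaves and vice versa, so the per-degree average falls as s grows, a disassortative pattern.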

As elucidated above, global network metrics provide a more comprehensive understanding of the behavior of Minimum Spanning Trees (MSTs) in stock prices across the three time-frames. The degree of assortativity measures the extent of homophily within the network. A high coefficient implies that connected nodes tend to share similar attribute values. Specifically, the assortativity degree during the stable, volatile, and follow-up periods stands at 0.31, 0.44, and 0.32, respectively. This observation underscores that the volatile period exhibits a higher degree of assortativity compared to the other periods. Additionally, it is worth noting that the communicability patterns among companies differ significantly during the volatile period compared to the preceding and succeeding periods. For instance, during the volatile period, the communicability score between Ping An Insurance (SSE: 601318) and China Petroleum & Chemical Corporation (SSE: 600038) is 31, whereas it registers at 11 and 12 during the stable and follow-up periods, respectively.

Fig. 12 Node attack plot of the percentage removal of nodes and corresponding maximally connected components

4 Discussion

4.1 Analysis of network resilience based on Markov Chains

To enhance the comprehension of symbolized data introduced in the Methodology section, we assume the existence of a time-homogeneous Markov chain denoted as \(\lbrace s_{t,i},t\ge 1\rbrace \). This Markov chain comprises states from the set \(\lbrace 1,2,3\rbrace \). The transition probabilities governing the transitions within this chain are represented as \(P({s_{t + 1,i}} = k|{s_{t,i}} = l) = P({s_{1,i}} = k|{s_{0,i}} = l) = p_{lk,i}\), where \(1 \le l,k \le 3\) for each individual stock indexed by i.

In the context of any ergodic Markov chain, as the number of time steps N approaches infinity, the value of \(P_{lk,i}^{N}\) converges to a certain limit which remains independent of the initial state l for each company i. This limit can be denoted as follows:

$$\begin{aligned} \mathop {\lim }\limits _{N \rightarrow \infty } P_{lk,i}^{N} = {\pi ^{(i)} _{k}} > 0. \end{aligned}$$
(13)

Here, the quantities \(0 \le {\pi ^{(i)}_{k}} \le 1\), linked to every company i, satisfy the subsequent set of steady-state equations

$$\begin{aligned} \pi ^{(i)}_{k} = \sum \limits _{l = 1}^3 \pi ^{(i)}_{l}\, p_{lk,i},\quad \text {for }k = 1,2,3, \quad \text {and} \quad \pi ^{(i)}_{1}+ \pi ^{(i)}_{2}+\pi ^{(i)}_{3}= 1. \end{aligned}$$
(14)

Thus, the set \(\lbrace \pi ^{(i)}_{k}, 1 \le k \le 3 \rbrace \) emerges as the unique stationary distribution. To illustrate, consider the transition matrix associated with Ping An Insurance (SSE: 601318)

$$\begin{aligned} P_{(601318)} = \begin{pmatrix} 0.43 & 0.12 & 0.22 \\ 0.15 & 0.32 & 0.08 \\ 0.2 & 0.02 & 0.25 \end{pmatrix}. \end{aligned}$$

By the ergodic nature of the Markov chain, a stationary distribution exists for SSE: 601318, taking the form

$$\begin{aligned} \pi ^{(601318)} = (0.3,0.24,0.35). \end{aligned}$$

Moreover, the anticipated recurrence times can be evaluated via the equation \(\mu ^{(i)}_{kk}=1/\pi ^{(i)}_{k}\) for \(k=1,2,3\). Consequently, the expected recurrence times for SSE: 601318 can be expressed as

$$\begin{aligned} \mu ^{(601318)}_{kk} = (3.3,4,2.8). \end{aligned}$$
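The steady-state equations (14) and the recurrence-time relation \(\mu ^{(i)}_{kk}=1/\pi ^{(i)}_{k}\) can be solved numerically as a small linear system. The sketch below uses a hypothetical three-state transition matrix whose rows sum to one; it is not the matrix of any stock analyzed here, and the least-squares formulation is just one convenient way to impose the normalization constraint.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi = pi P together with sum(pi) = 1 for a row-stochastic matrix P."""
    P = np.asarray(P, dtype=float)
    m = P.shape[0]
    # Stack (P^T - I) pi = 0 with the normalization row sum(pi) = 1.
    lhs = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    rhs = np.zeros(m + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

# Hypothetical ergodic 3-state chain (illustrative values only).
P_toy = np.array([[0.50, 0.50, 0.00],
                  [0.25, 0.50, 0.25],
                  [0.00, 0.50, 0.50]])
pi_toy = stationary_distribution(P_toy)

# Expected recurrence time of each state: mu_kk = 1 / pi_k.
recurrence_times = 1.0 / pi_toy
```

For this toy chain the middle state is visited half of the time in the long run, so it recurs on average every two steps, while the outer states recur every four.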

4.2 Maximal connected component (MCC) network resilience analysis

To evaluate the topological resilience of stock correlation networks, common techniques involve the use of node attack and edge attack strategies. A network is deemed robust against such attacks if its fundamental attributes, such as connectivity, remain relatively stable after the attack. In this research, we opt for a random node removal method to assess the effects of node attack on the characteristics of our network.

A financial network, denoted as N, is considered connected when there exists a path connecting any company to any other company within the network. In cases where the network is not connected, it can be broken down into multiple connected sub-networks, which are referred to as the connected components of N. A connected sub-network \(N'\) is a maximal connected component (MCC) of N if every connected sub-network \(N''\) of N that contains \(N'\) is equal to \(N'\). The size of the MCC within a stock correlation network provides valuable insights into the overall network connectivity. To evaluate network stability, we examine the fluctuations in MCC size due to random node eliminations throughout the stable, volatile, and follow-up periods.

Figure 12 illustrates the relationship between the percentage of randomly removed nodes and the corresponding sizes of the MCC in the stock correlation network across three periods. In the stable period, the MCC initially accounts for 31% of the network when 10% of nodes are removed, gradually decreasing as more nodes are removed. In the volatile period, the MCC size further diminishes, starting at 17% when 10% of nodes are removed and reaching a low point of 8% with 70% node removal. Conversely, during the follow-up period, the MCC size begins to recover, commencing at 28% with 10% node removal and gradually increasing with higher removal percentages. These findings indicate that the stock correlation network exhibits greater resilience to random node removals in stable and follow-up periods compared to the volatile period. The MCC size serves as a measure of network connectivity, with smaller sizes indicating more disruption caused by node removals.

During the stable period, the network experiences a gradual reduction in the MCC size with increasing node removal percentages, suggesting a relatively stable network with minimal impact from random removals. In contrast, the volatile period witnesses a more significant disruption in network connectivity due to node removals, indicating vulnerability during turbulent times. However, during the follow-up period, the network displays signs of recovery as the MCC size increases with higher removal percentages.
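The random node attack described above can be sketched as follows: a fraction of nodes is removed at random and the size of the largest remaining connected component is reported. Expressing that size as a share of the original node count, the rounding of the removal count, and the fixed seed are all assumptions made for this illustration.

```python
import random

def largest_component_fraction(n, edges, removed):
    """Size of the largest connected component after removal, divided by n."""
    removed = set(removed)
    adj = {v: [] for v in range(n) if v not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen, largest = set(), 0
    for start in adj:
        if start in seen:
            continue
        # Depth-first search over one connected component.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        largest = max(largest, size)
    return largest / n

def random_node_attack(n, edges, fraction, seed=0):
    """Remove round(fraction * n) randomly chosen nodes and measure the MCC."""
    rng = random.Random(seed)
    removed = rng.sample(range(n), int(round(fraction * n)))
    return largest_component_fraction(n, edges, removed)

# Hypothetical 5-node path 0 - 1 - 2 - 3 - 4 (illustrative only).
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
after_cut = largest_component_fraction(5, path_edges, removed=[2])
```

Averaging `random_node_attack` over many seeds for each removal fraction would give a curve comparable in spirit to a node attack plot.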

5 Conclusions

In this exploration of financial networks within the Chinese stock market from 2019 to 2021, our central aim was to rigorously analyze network attributes and resilience across various hierarchical levels: node, cluster, and global. Our research sought to unveil the ever-evolving nature of these financial networks, emphasizing their adaptability and vulnerabilities in the face of market dynamics.

The findings are intended to offer objective insights for the financial sector. This research highlights the flexible and resilient characteristics of these financial networks, revealing their capabilities and limitations in adapting to the changing market environment. In this study, we divided the time-period into three segments: stable, volatile, and follow-up time-periods. We then applied a symbolization method to convert stock return data and daily transaction volume into an applicable format. Next, we employed the PMID method to measure the distance between nodes and establish edges. This approach is essential, because financial markets exhibit non-linear behavior, necessitating non-linear methods to extract network characteristics. After constructing networks for each interval using the Minimum Spanning Tree (MST) method and filtering unnecessary information, we analyzed the time intervals at three levels, namely node, cluster, and global, with specific indicators for each level. Notably, all constructed networks conformed to a power-law distribution.

This research study yielded several key findings that illuminate the intricate dynamics of financial networks. At the node level, we determined the significance of individual nodes using metrics such as degree centrality and node strength. These metrics uncovered the high dynamism of networks, with some companies rising in importance, while others declined over time. Crucially, networks demonstrated adaptability in stable market conditions, but also vulnerability during the volatile period. Cluster-level analysis indicated that during times of market volatility, firms tend to assume more critical roles as connectors, becoming actively engaged in forming connections with other entities. This heightened engagement is reflected in increased centrality, which suggests that these firms play a central and influential role in the network during turbulent market conditions. On the global level, this analysis demonstrates that companies are more inclined to establish partnerships with counterparts possessing similar degrees of centrality during periods characterized by market volatility, in contrast to times of stability or follow-up periods. In simpler terms, when the market is experiencing volatility, firms tend to collaborate more with other companies that occupy similarly influential positions within the network.

To assess the resilience of the constructed networks, we applied a Markov chain analysis, which is a mathematical tool used to understand how systems evolve over time. Additionally, we focused on examining the maximal connected component (MCC) of the constructed networks. This component represents the largest group of interconnected entities within the network. The findings show that the network was more vulnerable during the second observed (volatile) period. This means that during this specific time-frame, the financial network exhibited a higher degree of instability, possibly characterized by a greater number of disruptions, disconnections, or fluctuations in the relationships between entities. In contrast, the network displayed greater resilience during the follow-up period. This resilience suggests that the financial markets were on a path to recovery during this time.

As financial systems continue to evolve, a comprehensive understanding of network behavior and resilience remains indispensable. The results underscore the importance of obtaining a comprehensive understanding of financial networks to guide risk management strategies. By utilizing these insights, financial institutions can strengthen their capacity to navigate financial uncertainties proficiently. Furthermore, our study indicates potential for further investigation into the interconnectedness of various financial networks.