1 Introduction

Financial markets all over the world witnessed sharp volatility during the ongoing COVID-19 pandemic. This is something the financial world had not seen earlier. Recessions in the last 75 years can usually be categorized as one of the three factors—financial bubbles, oil shocks, or policy mistakes. According to Deloitte U.S.A., the rapid economic deterioration of economies and stock markets amid the COVID-19 threat represents a new category: a global societal shock (Ranjen, 2020). We saw a sharp decline in the stock market from 19th February 2020 through 23rd March 2020. This crash was, however, recovered significantly quicker than expected, creating an almost ‘V’ shaped recovery.

The economic impact of pandemics has been studied earlier, such as how HIV/AIDS impacted the economy discussed by Haacker (2004), while Santaeulalia-Llopis (2008) focused on the impact of the HIV/AIDS pandemic on development. Yach et al. (2006) discussed the costs of the global growth of obesity and diabetes. All of them conclude that these pandemics bring economic disruption, loss of employment, and loss of foreign direct investment which resulted in a global recession in the case of the COVID-19 pandemic (Goodell, 2020) also. A black swan event that is not related to the pandemic but resulted in similar behavior in the financial market was the financial crisis of 2008–2009, as discussed by Grout and Zalewska (2016). Berkmen et al. (2012) studied how an economic crisis affects different countries. The study concluded that the countries having more leveraged domestic financial systems, stronger credit growth, and short-term debt suffer the most economy-wise. The recovery of the market has also been tried using mathematical tests (Yanglin et al. 2020).

The current pandemic crisis has forced many financial researchers to study its effects in a short time (Nicola et al. 2020; Zhang et al., 2020; Zaremba et al., 2020; Ali et al., 2020). Most of these studies use a statistical or mathematical modeling approach that sometimes is difficult to comprehend at first hand. We propose a novel method based on fractals, popularised by Benoit Mandelbrot (1982) in the last quarter of the last century that is becoming increasingly popular in all branches of sciences. With a little use of traditional statistical methods, we aim to investigate the impact of the novel coronavirus disease (COVID-19) on the financial markets worldwide using the driven iterated function system approach, which uses the concept of fractals and chaos game. Fractals have widely been used for the study of financial markets (Alves, 2019; Bianchi & Frezza, 2017; Kristoufek, 2013; Lux, 1998) but with complex methodologies. We use a simple and widely used concept of chaos game representation (CGR) of DNA sequences (Jeffrey, 1990; Almeida et al, 2001; Randhawa et al., 2020, Pratibha et al. 2020) and modify the CGR method to accommodate the financial data. The CGR method has been very successful in analyzing genome sequences but has not yet found applications in economics and finance. We, for the first time, use a modified CGR technique to analyze the financial market behavior and demonstrate its application to the index fund futures during the COVID-19 pandemic.

In the next section, the data and the methodology used in this study are described in detail, followed by the presentation and discussions of the results obtained from this study in Sect. 3. The major outcome of the study is highlighted in the conclusions Sect. 4.

2 Data and Methods

We describe a novel and simple approach to quantify the similarity/dissimilarity between two data sequences using a modified CGR. CGR is an iterative mapping technique to convert a time series of a given length into a single image based on the movement of a point controlled by the amplitudes in the time series. As the time series of different financial market indices differ in length, a direct comparison of the stock prices based solely on the CGR image is difficult. To make the CGR image length independent, we first convert it into a Percentage CGR plot (PC-plot) by plotting, at each pixel level, the percentage of the k-order frequencies in a data sequence, which, in our case, is the 1-min percentage variations in the index funds around the globe. Visually, the PC-plots and traditional CGR images are identical. Since we aim to study the effect of COVID-19 on the financial markets, it is important to compare the stock prices before the COVID-19 with those after the pandemic. The PC-plots contain the percentage density of occurrences of each subaddress in the CGR square. If we want to find the variation in CGR or PC-plots, we need to look at the difference of density variations. We propose a solution in the form of SP-plots. To quantify the similarity/dissimilarity between the two time periods, we introduce two new concepts, a subtraction percentage CGR plot (SP-plot) and the k-order proximity index (Pr). The PC-plots contain the percentage density of occurrences of each corresponding sub-address in the CGR sequences of the time series. If we want to analyze the differences in the behavior of these sequences, we need to look at the differences in density variations of each sub-square address. We propose a solution in the form of SP-plots. An SP-plot is obtained by subtracting the percentage points of respective k-orders in each sequence. The SP-plot consists of positive and negative values indicating the differences of k-orders percentage distribution between two series. The sum of the positive differences will always be equal to the sum of negative differences because the percentage loss at one sub-address will have to be compensated by the gain(s) at some other address(es). This sum is named as k-order proximity index (Pr), which represents the degree of similarity between the financial market variations during two time periods. The value of this proximity index will increase with the degree of dissimilarity between the two species. The value of this index also changes with the value of ‘k’ because the distribution of a specific k-length combination of variations (k-th order sub-addresses) in the market will change as ‘k’ changes.

2.1 CGR

A sequence X(k) can be considered as a string composed of A, B, C, and D, which represents the percentage change in the stock price every minute. We considered the following combination:

A – if the market falls more than 0.01% of the previous value (Large Fall, L.F.)

B – if the market falls less than 0.01% of the previous value (Moderate Fall, M.F.)

C – if the market gains less than 0.01% of the previous value (Moderate Gain, MG), and.

D – if the market gains more than 0.01% of the previous value (Large Gain, LG).

$$ X\left( k \right) \in \left\{ {A,B,C,D} \right\} $$
(1)

We consider a unit square U and name corners Ci (i = 1,2,3,4) as A, B, C, and D, respectively, which corresponds to the value of X(k) (Fig. 1a). The initial point P(0) is the midpoint of the square. Now the second point P(1) is the midpoint between P(0) and CX(1) and so on. In General, P(k) is plotted as the midpoint between P(k-1) and CX(k) (Jeffrey, 1990). This is called a chaos game. If the points in the series are truly random, then this game will ultimately fill the square else will produce a fractal (Fig. 2).

Fig. 1
figure 1

a First eight points of the 4-cornered chaos game (DACCBADC or 41332143). The points are labeled as they appear in the sequence. The rest of the points in the financial data sequence will also be plotted in the same way, and the resultant CGR image will be a distribution of such points without the dotted lines shown in the picture. b Example of the addresses as read in the CGR images. Each sub-square has a unique address, as shown. Levels 1, 2, 3, and 4 are shown in the diagram. Further levels have smaller squares with larger address (k-word) lengths

Fig. 2
figure 2

Chaos game representation of a random sequence. The upper-left, upper-right, lower-left, lower-right are the cases of the series length of 1000, 10,000, 100,000, and 1,000,000 points, respectively

An example for movement of points in CGR is shown with the first eight members of the data sequence DACCBADC in Fig. 1a. In terms of price changes, the sequence DACCBADC means LG (> 0.01%), LF (< − 0.01%), MG (between 0 and 0.01%), MG (between 0 and 0.01%), MF(between 0 and − 0.01%), LF(< − 0.01%), LG(> 0.01%), and MF(between 0 and − 0.01%).

After plotting the financial data sequence X in unit square U, the unit square is divided into 2 N × 2 N sub squares; each sub-square represents a unique sub-sequence of length k (k-order) which are called the address of the sub-squares (Fig. 1b). An example of addresses of the sub-squares for different orders followed in the sequence is given in Fig. 1b. The corresponding address in which a particular point falls is noted. Many points may end up in the same address sub-square, which determines the density of the points at a given address.

2.2 PC-Plots

To make these plots, the percentage of points plotted in a sub-square is calculated. This percentage value represents the intensity of points in each sub-square. After plotting points by CGR and dividing the unit square into 2 k x 2 k sub squares, each sub-square is color-filled based on the calculated intensity values. Figure 2 shows the PC plots for a random time series (Y) of different lengths.

2.3 SP-Plots and k-Order Proximity Index (Pr)

Subtraction plot between series 1 (s1) and series 2 (s2) is plotted as

$$ S_{s1 - s2} = Y_{s1} - Y_{s2} $$
(2)

where \(Y_{s1}\) and \(Y_{s2}\) are PC-plots of series 1 and series 2 respectively. From the subtraction plot S, the sum of all the positive numbers (also the sum of modulus of negative numbers) is a measure of similarity or dissimilarity (Pr) between two genetic sequences.

$$ Pr = \mathop \sum \limits_{j = 1}^{{2^{k} }} \mathop \sum \limits_{i = 1}^{{2^{k} }} Y_{ji} , {\text{where }}Y_{ji} \ge 0 $$
(3)

Or,

$$ Pr = \mathop \sum \limits_{j = 1}^{{2^{k} }} \mathop \sum \limits_{i = 1}^{{2^{k} }} \left| {Y_{ji} } \right| , {\text{where }}Y_{ji} < 0 $$
(4)

The proximity index, \(P_{r}\), is a cumulative measure of dissimilarity between the PC-plots of two time periods.

2.4 Data Set

We used per minute price variations of four futures indices, namely, DAX, CAC 40 (MX), NASDAQ (N.Q.), and DOW JONES (Y.M.), and two index funds, HANG SENG (H.S.) and NIFTY 50. Per-minute variations for 10 months (August 2019 to May 2020) were downloaded from BacktestMarket.com.

To compare the prices of the futures/indices before and during the pandemic, we created five bi-monthly time series (J, K, L, M, and N) for each of the six datasets. J represents the per-minute price variations during August 2019–September 2019, K represents October 2019–November 2019, L represents December 2019–January 2020 (the bullish period just before the pandemic), M represents February 2020–March 2020 (the pandemic was officially declared by WHO, and the crash occurred), and N represents April 2020–May 2020 (a dramatic ‘V’ shaped recovery through the ongoing pandemic). The datasets J, K, and L represent a relatively stable and dataset M and N as the highly volatile stock market.

Using the modified chaos game representation, we compare the stock markets using their PC-plots, SP-plots, and proximity indices (Pr). We also studied the most frequent and the least frequent addresses in each of the data sets, indicating the most repeated patterns and the forbidden patterns in the time series respectively.

3 Results and Discussions

The modified chaos game representation, along with the proximity index, was programmed in MATLAB and applied to the data series described in Sect. 2.4. PC-plots, SP-plots, and the proximity indices Pr are obtained for the five bi-monthly periods J, K, L, M, and N as defined above. Some of the representative plots are shown here (Fig. 3).

Fig. 3
figure 3

HangSeng index variations (upper panel) and PC-plots (lower panel) before and during the pandemic. Note strong A-D diagonals in two rightmost PC-plots indicating the relative volatility of the market during the COVID-19

3.1 PC-Plots and SP-Plots of Financial Markets

Figure 3 shows the 1-min variations in the HangSeng index (upper panel) along with its PC-plots (lower panel) over five bi-monthly periods. The first three PC-plots are for the period before the COVID-19, and the last two plots are for the aftermath period. The vertices of these squares in these plots are defined in Fig. 1a. The NW–SE diagonal (between vertex B and C) in each plot represents moderate percentage variations in the index, and a NE-SW diagonal (between A and D) represents volatility of the respective market as the market is moving between very low(A) to very high (D) prices, or vice versa, in succession.

The PC-plot for period J (August–September, 2019) shows relatively high volatility in the HangSeng index as compared to periods K (October–November, 2019) and L(December 2019–January 2020). In January 2020, The SARS-COV-2 was reported in Wuhan, China and almost all of the international markets crashed in period M (February–March, 2020). The markets have since then recovered remarkably during the period N (April–May 2020) though we still do not know if it is a pseudo recovery. Therefore, for comparison purposes, we choose period L as the benchmark, and all the other bi-monthly periods are compared relative to period L. The subtraction-percentage plots (SP-plots) were then obtained in comparison with the period L for all the markets. A representative collage of PC-plots and SP-plots is shown in Fig. 4 for the NIFTY 50 index from India. The upper panel in Fig. 4 is the PC-plots for five bi-monthly periods. In the lower panel, four SP-plots represent a difference of market movement during periods J, K, M, and N relative to period L. The strong A-D diagonals in all the SP-plots are an excellent indicator of the relative changes in the market in other periods in comparison with the market during period L. The almost uniform distribution of points in the PC-plot during the period L in NIFTY is an indicator of a healthy market depicting all possibilities with equal probability.

Fig. 4
figure 4

NIFTY PC-plots (upper panel) and SP-plots (lower panel) before and during the pandemic. Note strong A-D diagonals in all the SP-plots, although the blue diagonals are much more prominent after the pandemic

3.2 Analysis of 1-Minute Futures

Four international market 1-min futures, namely, DAX, CAC 40 (MX), NASDAQ (NQ), and DOW JONES (YM), were also studied using this method. We computed the PC-plots and SP-plots for all these markets and plotted them in 2- as well as 3-dimensions. In Fig. 5, we show the 3-d PC-plots representing the percentage density at each corresponding address in the plot—the market futures (1-min variations) before the COVID-19 in period L show moderate variations as expected. An exciting feature during this period is that DAX and MX do show some volatility during this time also, as indicated through their high amplitude A-D diagonals, albeit for different reasons than COVID-19. However, the US markets, NASDAQ and Dow Jones remain moderate as indicated by the high amplitude B-C diagonals. The turbulent times start in mid-February, perfectly depicted in the PC-plots of period M. All plots during this period show a strong high-fall high-gain (A-D) diagonals with minor pilferage towards moderate variations (vertices B and C) in the market. The recovery phase (period N) shows high A-D diagonals also but a bit less in magnitude as compared to the period M. A dramatic recovery is seen in the MX market, where the recovery is dominated by an outstanding A-D diagonal, and all other minor market variations are overshadowed.

Fig. 5
figure 5

Percentage density in PC-plots for 1-min market futures. Four markets, Germany (DAX), France (MX), and United States (NQ and YM), are shown here for a stable period (L) and two chaotic periods, M (bearish) and N (bullish). The vertical axes in the plots represent the percentage density points of each address (order)

We consider period L, just before the pandemic as the reference period. We computed the proximity indices of L-J, L-K, L-M., and L-N for each of the six markets (Fig. 6).

Fig. 6
figure 6

4-order proximity index (Pr) from SP-plots for periods L–J, L–K, L–M, and L–N for all markets. The numbers on the horizontal axis represent a particular market, namely 2 for DAX, 3 for HS, 4 for MX, 5 for NIFTY, 6 for NASDAQ, and 7 for Dow Jones (YM)

3.3 Proximity Index and Most Occurring Combinations

Pr is a very good indicator of the level of market variations. Figure 6 demonstrates the high volatility of various international markets after the COVID-19 pandemic declaration. It also quantifies the magnitude of the relative variation of any market in comparison with a calm period. In Fig. 6, we observe that the French market had the highest volatility, both during the bearish period M and the bullish period N. It was much more volatile during the recovery phase (L4–N4 in Fig. 6). The period K (October–November 2019) has been almost similar to the period L in all international markets as evident from their low proximity indices. Except for the French market MX, the bearish and the bullish phases of other markets were of similar magnitude as their Pr values are similar.

Although the values of Pr are a good indicator of the relative variations of the market, it does not distinguish between a bull and a bear. To overcome this limitation, we looked at the five most occurring addresses in a particular PC-plot (Table 1). These addresses indicate the trend of the market during a particular period for which the PC-plot is drawn.

Table 1 List of five most occurring 4-order addresses in different markets before and after (bearish and bullish period) the COVID-19 pandemic

A combination ‘CBBC’ indicates a low rise, low fall, low fall, and low rise in successive order; whereas a ‘DDAD’ means a high rise, high rise, high fall, and high rise in the market. As we see in Table 1, period M in all the markets sees a more occurrence of A’s (bears) as compared to D’s, whereas during period N the situation reverses. We observe more D’s (bulls) than A’s.

On the other hand, the least frequent addresses give us an indication of those sub-sequences in the time series that are least probable.

Thus from the modified chaos game representations of the international markets, we see the effect of the global societal shock during the pandemic (Renjen, 2020). We also observe an equally strong recovery (sometimes even stronger) in the market which can be quantified in terms of the proximity index.

4 Conclusions

This study presents a novel way of using a fractal method, earlier used in genomic analysis, to analyze and quantitatively compare the financial markets across the globe. It is just another way of looking at and analyzing the financial data series with which a lot of new information can be visualized. We quantified the calm, bearish, and bullish effects of the pandemic on international markets. The SP-plots and the proximity indices can be used, in general, to compare two stocks or markets. The PC-plots and most frequently occurring addresses in a market may be utilized for detailed analysis and predicting the most probable occurrence of a pattern in the market.