1 Introduction

Accurate prediction of rainfall is one of the most challenging tasks for the hydrologists and meteorologists, due to the system complexity. Researchers have been studying many climatic drivers that have an impact on the behavior of rainfall pattern in India over the past century (Blanford 1884; Walker 1923). El-Niño southern oscillation (ENSO), quasi-biennial oscillation (QBO), Pacific decadal oscillation (PDO), Atlantic multi-decadal oscillation (AMO), Indian Ocean Dipole (IOD), Equatorial Indian Ocean oscillation (EQUINOO), etc. were recognized to be some of the well-known large-scale climate oscillations operating at timescales varying from inter-annual to inter-decadal scales, influencing the rainfall pattern in India (Kripalani and Kulkarni 1997a, b; Saji et al. 1999; Krishnan and Sugi 2003; Gadgil et al. 2004; Kumar et al. 2006; Goswami et al. 2006). Madden–Julian oscillation (MJO) is an intra-seasonal climatic oscillation with typical periodicity of 30–60 days, discovered in tropical regions in 1971 by Roland Madden and Paul Julian (Madden and Julian 1971, 1972, 1994).

The inextricable relationship between the Indian Summer Monsoon Rainfall (ISMR) and MJO was observed by researchers in the early 1970s (Murakami 1976). Since its recognition, many studies have been conducted on MJO, its underlying mechanisms, evolution, propagation characteristics and interannual variability (Yasunari 1979, 1980, 1981; Singh et al. 1992; Li et al. 2014; Chen and Wang 2018a,b; Wang et al. 2018). The effects of MJO on various atmospheric variables like cloudiness, zonal wind, latent flux, etc. were also discovered (Krishnamurti et al. 1988). Zhang (2013) studied the influence of MJO on precipitation, cyclones, flood, lightning, monsoons and some of the large-scale climatic oscillations. Chakraborty and Krishnamurti (2003) examined the relationship of MJO and ENSO with ISMR and concluded that, during the monsoon period in India, high MJO signals and low ENSO signals result in above-average rainfall, whereas low MJO signals and high ENSO signals result in below-average rainfall. Saith and Slingo (2006) reported that MJO had a dominant effect on the occurrence of deficit rainfall in India in 2002. As MJO modulates the behavior of climatic oscillations, one cannot ignore its impact on the behavior of the complex Indian rainfall system. Seetharam (2008) correlated the monthly MJO with ISMR of the 1979–2000 period at 29 meteorological subdivisions. The study observed spatial diversity in the influence of MJO on rainfall in different subdivisions and linked this behavior with the 10 different MJO phases defined based on longitudes. Some researchers provided scientific evidence of the role of the intensity of different MJO phases in the rainfall pattern of India and attributed the rainfall anomalies during the various MJO phases to moisture convergence anomalies (Pai et al. 2011; Mishra et al. 2017). Many of the past studies also investigated the influence of MJO on the onset and retreat, seasonal and diurnal characteristics, regional rainfall extremes, etc. (Bhatla et al. 2017; Anandh et al. 2018; Singh and Bhatla 2018, 2020; Anandh and Vissa 2020). However, the earlier studies have paid little attention to extracting the MJO-rainfall teleconnections at multiple timescales, and such an analysis may help to capture new evidence that contributes beneficially to rainfall predictions. Therefore, an improved understanding of MJO may help improve the accuracy of rainfall predictions in the Indian subcontinent.

Wavelet analysis has been used for investigating the climatic teleconnections of rainfall at multiple timescales in different parts of the globe (Ashok and Saji 2007; Narasimha et al. 2010; Gaughan et al. 2016; Araghi et al. 2017). Even though wavelet transforms can satisfactorily handle complex nonlinear and non-stationary signals, choosing the appropriate wavelet function and the level of decomposition are tedious tasks. Addressing this challenge, Huang et al. (1998) developed the Hilbert–Huang transform (HHT) method by proposing a data-adaptive multiscale decomposition method, namely empirical mode decomposition (EMD), and subsequently integrating it with the Hilbert transform (HT). In EMD, each signal can be decomposed into a set of zero-mean components called intrinsic mode functions (IMFs) and a final residue, each with a definite periodicity. Iyengar and Raghu Kanth (2005) investigated the relationships of ENSO, QBO, tidal forcing and sunspot cycles with ISMR using EMD, by finding the overall correlation between modes with similar periodicities. But such a comparison can deliver only limited information on the teleconnections between two signals, as the strength of the association may vary over both the timescale and the time domain when the processes are multiscale and the signals are non-stationary. A running correlation analysis that accounts for the non-stationary and multiscale behavior of the time series can be a feasible solution to this issue. Chen et al. (2010) proposed an HHT-based multiscale running correlation procedure, namely time-dependent intrinsic correlation (TDIC), to explore the relationship between two non-stationary time series. The TDIC method can capture the multiscale association through EMD and a subsequent running correlation operation, and it can address the issue of non-stationarity by choosing the most appropriate size for the sliding window. The technique was implemented successfully for teleconnection studies including hydro-climatic teleconnections (Huang and Schmitt 2014; Ismail et al. 2015; Adarsh and Janga Reddy 2016, 2018; Johny et al. 2019, 2020b).

From the review of literature, it is evident that none of the past studies investigated the influence of MJO on monthly rainfall over India at multiple timescales in a time–frequency space, even though capturing such scale-specific information may help in improved rainfall predictions. In developing statistical or data-driven models for rainfall predictions, identification of significant inputs is one of the crucial steps. It is well understood that the rainfall of the present month need not be influenced by the climatic oscillations of the same month; it can instead reflect the combined effect of the oscillations of some of the past months (Maity and Nagesh Kumar 2008). The knowledge of such lagged influence of each climatic oscillation that modulates the variability in rainfall is expected to improve predictions. In decomposition-based hybrid models for predictions, the time series is decomposed into a set of modes (components), each mode is predicted separately using linear or nonlinear regression methods and, finally, the predicted components are aggregated. Therefore, retaining only the most relevant modes and the most significant lagged inputs for their predictions may considerably reduce the computational complexity of rainfall predictions using statistical or data-driven methods. In the past teleconnection studies of rainfall employing the TDIC method, the lagged effects of climatic oscillations are not accounted for. Time-dependent intrinsic cross-correlation (TDICC), proposed by Chen et al. (2010), is an extension of TDIC to capture such lagged influences, but its potential to capture hydro-climatic teleconnections has not yet been explored. Addressing the above research gaps, the current work attempts (i) to examine the teleconnection of MJO with the monthly rainfall pattern in India using the TDIC method; (ii) to investigate the influence of lagged values of predictor variables on monthly rainfall over India at multiple timescales, using the proposed EEMD-TDICC-coupled framework.

2 Methods and materials

EMD, proposed by Huang et al. (1998), decomposes a signal into a number of orthogonal zero-mean components called IMFs and a final residue. Each mode generated by the decomposition is associated with a specific periodicity: the lower-order modes are high-frequency modes (with shorter periodicity) and the higher-order modes are low-frequency modes (with longer periodicity). The decomposition process is purely data adaptive and, unlike the popular discrete wavelet transform, the number of modes generated need not be specified a priori and the periodicities of successive modes need not be in dyadic powers (powers of 2). The EMD operation comprises (i) identification of the peaks and troughs of the signal; (ii) fitting of envelope curves through the peaks and troughs using cubic splines and determination of their mean; (iii) subtraction of the mean from the signal. These steps are performed iteratively (called sifting) until a zero-mean signal, called an IMF, is obtained. By subtracting the first IMF from the signal, a new signal is obtained and the sifting can be continued. The process is continued until a monotonic function is obtained, which is the final residue. More details of the algorithm can be found in the literature (Huang et al. 1998; Huang and Wu 2008). The original variant of EMD has serious shortcomings, as multiple frequency components may be associated with the same mode, or similar frequency components may be present in more than one mode (so-called mode-mixing). The mathematical transformations of such modes may result in negative frequencies, which have no physical meaning and may lead to wrong conclusions when applied to real field datasets. To circumvent this issue, a number of improvements of EMD were proposed in the past and such algorithms were found to be suitable for practical applications in hydrology and meteorology (Adarsh and Reddy 2021).

2.1 Ensemble empirical mode decomposition (EEMD)

Wu and Huang (2009) proposed a multiscale noise-assisted variant of EMD called ensemble empirical mode decomposition (EEMD) that is capable of alleviating the mode-mixing problem on generating IMFs. The steps for executing EEMD are: (i) generate artificial signals from the given signal by adding white noise signal; (ii) extract the IMFs by employing EMD of each artificial signal; (iii) obtain the desired IMF by the method of ensemble averaging. More details of the algorithm and details on selection of its control parameters can be found in the literature (Wu and Huang 2005; Huang and Wu 2008).
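The three EEMD steps listed above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names (`sift_once`, `emd`, `eemd`), the end-point treatment of the envelopes, the fixed number of IMFs and the interpretation of the noise standard deviation as a fraction of the signal standard deviation are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def sift_once(h):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    t = np.arange(h.size)
    peaks = argrelextrema(h, np.greater)[0]
    troughs = argrelextrema(h, np.less)[0]
    if peaks.size < 2 or troughs.size < 2:
        return None  # too few extrema: treat h as the final trend
    # Assumption: pin both envelopes to the end points of the signal so the
    # splines are never extrapolated (a simple boundary treatment).
    up_x = np.r_[0, peaks, h.size - 1]
    lo_x = np.r_[0, troughs, h.size - 1]
    upper = CubicSpline(up_x, h[up_x])(t)
    lower = CubicSpline(lo_x, h[lo_x])(t)
    return h - 0.5 * (upper + lower)


def emd(x, n_imfs, n_sift=10):
    """Plain EMD with a fixed number of sifting passes per IMF.

    Returns an (n_imfs, N) array (zero rows if fewer modes exist) and the residue.
    """
    residue = np.asarray(x, dtype=float).copy()
    imfs = np.zeros((n_imfs, residue.size))
    for i in range(n_imfs):
        h = residue
        for _ in range(n_sift):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        if h is residue:  # no sifting was possible, stop decomposing
            break
        imfs[i] = h
        residue = residue - h
    return imfs, residue


def eemd(x, n_imfs=8, noise_std=0.02, n_ensemble=100, seed=0):
    """EEMD: (i) add white noise, (ii) run EMD on each copy, (iii) average."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    scale = noise_std * x.std()  # assumption: noise level relative to signal std
    imf_sum = np.zeros((n_imfs, x.size))
    res_sum = np.zeros(x.size)
    for _ in range(n_ensemble):
        noisy = x + rng.normal(0.0, scale, x.size)
        imfs, res = emd(noisy, n_imfs)
        imf_sum += imfs
        res_sum += res
    return imf_sum / n_ensemble, res_sum / n_ensemble
```

Fixing the number of IMFs keeps the modes of different noise realizations aligned for averaging; other implementations handle this alignment differently.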

2.2 Time-dependent intrinsic cross-correlation (TDICC)

An adaptive correlation analysis can be performed on two given signals using the HHT-based data-adaptive TDICC technique (Chen et al. 2010). This method considers two time series x1(t) and x2(t) and their decomposition using EMD or its variants. The IMFs obtained are subjected to HT to get instantaneous frequencies (hence instantaneous periods) in a time–frequency space. TDICC accounts for time lags while employing running correlation between the IMFs of the signals at different timescales. In this method, the size of the sliding window at each instant is fixed as the maximum of the instantaneous periods of the IMFs, computed from HT, which ensures stationarity within the sliding window. The moving window analysis is performed iteratively until the end of the signal is reached. The complete procedure of the algorithm is presented as a flowchart in Fig. 1. In the flowchart, x1(t) and x2(t) are the two time series; c1i(t) and c2i(t) are the IMFs of the signals, where t represents time, whose value can change from 1 to the length of the time series (N); \({t}_{d}\) represents the minimum sliding window size; \({t}_{w}^{n}\) represents the size of the sliding window; T1i and T2i are the instantaneous periods; \({t}_{k}\) represents any instant of time, in which k can vary from 1 to N; τ is the lag used to determine the lead–lag correlation; n is any positive number and is normally selected as 1 (Huang and Schmitt 2014).
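The quantities defined above can be restated compactly as follows, in the form commonly written after Chen et al. (2010). The lagged expression in the last line, including the sign convention chosen for τ, is an assumption of this restatement and should be checked against the original formulation; setting τ = 0 gives the TDIC case.

```latex
\[
t_d = \max\bigl\{T_{1i}(t_k),\; T_{2i}(t_k)\bigr\}, \qquad
t_w^n = \Bigl[\,t_k - \tfrac{n\,t_d}{2},\; t_k + \tfrac{n\,t_d}{2}\,\Bigr]
\]
\[
R_i\bigl(t_k^n, \tau \mid t_w^n\bigr)
  = \mathrm{Corr}\bigl(c_{1i}(t_w^n),\; c_{2i}(t_w^n + \tau)\bigr)
\]
```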

Fig. 1 Flowchart of TDICC method

In this procedure, the cross-correlations are computed for a large number of combinations of timescale and time instants along the time domain. As a result, a TDICC matrix will be obtained, which will be in a triangular shape with time in the x-axis and the size of the moving window in the y-axis, when represented graphically. The instantaneous cross-correlations can be identified from the color bar representation. The correlation coefficient between the modes with complete data length is equal to the correlation coefficient at the apex point of the triangle (Chen et al. 2010).
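A minimal Python sketch of the running lagged correlation for a single pair of IMFs is given below. It assumes that the instantaneous periods (in samples) have already been obtained, for example from the Hilbert transform. The function name `tdicc`, its arguments and the lag convention (a positive `lag` means the second series is taken `lag` steps earlier than the first) are illustrative choices, not the authors' code.

```python
import numpy as np


def tdicc(c1, c2, period1, period2, lag=0, n=1):
    """Running lagged correlation between two IMFs of equal length.

    c1, c2           : IMF series (e.g. rainfall mode and MJO mode)
    period1, period2 : instantaneous periods at each time step, in samples
    lag              : c1 at time t is compared with c2 at time t - lag
    n                : window multiplier (n = 1 gives the minimum window)
    Returns an array of correlations; NaN where the window does not fit.
    """
    c1 = np.asarray(c1, dtype=float)
    c2 = np.asarray(c2, dtype=float)
    size = c1.size
    out = np.full(size, np.nan)
    for k in range(size):
        # Minimum window size: the larger of the two instantaneous periods.
        t_d = max(period1[k], period2[k])
        half = int(round(n * t_d / 2))
        lo, hi = k - half, k + half + 1
        # Skip instants whose (lagged) window falls outside the record.
        if half < 1 or lo < 0 or hi > size or lo - lag < 0 or hi - lag > size:
            continue
        out[k] = np.corrcoef(c1[lo:hi], c2[lo - lag:hi - lag])[0, 1]
    return out
```

Repeating the call for increasing values of `n` gives one row of correlations per window size; because larger windows fit at fewer time instants, stacking these rows produces the triangular matrix described above.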

2.3 Proposed methodology

A realistic implementation that executes a running correlation between rainfall and MJO at monthly scale using TDICC analysis is followed in this study. Initially, a general correlation analysis is performed for each pair of the IMFs of MJO indices and monthly rainfall over India to investigate the multiscale hydro-climatic teleconnection. Subsequently, the TDIC analysis is performed to identify the prominent modes required to develop the rainfall prediction models. Finally, the TDICC analysis is applied to the prominent modes, to identify significant predictors (lagged values) to predict rainfall at each timescale (i.e., to predict each IMF component). It is worth mentioning that the final aggregation of the predicted IMF components and residue will provide the information on rainfall at time step t.

The steps to be followed in the approach are:

  1. Generate the IMFs of the monthly time series of rainfall and MJO index using the EEMD method.

  2. Identify the correlation coefficient between the components of rainfall and the components of the MJO indices by employing Pearson correlation analysis to infer the association between different pairs.

  3. Identify the significant IMFs by using TDIC analysis performed on IMFs of similar periodicities.

  4. Perform TDICC between the IMFs selected in step 3.

  5. For each significant IMF obtained in step 4, select the TDICC plot indicating a significant correlation in the long term and use the corresponding lags for the rainfall prediction of the corresponding timescale.
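Step 2 of the list above can be sketched in a few lines of Python. The function name `imf_cross_correlation` and the array layout (one IMF per row) are assumptions made for illustration, and the IMFs are taken as already computed in step 1.

```python
import numpy as np


def imf_cross_correlation(rain_imfs, mjo_imfs):
    """Pearson correlation between every rainfall IMF and every MJO IMF.

    Both inputs are 2-D arrays with one IMF per row and equal record length.
    Entry [i, j] of the result is the correlation between rainfall IMF i
    and MJO IMF j over the complete record.
    """
    rain = np.asarray(rain_imfs, dtype=float)
    mjo = np.asarray(mjo_imfs, dtype=float)
    n_rain = rain.shape[0]
    # np.corrcoef treats each row as one variable; keep the cross block only.
    full = np.corrcoef(np.vstack([rain, mjo]))
    return full[:n_rain, n_rain:]
```

Each value in this matrix is an overall correlation over the whole record, so it cannot show how the association changes over time; that is the role of the TDIC and TDICC analyses in steps 3 to 5.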

3 Study area and dataset

Indian Institute of Tropical Meteorology (IITM) Pune established a widespread network of rain gauge stations to measure the rainfall over India, in the 1990s. Considering rainfall homogeneity, IITM Pune demarcated 29 meteorological subdivisions in India and based on the data of 306 rain gauge stations, Parthasarathy et al. (1994) published monthly area weighted rainfall data of India. An updated version of this database available in Kothawale and Rajeevan (2017) is used in the present study. For this study, All-India (considering Indian main land as a single unit) monthly rainfall data for 39 years (1978–2016) are retrieved from the website of IITM Pune (http://www.tropmet.res.in). In order to examine the teleconnection of the MJO with monthly rainfall of All-India spatial scale, the MJO indices for ten time-lagged longitudes, namely, index-1, index-2, index-3, index-4, index-5, index-6, index-7, index-8, index-9 and index-10 at longitudes 80° E, 100° E, 120° E, 140° E, 160° E, 120° W, 40° W, 10° W, 20° E and 70° E, respectively, for the period 1978–2016 were obtained from Climate Prediction Center (CPC), NOAA datasets (https://www.cpc.ncep.noaa.gov). The hydro-climatic teleconnection studies can give proper insight only at larger spatiotemporal scales and it is advisable to perform such analysis at larger spatiotemporal scales (Kashid and Maity 2012). Many of the past studies considered monthly to seasonal scale aggregation of daily time series of MJO indices followed by averaging operation (Seetharam 2008; Li et al. 2018; Klotzbach et al. 2019; Dasgupta et al. 2020; Soria 2021). Accordingly, the monthly mean MJO indices derived from daily data aggregation for the period 1978–2016 were used for the teleconnection study.

4 Results and discussion

The multiscale decomposition of the rainfall or climatic oscillations will decipher the physical processes behind them. The modes obtained by decomposition will have specific periodic scales. Firstly, EEMD is applied to all the ten indices, by setting the number of iterations, noise standard deviation and ensemble number as 10, 0.02 and 100, respectively, following the recommendations in the past studies (Beltrán-Castro et al. 2003; Huang and Wu 2008; Wu and Huang 2009). We confirmed the evolution of distinctly separable and good quality modes without any mode-mixing for this combination of parameters, through a number of numerical experiments performed upon similar datasets, considering different control parameter sets (Johny et al. 2019, 2020). EEMD performed on index-7 and index-10 resulted in nine IMFs and a residue. For all the other indices and the monthly rainfall data, EEMD resulted in eight IMFs and a residue, as presented in Fig. 2. The mean periods of the modes obtained by decomposition of the different signals are presented in Table 1. It can be seen that the mean periods of the different signals, which vary in non-dyadic steps from bi-monthly to inter-decadal scales, indicate multiscaling behavior. Due to the non-dyadic powers, the number of modes may not be the same for all the signals (for indices 7 and 10 it is nine IMFs), which also depends on the data complexity. IMF3 represents annual periodicity in all cases (its mean period varies from 11.32 to 12.81 months across the different signals).

Fig. 2 Orthogonal modes of different climatic indices: a MJO index-1, b MJO index-2, c MJO index-3, d MJO index-4, e MJO index-5, f MJO index-6, g MJO index-7, h MJO index-8, i MJO index-9 and j MJO index-10

Table 1 Mean period T (in months) explained by different modes of climatic indices

To investigate the link between MJO and rainfall, first the cross-correlations between the modes of monthly rainfall and those of the MJO indices are computed (Supplementary file Table S1). From the cross-correlation analysis, it is clear that the correlation of MJO with rainfall differs across IMFs and across indices. MJO indices 1, 2 and 10 show a positive correlation and MJO indices 4 to 7 show a negative correlation, for all the IMFs. The strongest correlations (> 0.9) are observed for MJO index-1. For indices 3, 8 and 9, the correlations for most of the IMFs are primarily weak and the MJO-rainfall link retains strong correlations only in very few IMFs. In general, low-frequency modes sustain a more stable relationship than high-frequency modes for the rainfall-climate oscillation teleconnections (Adarsh and Janga Reddy 2016). The strong (or weak) correlation between MJO and the rainfall need not be global (signal as a whole) in nature, but a local (part of the signal) one, which may vary with the time spells, i.e., we intend to demonstrate that the estimation of overall correlation will not be sufficient to capture the scale-dependent association between MJO and rainfall. At some process scale or time spell, the correlation will be positive, while at some other scale or spell it will be negative; these cancel each other and eventually lead to a very small overall correlation between the two series. Such information will be misleading, and because of this we need to follow a running correlation approach in multiscale teleconnection studies. To capture such evolution of the pattern of correlation, the TDIC method is helpful, and the TDIC analysis is executed between the corresponding components of rainfall and MJO index to identify the relevant set of IMFs for rainfall prediction. Finally, in order to identify prominent lags influencing the rainfall at different process scales, the TDICC analysis is performed. The observations obtained from the multiscale correlation analysis are provided below and the results of the TDICC analysis showing the most relevant lags for each relevant IMF are summarized in Table 2.

Table 2 The relevant IMFs and significant lags obtained for TDICC analysis of monthly rainfall (1978–2016) and orthogonal components of MJO indices at different longitudes

4.1 MJO index-1

The TDIC analysis of MJO index-1 and monthly rainfall of India, as presented in Fig. 3, shows strong long-range correlations for IMF1 and IMF5. In IMF1, the association is primarily positive, while in IMF5 it is negative. For the remaining IMFs, one can notice multiple transitions in the nature of correlations from positive to negative (and vice versa) over the time domain. In short, a more consistent pattern of association between MJO index-1 and rainfall is noticed in IMF1 and IMF5. Therefore, these two IMFs are chosen to perform the TDICC analysis, in order to understand the lag-effect of MJO on rainfall (Fig. 4). Even though IMF1 in the TDIC plot (Fig. 3) shows a fairly strong long-range positive correlation, i.e., a strong and positive correlation prevails at all the timescales, the lagged correlations of IMF1 with rainfall are found to be practically weak in this case. The correlations are weak and insignificant in different time spells and over the time domain in all the lags, except for lag 7 (Fig. 4). The pattern of correlations is stable for lag 7, which is sufficient to be considered as input for modeling of IMF1, i.e., for modeling the rainfall process in the high-frequency space. In short, IMF1(t) can be predicted by considering IMF1(t-7) as input. IMF5 follows a strong long-range negative correlation in the TDIC analysis (Fig. 3). From the TDICC analysis of IMF5 for different lags (Fig. 4), it is noted that there exists a long-range negative correlation for lags 1 to 4, with a more stable pattern for lag 2, followed by lag 1. The transition from negative to positive correlations is evident in lags 5–10, with a higher percentage of void spaces (insignificant correlations). This indicates a more unstable and inconsistent role of the MJO index in the rainfall pattern at this process scale with these lags. A stable (unchanging) correlation pattern with time spells and over the time domain is brought back in lag 11. Hence, lags 1, 2 and 11 may be considered as potential predictors for IMF5.

Fig. 3 TDIC plots of different MJO indices: a MJO index-1, b MJO index-2, c MJO index-6, d MJO index-7 and e MJO index-8

Fig. 4 TDICC analysis between MJO index-1 and rainfall: a IMF1, b IMF2, c IMF3, d IMF4, e IMF5 and f IMF6

4.2 MJO index-2

The TDIC plot of MJO index-2 depicts a strong long-range negative correlation in all the six IMFs (Fig. 3), which emphasizes the relevance of all IMFs of MJO index-2 for the prediction of monthly rainfall. To better understand the influence of the different IMFs of MJO index-2, TDICC is performed (Fig. 5). The long-range positive correlation (between 0.25 and 0.5) is noticed only at lag 1 and therefore IMF1(t-1) is sufficient for the prediction of IMF1 of MJO index-2. Similarly, a positive long-range correlation is noted for lags 2 and 3, and these lags are sufficient for prediction of monthly rainfall at the second process scale. In the case of IMF3, the first two lags maintain a long-range negative correlation while lags 5–8 maintain a long-range positive correlation, with differences in the magnitudes of correlations. For IMF4, the first two lags bear a negative correlation while lags 9 to 12 bear positive associations with the respective component of rainfall. For prediction of IMF5, all the lags up to 10 maintain a strong negative correlation, while the first four lags are sufficient for prediction of IMF6. The MJO index-2 and rainfall relation was positive at some of the lags and negative at some other lags in different high-frequency IMFs, which are associated with transitions in the nature/strength of their associations over the time domain. But for low-frequency modes, the MJO index-2 and rainfall relation was found to be practically unchanging.

Fig. 5 TDICC analysis between MJO index-2 and rainfall: a IMF1, b IMF2, c IMF3, d IMF4, e IMF5 and f IMF6

4.3 MJO index-3

TDIC analysis of MJO index-3 shows a strong negative correlation in all IMFs revealing that all IMFs are relevant for monthly rainfall prediction. TDICC analysis performed on different IMFs of MJO index-3 showed that similar conclusions and interpretations as that of MJO index-2 can be drawn in this case also.

4.4 MJO index-4

As for the previous two indices, all the six IMFs are found to be relevant, with a strong negative association, for index-4. The overall pattern of correlations and subsequent interpretations for all the IMFs are found to be similar to those of the previous two cases, except for slight differences for IMF2 (Supplementary file Fig. S1). For IMF2, there exists a strong long-range negative association for lag 1, in addition to the stated positive associations of lags 2 and 3 of the above two indices. On examining the correlations in depth, it is noted that the strength of correlations is very high for IMF6 when compared with that of IMF5 (Supplementary file Figs. S2 and S3). This behavior is contrary to that of the two low-frequency modes of indices 2 and 3, where a higher magnitude of correlation was noted for IMF5 than for IMF6. Further, it was noted that the correlation pattern was very stable and unchanging with different lags for IMF6. But for IMF5, the strength of correlations was found to diminish with lag number.

4.5 MJO index-5

The TDIC plot of MJO index-5 also indicated a strong negative association of its modes with those of rainfall. The lag-dependent behavior of the different IMFs was found to be quite similar to that of MJO index-4.

4.6 MJO index-6

Figure 3 depicts that IMF1 and IMF6 bear a strong long-range significant negative correlation pattern. A dominance of positive correlation is noticed for IMF5 and a transition in correlation from negative to positive (and vice versa) over the time domain is noted for IMFs 4 and 5. Therefore, TDICC analysis is done on IMFs 1 and 6 of MJO index-6 to capture more information on its lag-dependent associations (Fig. 6). Significant correlation was noted only at lag 1 for IMF1, while, on the other hand, strong negative correlation was noted at all the lags for IMF6.

Fig. 6 TDICC analysis of MJO index-6 and rainfall: a IMF1, b IMF2, c IMF3, d IMF4, e IMF5 and f IMF6

4.7 MJO index-7

TDIC plot of MJO index-7 (Fig. 3) exposes IMF4, IMF5 and IMF6 as significant components with hardly any switchover in the nature of correlations and insignificant correlations. TDICC on significant IMFs of MJO index-7 explains that IMF4 retains a behavioral change in the time spell between 2003 and 2013 for all the lags (Fig. 7). On examining the pattern, it is evident that lags 10–12, along with lag 1, are considered to be the potential predictors of IMF4. TDICC analysis of IMF5 shows the first seven lags are relevant for the rainfall predictions at this process scale. For IMF6, the correlation patterns are found to be associated with high percentage of insignificant correlations with all the lags except lag 1, i.e., IMF6(t) can be predicted by considering IMF6(t-1) alone.

Fig. 7 TDICC analysis of MJO index-7 and rainfall: a IMF1, b IMF2, c IMF3, d IMF4, e IMF5 and f IMF6

4.8 MJO index-8

MJO index-8 depicts a very strong positive correlation (> 0.75) in all the IMFs, as shown in Fig. 3. Hence, all IMFs can have an influence on the monthly rainfall over India. TDICC analysis of IMF1 clearly shows that lag 1 is the only input for the prediction of the first high-frequency mode (Fig. 8). For IMF2, we have observed that there exists a negative long-range correlation for lags 2 and 3, which is contrary to the behavior of IMF2 of MJO index-3. Such a reversal in the pattern of correlations is noted for all the IMFs except the third one. For IMF3, a correlation pattern similar to that of index-3 is noticed. Despite the opposing nature of correlations, the patterns of correlation are strikingly similar to those of MJO index-3. Hence, the predictor selection followed for index-3 can be followed for index-8.

Fig. 8 TDICC analysis of MJO index-8 and rainfall: a IMF1, b IMF2, c IMF3, d IMF4, e IMF5 and f IMF6

4.9 MJO index-9

TDIC analysis revealed the prominence of MJO index-9 as the most positively correlated (correlations > 0.9) index among the different indices. TDIC analysis showed that all the seven IMFs play a role in the prediction of rainfall over India. The correlation patterns of all the IMFs are similar to those of index-4, but with a positive nature of associations. The TDICC analysis revealed that the same lags considered for index-4 can also be used for the prediction of different IMFs of MJO index-9. Unlike index-4, one more low-frequency mode (IMF7) is found to be significant for index-9 and the TDICC analysis showed that all the lags can be considered as potential inputs for its prediction.

4.10 MJO index-10

TDIC analysis of MJO index-10 showed that all IMFs up to six are relevant except IMF2. MJO index-10 behaves similar to index-5 for all relevant lags except for IMF4. The strong positive correlation noticed in the first three lags turns to strong negative from lag 9 and significant relations are obtained for lags 11 and 12 (Supplementary file Fig. S4).

From the TDIC analysis, it is noticed that the rainfall-MJO correlations are unchanging and strongly negative for all the IMFs for indices 2–5, while it is strongly positive for the IMFs of indices 8–10. The nature of correlations is unchanging and strongly negative for the low-frequency IMFs (5 and 6) of indices 2–5, irrespective of the lags. A similar positive behavior is noted for the indices of 8–10, irrespective of the lags. Again, it was noted that even though the correlation at a particular scale is negative (or positive), the nature of lagged correlation need not be the same.

The multiscale decomposition of the rainfall or climatic oscillations will decode the physical processes behind the phenomenon. The modes with different periodic scales will decipher the physical mechanisms behind the occurrence of the phenomenon. Processing such data in a time–frequency space can give a better insight into the processes, and wavelet transform or HHT can help in such tasks. It is worth mentioning that even though all such mechanisms have a modulating effect, each needs to be contributing to or amplifying the magnitudes over a specific period of time. Moreover, the inter-relationships with other climatic drivers and local processes or meteorological drivers may influence the phenomenon diversely in different years. This highly dynamic behavior makes the rainfall predictions highly complex. Moreover, when we develop a linear or nonlinear regression model for prediction of an IMF at generic time t, only the selected lagged values identified by TDICC need to be considered as predictor variables, as the coefficients (weights) of the other lagged values may be very small or insignificant and can be excluded in the modeling stage (Adarsh and Janga Reddy 2019). Eliminating the less contributing factors and inputs can significantly enhance the accuracy of predictions and reduce the computational complexity (Hu and Si 2013; Adarsh and Janga Reddy 2018). Thus, this study proposed a novel framework to perform the predictor selection to improve the prediction skills by introducing the application of TDICC in the field of hydro-climatology for the first time. The proposed approach selected 42 IMFs from the total of 80 IMFs obtained after decomposition and it identified 251 lags from 588 lagged values as significant, indicating the reduction in computational complexity. The method suggests using the significant IMFs and the most relevant lags for further predictions of monthly rainfall, and emphasizes avoiding unnecessary IMFs and lags to save several avoidable computations involved in data-driven models. More experiments are needed to demonstrate the potential applicability of the proposed method for regional rainfall prediction over India and a couple of such studies are in the pipeline.
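The idea of modeling one IMF from only its selected lags can be illustrated with a minimal linear-regression sketch in Python. The function names (`lagged_design`, `fit_linear`) and the use of ordinary least squares are assumptions for illustration; no prediction model is fitted in this study.

```python
import numpy as np


def lagged_design(predictor, target, lags):
    """Build the regression inputs for the selected lags.

    Column j of X holds predictor(t - lags[j]); y holds target(t).
    Only time steps for which every lagged value exists are kept.
    """
    predictor = np.asarray(predictor, dtype=float)
    target = np.asarray(target, dtype=float)
    max_lag = max(lags)
    y = target[max_lag:]
    X = np.column_stack(
        [predictor[max_lag - lag:predictor.size - lag] for lag in lags]
    )
    return X, y


def fit_linear(predictor, target, lags):
    """Least-squares fit; returns [intercept, weight of each lag in order]."""
    X, y = lagged_design(predictor, target, lags)
    A = np.column_stack([np.ones(y.size), X])  # add the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

For example, for IMF5 of MJO index-1 the call would be `fit_linear(mjo_imf5, rain_imf5, lags=(1, 2, 11))`, where `mjo_imf5` and `rain_imf5` are hypothetical arrays holding the corresponding modes. The model then has three weights instead of twelve.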

5 Conclusions

This study investigated the multiscale teleconnection between MJO and monthly rainfall at All-India spatial scale using a novel EEMD-TDICC-coupled framework. Firstly, the monthly rainfall time series and MJO index series are decomposed using EEMD and this study explored ten such cases considering different longitudinal MJO indices. In each case, HHT-based TDIC analysis is employed to identify the most significant IMFs useful for rainfall prediction. Subsequently, TDICC analysis is invoked upon selected IMFs to identify the significant lags to be considered for rainfall prediction of a generic time step at a specific timescale. Specific conclusions of the study are:

  (1) MJO indices 1, 6 and 7 are susceptible to more transitions in correlation from positive to negative (and vice versa) along the time domain, while the remaining indices displayed more stable correlation patterns in different process scales.

  (2) MJO index-2 (100° E), MJO index-3 (120° E), MJO index-4 (140° E) and MJO index-5 (160° E) are strongly negatively associated, while MJO index-8 (10° W), index-9 (20° E) and index-10 (70° E) are strongly positively associated with the monthly rainfall. This implies that more rainy days can be expected in the longitudes corresponding to MJO indices 8, 9 and 10 than in those of the other MJO indices.

  (3) For the prediction of IMF3 (annual periodicity), lags 1–2 and 5–8 are significant for all the MJO index types.

  (4) For the prediction of the low-frequency IMFs (mostly the 5th IMF onward) of all the indices, all the 12 lags are found to be significant, as implied by the unchanging and stable lagged correlation pattern.

  (5) The unchanging nature of correlations at a specific timescale in the MJO-rainfall relationships does not imply invariant behavior in the link at the lagged timescales.