1 Introduction

Drought is defined as a shortfall of precipitation relative to normal conditions; it may develop slowly, and its impact may persist for an extended period (Cancelliere et al. 2007). Drought is a natural disaster that can affect soil moisture, crop growth and living creatures in any region, and can therefore influence the demand for water supply for agriculture and other applications (Mishra et al. 2009; Sene 2010; Ahmad and Hashim 2010). In monetary terms, drought causes an average of $6–$8 billion in losses globally each year. Over the century from 1900 to 2004, more than ten million people died and close to two billion people were affected by drought (Wilhite 2000; Below et al. 2007). Several indices have been used for monitoring drought and evaluating water supply deficits (Cancelliere et al. 2007; Aghrab et al. 2008; Mishra et al. 2009; Daneshvar et al. 2013). One of these indices is the Standardized Precipitation Index (SPI), which is used in this study.

SPI is an index that may be used to quantify drought conditions. Its applications range from describing droughts to comparing them across different periods and across regions with different climate conditions (Edossa et al. 2010; Fiorillo and Guadagno 2010; Xie et al. 2013). Many researchers have reported on the superiority of SPI over other indices (McKee et al. 1995; Turkes and Tath 2009; Durdu 2010; Khalili et al. 2011; Angelidis et al. 2012). It has the statistical property of consistency, its calculation is simple, and it is able to describe both short-term and long-term drought impacts through various time scales of precipitation anomalies.

Because a drought event is a random phenomenon, drought severity based on SPI can be analyzed using stochastic methods, which can then be used to identify drought characteristics. Lohani and Loganathan (1997) and Lohani et al. (1998) analyzed the stochastic behavior of drought in Virginia, USA, using the non-homogeneous Markov chain method, while Paulo et al. (2005) and Paulo and Pereira (2007; 2008) applied both homogeneous and non-homogeneous Markov chain models to investigate the stochastic characteristics of SPI drought class transitions. They found that stochastic models are useful in monitoring the evolution of droughts and are able to produce early warning clues regarding drought conditions for any particular region. In addition, according to Bonaccorso et al. (2003), the estimation of return periods is necessary in order to improve the planning and management of water resources.

In Malaysia, drought conditions have occurred repeatedly and have become more frequent in recent years. They may be triggered by environmental events such as the El Nino events of 1997–1998 and 2014, when longer drought durations were experienced, especially in the urban areas of the southwestern region. Drought in Malaysia has often forced the authorities to ration the water supply to residential and business areas for several months, disrupting daily activities. The increased frequency of drought occurrences in recent years may indicate possible changes in the global rainfall pattern (Ahmad and Low 2003; Ahmad and Hashim 2010). In this paper, SPI is used to analyze the drought condition in Peninsular Malaysia. SPI has been used for this purpose in several past studies, such as that of Ahmad and Hashim (2010), who used data from Negeri Sembilan. Hanawi et al. (2011) and Zin et al. (2013) applied SPI to Peninsular Malaysia data and reported that the western region of the peninsula experienced relatively drier conditions than other parts. In another study, Zin et al. (2013) found that, based on a 100-year return period, the longest drought duration is expected to occur in the middle part of the peninsula. They also found that altitude influences the percentage occurrence of dry conditions: for SPI at time scales of one and three, the areas around the Main Range experience fewer dry conditions than other areas.

Stochastic analysis was used by Deni et al. (2009) to fit Markov chain models of various orders to daily rainfall occurrence in Peninsular Malaysia. They found that, regardless of the monsoon season, the first-order Markov chain model is optimal for the northwestern and eastern regions of the peninsula at a rainfall threshold of 10 mm.

The objective of this paper is to investigate the drought characteristics of Peninsular Malaysia by using the first order homogeneous Markov chain approach. The drought characteristics considered are (a) the steady-state probability of drought, that is the probabilities of occurrence in various drought classes, (b) the mean residence time in each category of drought event that represents the average time the process stays in a particular drought category before moving to another category, (c) the mean recurrence time of drought event, that is the average time taken by the process to return to the same drought category, and (d) the average time taken by the process to reach for the first time the non-drought category from any drought category.

2 Methodology

2.1 The Standardized Precipitation Index (SPI)

In this study, the one-month SPI is calculated for rainfall data obtained from 39 rain-gauge stations in Peninsular Malaysia. This is considered a short time scale, which is able to reflect the seasonality of the data and is appropriate for identifying the drought impact on agriculture (Labedzki 2007; Moreira et al. 2008; Sene 2010). Details on SPI computation can be found in Blain (2012) and Du et al. (2013) as well as the references therein. Once the SPI values have been computed for every station, they are classified into four drought categories, as suggested by Moreira et al. (2006), namely “non-drought” (SPI ≥ 0), “near-normal” (−1 < SPI < 0), “moderate” (−1.5 < SPI ≤ −1), and “severe” (SPI ≤ −1.5). This study focuses only on the moderate and severe drought conditions, that is, SPI values of −1 or less.
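The classification thresholds above can be illustrated with a minimal Python sketch. The function name `classify_spi` and the sample SPI values are hypothetical and serve only to show how the class boundaries are applied; they are not data from this study.

```python
def classify_spi(spi):
    """Map one SPI value to the four drought categories listed above."""
    if spi >= 0:
        return "non-drought"
    if spi > -1:
        return "near-normal"
    if spi > -1.5:
        return "moderate"      # -1.5 < SPI <= -1
    return "severe"            # SPI <= -1.5


# Hypothetical one-month SPI values, for illustration only.
spi_series = [0.4, -0.3, -1.0, -1.2, -1.5, -2.1, 0.0]
classes = [classify_spi(v) for v in spi_series]
print(classes)
```

Note that the boundary values fall where the inequalities place them: an SPI of exactly −1 is classed as moderate and an SPI of exactly −1.5 as severe.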

2.2 Markov Chains

A Markov chain \( \{X_t,\ t = 0, 1, 2, \cdots\} \) is a stochastic process with the property that the process value at time \(t+1\), \(X_{t+1}\), depends only on its value at time \(t\), \(X_t\), such that for every \(t\) and all states \(i_0, \dots, i_{t+1}\)

$$ Prob\left({X}_{t+1}={i}_{t+1}\left|{X}_t={i}_t,\cdots, {X}_0={i}_0\right.\right)= Prob\left({X}_{t+1}={i}_{t+1}\left|{X}_t={i}_t\right.\right) $$
(1)

Let \( Prob(X_{t+1} = j \mid X_t = i) = p_{ij} \) be the transition probability from state \(i\) at time \(t\) to state \(j\) at time \(t+1\); then \(p_{ij}\) can be represented in the transition probability matrix form, \(P\), as follows

$$ P=\left[{p}_{ij}\right]=\left[\begin{array}{ccc}\hfill {p}_{11}\hfill & \hfill \cdots \hfill & \hfill {p}_{1s}\hfill \\ {}\hfill \vdots \hfill & \hfill \ddots \hfill & \hfill \vdots \hfill \\ {}\hfill {p}_{s1}\hfill & \hfill \cdots \hfill & \hfill {p}_{ss}\hfill \end{array}\right],\kern2em i,j=1,\dots, s, $$

where \(0 \le p_{ij} \le 1\), \(\sum_{j=1}^{s} p_{ij} = 1\), \(i = 1, \dots, s\), and \(s\) is the number of states.

Estimation of the transition probability is important in Markov chain modeling. Several methods have been proposed to estimate the transition probability, namely the maximum likelihood method and the empirical Bayes method (Meshkani and Billard 1992; Lohani et al. 1998; Bickenbach and Bode 2003; Paulo et al. 2005; Paulo and Pereira 2007; 2008; Mishra et al. 2009; Nalbantis and Tsakiris 2009). We employed the maximum likelihood method to estimate the transition probability due to its simplicity. The maximum likelihood estimator for p ij can be obtained as

$$ {\widehat{p}}_{ij}=\frac{f_{ij}}{F_{i.}}. $$
(2)

where \(f_{ij}\) is the number of observed transitions from state \(i\) to state \(j\), \(F_{i.} = \sum_{j=1}^{s} f_{ij}\), \(i, j = 1, \dots, s\), and \(s\) is the number of drought categories.
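As a minimal sketch of Eq. (2), the transition counts and the maximum likelihood estimates can be computed from a sequence of drought classes as follows. The function names, the 0-based state labels and the sample sequence are assumptions made for illustration, not part of the original analysis.

```python
def transition_counts(states, s):
    """Count transitions f[i][j] between consecutive states (0-based labels)."""
    f = [[0] * s for _ in range(s)]
    for a, b in zip(states, states[1:]):
        f[a][b] += 1
    return f


def mle_transition_matrix(f):
    """Eq. (2): p_ij = f_ij / F_i.; a row with no observed transitions stays all-zero."""
    p = []
    for row in f:
        total = sum(row)
        p.append([c / total if total else 0.0 for c in row])
    return p


# Hypothetical monthly class sequence:
# 0 = non-drought, 1 = near-normal, 2 = moderate, 3 = severe.
seq = [0, 0, 1, 2, 2, 3, 1, 0, 1, 1, 2, 0]
f = transition_counts(seq, 4)
p_hat = mle_transition_matrix(f)
for row in p_hat:
    print([round(x, 3) for x in row])
```

Each row of the estimated matrix sums to one whenever the corresponding state has at least one observed outgoing transition.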

2.2.1 The First Order and Time-Homogeneous Markov Chain Tests

A Markov chain \(\{X_t\}\) is said to be an \(r\)th order Markov chain (\(r = 1, 2, \dots\)) if for every \(t\) and for all states \(i_0, \dots, i_{t+1}\),

$$ \begin{array}{c}\hfill Prob\left({X}_{t+1}={i}_{t+1}\left|{X}_t={i}_t,\cdots, {X}_{t-r+1}={i}_{t-r+1},{X}_{t-r}\right.={i}_{t-r},\cdots, {X}_0={i}_0\right)\hfill \\ {}\hfill = Prob\left({X}_{t+1}={i}_{t+1}\left|{X}_t={i}_t,\cdots, {X}_{t-r+1}={i}_{t-r+1}\right.\right)\hfill \end{array} $$
(3)

The Markov chain property can be tested using the Likelihood Ratio (LR) test (Bickenbach and Bode 2003; Tan and Yilmaz 2002). The test is used to verify whether a Markov chain model is of first or second order. The test statistic is

$$ {\mathrm{LR}}_{\mathrm{order}}=2{\displaystyle \sum_{h,i=1}^s{\displaystyle \sum_{j\in {\mathrm{A}}_{hi}}{f}_{hij}\left[ \ln \left({\widehat{p}}_{hij}\right)- \ln \left({\widehat{p}}_{ij}\right)\right]},} $$
(4)

where \(f_{hij}\) denotes the number of times the drought category moved to state \(j\) at time \(t+1\), given that it was in state \(h\) at time \(t-1\) and in state \(i\) at time \(t\), \( {\widehat{p}}_{hij}=\frac{f_{hij}}{\sum_{j=1}^s f_{hij}} \), \( {\widehat{p}}_{ij}=\frac{f_{ij}}{\sum_{j=1}^s f_{ij}} \) and \( A_{hi}=\left\{j:{\widehat{p}}_{hij}>0\right\} \). The statistic follows a chi-square distribution with \(\sum_{i=1}^s (a_i-1)(b_i-1)\) degrees of freedom (df), in which \(a_i\) is the number of elements of \(A_i=\left\{j:{\widehat{p}}_{ij}>0\right\}\), \(b_i\) is the number of elements of \(B_i=\left\{h:f_{hi}>0\right\}\), and \(f_{hi}=\sum_{j=1}^s f_{hij}\).
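The statistic in Eq. (4) and its degrees of freedom can be sketched in Python as below. This is an illustrative sketch under stated assumptions: states are labelled 0 to s − 1, the first-order counts are pooled from the same triples as the second-order counts (\(f_{ij} = \sum_h f_{hij}\)), \(a_i\) and \(b_i\) are read as the sizes of the sets \(A_i\) and \(B_i\), and the p-value (which needs a chi-square distribution function) is not computed. The function name and sample sequence are hypothetical.

```python
import math


def lr_order_test(states, s):
    """Return (LR statistic, degrees of freedom) for first- vs second-order dependence."""
    # f3[h][i][j]: moves to j at t+1, given h at t-1 and i at t.
    f3 = [[[0] * s for _ in range(s)] for _ in range(s)]
    for h, i, j in zip(states, states[1:], states[2:]):
        f3[h][i][j] += 1
    # Pooled first-order counts f_ij = sum_h f_hij (an assumption of this sketch).
    f2 = [[sum(f3[h][i][j] for h in range(s)) for j in range(s)] for i in range(s)]

    lr = 0.0
    for h in range(s):
        for i in range(s):
            f_hi = sum(f3[h][i])
            for j in range(s):
                if f3[h][i][j] > 0:  # j belongs to A_hi
                    p_hij = f3[h][i][j] / f_hi
                    p_ij = f2[i][j] / sum(f2[i])
                    lr += 2.0 * f3[h][i][j] * (math.log(p_hij) - math.log(p_ij))

    df = 0
    for i in range(s):
        a_i = sum(1 for j in range(s) if f2[i][j] > 0)
        b_i = sum(1 for h in range(s) if sum(f3[h][i]) > 0)
        if a_i and b_i:  # skip states never observed at time t
            df += (a_i - 1) * (b_i - 1)
    return lr, df


# Hypothetical two-state sequence, for illustration only.
seq = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
lr, df = lr_order_test(seq, 2)
print(lr, df)
```

The first-order assumption is retained when the statistic is small relative to the chi-square critical value for the computed degrees of freedom.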

The Markov chain \(\{X_t\}\) is said to be time-homogeneous, or stationary, if all states \(i\) and \(j\) satisfy

$$ Prob\left({X}_{t+v+1}=j\left|{X}_{t+v}=i\right.\right)= Prob\left({X}_{t+1}=j\left|{X}_t=i\right.\right),v=0,1,\dots, $$
(5)

The LR test can also be used to test the homogeneity property of a Markov chain (Bickenbach and Bode 2003; Tan and Yilmaz 2002). The time-homogeneity test is employed to verify whether the transition probabilities of the first-order Markov chain could be assumed constant over time. In order to test this assumption, the full sample period is divided into M sub periods (m = 1,…,12) and then the transition probabilities estimated from each of M subsamples are compared to the transition probabilities estimated from the full period. The test statistic is

$$ {\mathrm{LR}}_{\mathrm{homogeneity}}=2{\displaystyle \sum_{m=1}^M{\displaystyle \sum_{i=1}^s{\displaystyle \sum_{j\in {C}_{i;m}}{f}_{ij;m}\left[ \ln \left({\widehat{p}}_{ij;m}\right)- \ln \left({\widehat{p}}_{ij}\right)\right]}}} $$
(6)

where \(f_{ij;m}\) is the number of times the drought class changes from state \(i\) at time \(t\) to state \(j\) at time \(t+1\) within the \(m\)th sub period, \( {\widehat{p}}_{ij;m}=\frac{f_{ij;m}}{\sum_{j=1}^s f_{ij;m}} \), \( {\widehat{p}}_{ij}=\frac{f_{ij}}{\sum_{j=1}^s f_{ij}} \) and \( {C}_{i;m}=\left\{j:{\widehat{p}}_{ij;m}>0\right\} \). This statistic also follows a chi-square distribution with \(\sum_{i=1}^s (c_i-1)(d_i-1)\) degrees of freedom, in which \(c_i\) is the number of elements of \(C_i=\left\{j:{\widehat{p}}_{ij}>0\right\}\), \(d_i\) is the number of elements of \(D_i=\left\{m:f_{i;m}>0\right\}\), and \(f_{i;m}=\sum_{j=1}^s f_{ij;m}\).
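A corresponding sketch for Eq. (6) takes the transition counts of each sub period as input. As assumptions of this sketch, the pooled counts are taken as the sum of the sub-period counts, \(c_i\) and \(d_i\) are read as the sizes of the sets \(C_i\) and \(D_i\), and no p-value is computed. The function name and the count matrices are hypothetical.

```python
import math


def lr_homogeneity_test(counts):
    """counts[m][i][j] holds f_ij;m. Return (LR statistic, degrees of freedom)."""
    n_sub = len(counts)
    s = len(counts[0])
    # Full-period counts f_ij, pooled over all sub periods.
    pooled = [[sum(counts[m][i][j] for m in range(n_sub)) for j in range(s)]
              for i in range(s)]

    lr = 0.0
    for m in range(n_sub):
        for i in range(s):
            f_im = sum(counts[m][i])
            for j in range(s):
                f = counts[m][i][j]
                if f > 0:  # j belongs to C_i;m
                    p_ijm = f / f_im
                    p_ij = pooled[i][j] / sum(pooled[i])
                    lr += 2.0 * f * (math.log(p_ijm) - math.log(p_ij))

    df = 0
    for i in range(s):
        c_i = sum(1 for j in range(s) if pooled[i][j] > 0)
        d_i = sum(1 for m in range(n_sub) if sum(counts[m][i]) > 0)
        if c_i and d_i:  # skip states with no observed transitions
            df += (c_i - 1) * (d_i - 1)
    return lr, df


# Two hypothetical sub periods with identical transition proportions ...
same = [[[2, 2], [1, 3]], [[4, 4], [2, 6]]]
# ... and two with clearly different proportions out of state 0.
different = [[[9, 1], [5, 5]], [[1, 9], [5, 5]]]
lr_same, df_same = lr_homogeneity_test(same)
lr_diff, df_diff = lr_homogeneity_test(different)
print(lr_same, df_same, lr_diff, df_diff)
```

When every sub period has the same estimated transition probabilities, the statistic is zero; it grows as the sub-period estimates move away from the full-period estimates.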

2.2.2 The Steady-State Probability

Let \(\{X_t\}\) be a homogeneous Markov chain on state space \(S\), and let \(P\) be its transition probability matrix, which is independent of time. A vector \(\pi = {\left[\pi_1 \cdots \pi_s\right]}^t\) is defined as the steady-state probability of the drought categories if the elements of \(\pi\) are non-negative and satisfy these two requirements:

$$ \begin{array}{ll}a.\hfill & \begin{array}{cc}\hfill {\displaystyle {\sum}_{i=1}^s{\pi}_i{p}_{ij}={\pi}_j}\hfill & \hfill j=1, \dots, \mathrm{s}\hfill \end{array}\hfill \\ {}b.\hfill & {\displaystyle {\sum}_{j=1}^s{\pi}_j=1}\hfill \end{array} $$
(7)

In matrix form, the linear equations system of Eq. (7) can be solved as follows

$$ \pi ={\left[{\left(P-I\right)}^t+E\right]}^{-1}e, $$
(8)

where \(I\) is the identity matrix, \(E\) is a matrix with all elements equal to one, \(e\) is a column vector of ones, and \({\left(P-I\right)}^t+E\) is a non-singular matrix.
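Eq. (8) can be sketched in plain Python by solving the linear system \(\left[{\left(P-I\right)}^t+E\right]\pi = e\) directly rather than forming the inverse. The sketch reads \(E\) as the all-ones matrix and \(e\) as the all-ones vector, which is what makes Eq. (8) reproduce both conditions in Eq. (7). The helper `solve_linear`, the function name and the two-state matrix are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


def steady_state(p):
    """Eq. (8): solve [(P - I)^t + E] pi = e, with E and e filled with ones."""
    s = len(p)
    a = [[p[j][i] - (1.0 if i == j else 0.0) + 1.0 for j in range(s)]
         for i in range(s)]
    return solve_linear(a, [1.0] * s)


# Hypothetical two-state transition matrix, for illustration only.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = steady_state(P)
print(pi)
```

For this example the balance condition \(0.1\,\pi_1 = 0.5\,\pi_2\) together with \(\pi_1 + \pi_2 = 1\) gives \(\pi = (5/6, 1/6)\), which the sketch reproduces up to rounding.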

2.2.3 The Mean Residence Time

Let \(p_{jj}\) denote the probability that a Markov chain \(\{X_t\}\) in drought category \(j\) remains in that category at the next time step, and let \(R_j\) be the residence time in category \(j\), with

$$ \begin{array}{c}\hfill Prob\left({R}_j=n\right)= Prob\left({X}_{t+1}=j\left|{X}_t=j\right.\right)\dots Prob\left({X}_{t+n}\ne j\left|{X}_{t+n-1}=j\right.\right)\hfill \\ {}\hfill ={\left({p}_{jj}\right)}^{\left(n-1\right)}\left(1-{p}_{jj}\right)\hfill \end{array} $$
(9)

where \(n\) is the number of months. \(R_j\) follows a geometric distribution with parameter \(\left(1-p_{jj}\right)\). Thus, the mean residence time for any drought category \(j\) is given by

$$ \mathrm{E}\left({R}_j\left|{X}_t\right.\right)=\frac{1}{\left(1-{p}_{jj}\right)} $$
(10)
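Eq. (10) depends only on the diagonal of the transition matrix, as the following short sketch shows. The function name and the two-state matrix are hypothetical; an absorbing state (\(p_{jj} = 1\)) is reported as an infinite residence time.

```python
def mean_residence_time(p):
    """Eq. (10): expected number of consecutive months in each state, 1 / (1 - p_jj)."""
    return [1.0 / (1.0 - p[j][j]) if p[j][j] < 1.0 else float("inf")
            for j in range(len(p))]


# Hypothetical two-state transition matrix, for illustration only.
P = [[0.75, 0.25],
     [0.50, 0.50]]
mrst = mean_residence_time(P)
print(mrst)
```

A state with \(p_{jj} = 0.75\) is therefore expected to persist for four months on average, and one with \(p_{jj} = 0.5\) for two months.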

2.2.4 The Mean First Passage Time

Let \(T_{ij}\) denote the time taken for a process to move for the first time from drought category \(i\) to category \(j\). The random variable \(T_{ij}\) is known as the first passage time from \(i\) to \(j\). The mean first passage time from state \(i\) to state \(j\), \(M_{ij}\), is defined as

$$ \begin{array}{l}{M}_{ij}=\mathrm{E}\left[{T}_{ij}\right]\hfill \\ {}{M}_{ij}=1+{\displaystyle {\sum}_{\begin{array}{c}\hfill k=1\hfill \\ {}\hfill k\ne j\hfill \end{array}}^s{p}_{ik}{M}_{kj},}\ \mathrm{for}\ \mathrm{every}\ i,j=1,\dots, s,\hfill \end{array} $$
(11)

In the matrix form, Eq. (11) becomes

$$ M=E+P\left(M-{M}_d\right) $$
(12)

where \(M\) is a matrix with elements \(M_{ij}\), \(E\) is a matrix with all elements equal to one, \(P = \left[p_{ij}\right]\), and \(M_d\) is the diagonal matrix with elements \({\left(M_d\right)}_{jj} = M_{jj}\). The mean first passage time \(M_{jj}\) is called the mean recurrence time for drought category \(j\).
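One way to obtain the mean first passage times is to solve the recursion \(M_{ij} = 1 + \sum_{k \ne j} p_{ik} M_{kj}\) separately for each target state \(j\), and then use the same relation with \(i = j\) for the mean recurrence time. The sketch below takes this route; it is an illustrative approach, not necessarily the computation used in the study, and the helper `solve_linear`, the function name and the two-state matrix are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


def mean_first_passage(p):
    """Return M with M[i][j] = mean first passage time from i to j."""
    s = len(p)
    m = [[0.0] * s for _ in range(s)]
    for j in range(s):
        others = [k for k in range(s) if k != j]
        # (I - Q) x = 1, where Q is P restricted to the states other than j.
        a = [[(1.0 if r == c else 0.0) - p[r][c] for c in others] for r in others]
        x = solve_linear(a, [1.0] * len(others))
        for idx, i in enumerate(others):
            m[i][j] = x[idx]
        # Mean recurrence time M_jj from the same recursion with i = j.
        m[j][j] = 1.0 + sum(p[j][k] * m[k][j] for k in others)
    return m


# Hypothetical two-state transition matrix, for illustration only.
P = [[0.9, 0.1],
     [0.5, 0.5]]
M = mean_first_passage(P)
print(M)
```

For this matrix the off-diagonal entries are \(M_{12} = 10\) and \(M_{21} = 2\), and the diagonal entries \(M_{11} = 1.2\) and \(M_{22} = 6\) equal the reciprocals of the steady-state probabilities \(5/6\) and \(1/6\), a standard property of ergodic chains.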

3 Data

This study focuses on Peninsular Malaysia (1°–7° North and 100°–104° East), which is located in the sub-tropical humid region and covers an area of about 131,794 km². Peninsular Malaysia’s climate is influenced by two rainy seasons, namely the northeast monsoon from November to February and the southwest monsoon from May to August. The dry season in Malaysia usually occurs in the months of April–July during the southwest monsoon.

Monthly rainfall data (in mm) were obtained from 39 rainfall stations in Peninsular Malaysia for the period 1970–2008. The data were provided by the Drainage and Irrigation Department, Malaysia. The stations selected for analysis have less than 10 % missing data and the longest data records among the stations considered (Table 1). The missing data were imputed using the normal ratio and modified normal ratio estimation methods (Paulhus and Kohler 1952; Jamaluddin et al. 2008). In this analysis, spatial mapping using the ordinary Kriging method is applied over the study area. Figure 1 displays the locations of the rainfall stations used in this study.

Table 1 Name of rainfall stations in Peninsular Malaysia, geographic coordinates, the mean annual maximum (MAM) rainfall, and missing data

Fig. 1 Location of rainfall stations used in this study

Table 1 also contains the mean annual maximum rainfall values for all stations considered. The spatial distribution of this statistic is plotted in Fig. 2. Based on the information in Table 1, the values vary from 266.6 mm to 8591.97 mm across the study area. The highest value occurs at Endau (code 9) and the lowest value occurs at Sitiawan (code 39). The large differences between the maximum and minimum values suggest different patterns of rainfall behavior among the stations considered.

Fig. 2 The spatial distribution of mean annual maximum rainfall (in mm)

For the identification of the appropriate Markov chain order, Table 2 shows that, at the 5 % significance level, the first-order assumption is satisfied for all stations apart from Bertam (p-value = 0.016) and Sg. Kepasing (p-value = 0.025). In the test of homogeneity, two stations, namely Kluang Mersing (p-value = 0.045) and Pekan Tanjung Malim (p-value = 0.024), did not satisfy the assumption at the 0.05 significance level. Thus, based on the Markov chain order and homogeneity tests, these four stations are excluded from further analysis.

Table 2 The likelihood ratio (LR) test results in testing for Markov property and homogeneity

4 Results and Discussion

The steady-state probabilities of the drought classes for each station, which represent the probabilities of occurrence of the moderate (M) and severe (S) drought classes, are given in Table 3. The probabilities of the two drought classes are similar, with average values of 0.076 and 0.074 for the moderate (M) and severe (S) classes, respectively. About 54 % of the stations have a higher probability of being in the M category than in the S category. Figure 3a depicts the spatial distribution of the moderate drought probability; although the middle area has a relatively lower moderate drought probability, the same area has a higher severe drought probability than the other areas in the peninsula (Fig. 3b). Nevertheless, for both categories, these values are considered very small.

Table 3 The steady state probability and the mean residence time of droughts (MRST)

Fig. 3 a Spatial distribution of moderate drought probability. b Spatial distribution of severe drought probability

The mean residence time (MRST) values, also known as the drought duration, are also presented in Table 3. All stations have an MRST of approximately 1–1.3 months for the moderate category, while for the severe category the duration is longer (up to 1.89 months). About 77 % of the stations considered experience a longer duration for severe drought than for moderate drought. This implies that, on average, a severe drought condition is expected to last longer than a moderate drought condition in the peninsula. The spatial distribution of the MRST of moderate drought (Fig. 4a) shows that the longest moderate drought duration occurred in the north-western region, lasting about 1.15 to 1.30 months. Based on Fig. 3a, this region also has a higher moderate drought probability. Fig. 4b shows that the same area also experienced a relatively longer duration of severe drought. Nevertheless, this area has a smaller probability of severe drought than other areas. On the other hand, the southwestern part of the peninsula is expected to experience longer severe droughts with relatively high probability.

Fig. 4 a Spatial distribution of the mean residence time of moderate drought (months). b Spatial distribution of the mean residence time of severe drought (months)

Table 4 shows the mean recurrence times (MRCT) and the mean first passage times to reach the non-drought category (MFPT) for the M and S categories. The MRCT values for the M class vary from 9.43 to 17.86 months, while for the S class they range from 11.36 to 17.86 months. A location with a smaller MRCT value experiences the same drought condition more frequently. The MFPT values for the M class range from 1.73 to 2.73 months, while for the S class the range is from 1.17 to 3.24 months. These values imply that the recovery time to the non-drought condition is about 2 to 3 months for either drought condition.

Table 4 The mean recurrence times of drought categories (MRCT) and the mean first passage time to reach the non-drought category from any drought category (MFPT)

Table 4 also shows that about 43 % of the stations have a lower MRCT for the severe category than for the moderate category, which implies that at these stations a return to severe drought occurs sooner than a return to moderate drought. Nevertheless, only 29 % of the stations have a lower MFPT for the severe category than for the moderate category, which implies that in most areas of the peninsula it takes longer to return to the non-drought category from severe drought than from moderate drought. The spatial maps in Fig. 5a–b display the pattern of the MRCT for the two drought categories. Fig. 5a shows that the north-western region has a mean recurrence time of moderate drought of 12–18 months, whereas some of the stations in the middle region experienced severe drought events with a lower mean recurrence time of about 11.2–14.2 months (Fig. 5b).

Fig. 5 a Spatial distribution of the mean recurrence time of moderate drought (months). b Spatial distribution of the mean recurrence time of severe drought (months)

5 Conclusions

In this paper, the first-order homogeneous Markov chain model is used to identify the stochastic characteristics of droughts in Peninsular Malaysia through the analysis of the probability of both moderate and severe conditions, the residence and recurrence times in each drought condition, and the time taken for any drought condition to recover to the non-drought condition. We have shown that the probability and mean recurrence time of drought condition generally increase according to the degree of drought severity. Although the average probabilities of moderate and severe drought events are similar, the severe drought condition is found to be more persistent than the moderate drought condition. During the study period, it was also found that areas that experienced a moderate or severe drought condition would generally recover to the non-drought condition in about 2 to 3 months.

The spatial distribution showed that the northwestern region is more susceptible to moderate drought, which also occurred there most frequently and with a longer duration. On the other hand, the middle region experienced the most frequent severe drought condition. These results suggest that several areas in Peninsular Malaysia, particularly the northwestern and middle regions, are facing dry conditions.

In this study, several important drought characteristics, including the areas that are prone to drought in Peninsular Malaysia, have been identified by applying the first-order homogeneous Markov chain to rainfall data. As there are currently few studies on the drought situation in Peninsular Malaysia, these results could provide useful information for agriculture experts and irrigation engineers in planning measures to lessen the negative impacts of drought.