1 Introduction

With the development of industrialization and the rapid growth of population, the management of water resources is becoming more important, not only in Malaysia but throughout the world. The analysis of rainfall behavior, particularly in terms of rainfall occurrence, is beneficial for managing water consumption. Moreover, this analysis can inform experts about the spatial and temporal structure of rainfall occurrence, which supports more reliable prediction of extreme weather events.

The study of sequences of daily rainfall occurrence was started by Gabriel and Neumann (1962), who found that the daily rainfall occurrence for the Tel Aviv data was successfully fitted by the first-order Markov chain model. Kottegoda et al. (2004) likewise reported that the first-order Markov chain model fitted the observed data in Italy successfully. This model is based on the assumption that the daily rainfall occurrence depends on that of the previous day. Stern and Coe (1984) stated that the two most attractive features of Markov chain models are the ease of identifying seasonality in daily rainfall occurrence and the flexibility in building probability models for parameter estimation and model selection.

In most cases, the daily rainfall occurrence can be described by the first-order Markov chain model; however, there are cases where this model fails to fit the observed data. As an alternative, a Markov chain model of higher order often remedies these inadequacies (see Wilks 1999; Hayhoe 2000). The two criteria most commonly used to determine the optimum order of higher-order Markov chain models are Akaike’s information criterion (AIC; see Akaike 1974; Tong 1975) and the Bayesian information criterion (BIC; see Schwarz 1978; Katz 1981). The identification of the optimum order of Markov chain models for daily rainfall data has been discussed in many studies. Chin (1977), for example, fitted models for 100 rainfall stations in the United States and reported that the optimum order varied with season and geographical region: the first order was optimum for summer, while a higher order was more suitable for winter. Gates and Tong (1976) re-examined the Tel Aviv data and found that the second-order Markov chain model, rather than the first order, was optimum. Moreover, Jimoh and Webster (1996) investigated the optimum order of the daily rainfall occurrence at five stations in Nigeria. Based on the AIC and BIC, they found that the optimum order depends on the threshold level used and the length of the recorded data.

Determining the most appropriate order of the Markov chain model for describing the sequences of wet (dry) days is pertinent in explaining the rainfall behavior. According to Chin (1977), any model which is found to describe the daily rainfall occurrence successfully should also be able to represent the distribution of wet (dry) spells well. The studies on identification of the best order of the Markov chain models to describe the distribution of wet and dry spells have been carried out by a number of researchers at various locations, as in the case of Fleming (2007) and Hui et al. (2005) for Canada; Tolika and Maheras (2005) and Anagnostopoulou et al. (2003) for Greece; Harrison and Waylen (2000) for Costa Rica; Wilks (1999) for the USA; Martin-Vide and Gomez (1999) for Spain; Chapman (1997) for the western Pacific; Dahale et al. (1994) for the tropical South-East Asia and Equatorial Pacific; Moon et al. (1994) for South Korea; Singh et al. (1981) for India; Quah and Ooi (1985) for Peninsular Malaysia; Berger and Goosens (1983) for Belgium; and Basu (1971) for Calcutta.

Although much work on identifying the best order of Markov chain models has been done on rainfall data from various parts of the world, little attention has been given to Malaysian data. No study has been conducted to determine the optimum order of higher-order Markov chain models for daily rainfall in Peninsular Malaysia. Since the study of Quah and Ooi (1985), no further investigation has been done to determine the most appropriate order of Markov chain models for representing the distribution of wet (dry) spells in this country. This study addresses that gap and is structured as follows. After a brief description of the rainfall data in section Data and methods, the procedure for determining the optimum order of the Markov chain for n-arbitrary sequences is presented in section Identifying the optimum order of Markov chain models for daily rainfall occurrences. In section Persistency of wet (dry) events, the Markov models are used to explain the persistency of the wet (dry) events. This is followed by section Fitting wet (dry) spells using various order of Markov chain models, where the most appropriate order of the Markov chain models is identified using the observed distribution of wet (dry) spells. The results of the analysis and the conclusions of this study are provided in sections Results and discussion and Conclusions, respectively.

2 Data and methods

Peninsular Malaysia lies entirely in the equatorial zone, situated between latitudes 1° and 6°N and longitudes 100° and 103°E. The climate of Peninsular Malaysia is influenced by the Southwest monsoon, from May to August, and the Northeast monsoon, from November until February. The Southwest monsoon is a drier period for the whole country, while during the Northeast monsoon, the east coast and northern areas of Peninsular Malaysia receive heavier rains than the other parts of the country.

The daily rainfall data from 18 stations in Peninsular Malaysia, listed in Table 1, were selected based on the completeness of the data and the length of the records. All the data used in this study were obtained from the database of the Malaysian Meteorological Services (MMS). The length of record for these stations ranges from 21 to 35 years: 10 stations have records of 35 years, and the remaining stations have records of 21 to 32 years. The difference in record length between stations is due to the availability of the data. Following Lim (1976), the peninsula is divided into five regions in this study in order to investigate whether the persistency of wet (dry) days varies between regions of the country. Figure 1 displays a map of Peninsular Malaysia divided into the five regions, namely northwest (NW), west (W), southwest (SW), east (E) and central (C).

Fig. 1 Rainfall stations and rainfall regions in Peninsular Malaysia. NW northwest; SW southwest; E east; C central; W west

Table 1 Geographic coordinates, altitude and period of records for the selected 18 rain gauged stations in Peninsular Malaysia. NW northwest; SW southwest; E east; C central; W west

In this study, a wet day is defined as a day on which the rainfall amount equals or exceeds a particular threshold value. With a threshold of 0.1 mm, a day with a rainfall amount of less than 0.1 mm is considered a dry day. Similarly, with a threshold of 10 mm, a dry day is a day with less than 10 mm of rainfall; otherwise, the day is considered a wet day. Two different thresholds are applied in order to investigate whether the optimum order of the Markov chain model depends on the threshold used.
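The thresholding rule can be illustrated with a minimal Python sketch. The function name `classify_wet_dry` and the rainfall values are hypothetical and chosen only for illustration; they are not taken from the station records.

```python
def classify_wet_dry(rainfall_mm, threshold_mm=0.1):
    """Return 1 for a wet day (amount >= threshold) and 0 for a dry day."""
    return [1 if amount >= threshold_mm else 0 for amount in rainfall_mm]


# Hypothetical daily rainfall amounts in mm (not from the MMS database).
daily_rain = [0.0, 0.1, 12.5, 3.0, 0.0, 25.4, 10.0]

print(classify_wet_dry(daily_rain, 0.1))   # [0, 1, 1, 1, 0, 1, 1]
print(classify_wet_dry(daily_rain, 10.0))  # [0, 0, 1, 0, 0, 1, 1]
```

The same record yields a different binary sequence under each threshold, which is why the optimum order is examined separately for 0.1 mm and 10 mm.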

3 Identifying the optimum order of Markov chain models for daily rainfall occurrences

Let \(X_1, X_2, \ldots, X_t, \ldots, X_n\) denote n binary variables representing the sequence of wet and dry events in the daily rainfall occurrence for n arbitrary days, coded as 1 and 0, respectively. The sequence of wet (dry) days is assumed to follow a first-order Markov chain when the event at time t, \(X_t\), depends only on the previous event \(X_{t-1}\). Let \(P_{10}\) denote the conditional probability of a wet day following a dry day and \(P_{11}\) denote the conditional probability of a wet day following a wet day. Then these two conditional probabilities can be given by

$$P_{10} = P\left( {X_t = 1\left| {X_{t - 1} = 0} \right.} \right)\quad {\text{and}}\quad P_{11} = P\left( {X_t = 1\left| {X_{t - 1} = 1} \right.} \right)$$
(1)

For the Markov chain model of k th order, the stationary transition probabilities are given by

$$P_{i_{k + 1} , \ldots ,i_1 } = P\left( {X_t = i_{k + 1} \left| {X_{t - 1} = i_k ,\; \ldots ,X_{t - k} = i_1 } \right.} \right);\;i = 0,1$$
(2)

The joint probability distribution for \(X_1, X_2, \ldots, X_t, \ldots, X_n\) can be given by

$$P\left( {X_n = i_n , \ldots ,X_1 = i_1 } \right) = P\left( {i_n ,i_{n - 1} , \ldots ,i_1 } \right)$$
(3)

Equation (3) can also be written as

$$\begin{aligned} & P{\left( {i_{n} ,i_{{n - 1}} , \ldots ,i_{1} } \right)} = P{\left( {\left. {i_{n} } \right|i_{{n - 1}} , \ldots ,i_{{n - k}} } \right)}P{\left( {\left. {i_{{n - 1}} } \right|i_{{n - 2}} , \ldots ,i_{{n - k - 1}} } \right)} \ldots P{\left( {\left. {i_{{k + 1}} } \right|i_{k} , \ldots ,i_{1} } \right)}P{\left( {\left. {i_{k} } \right|i_{{k - 1}} , \ldots ,i_{1} } \right)} \ldots P{\left( {\left. {i_{2} } \right|i_{1} } \right)}P{\left( {i_{1} } \right)} \\ & \quad \quad \quad \quad \quad \quad = P_{{i_{1} }} P_{{i_{2} i_{1} }} \ldots P_{{i_{k} , \ldots ,i_{1} }} {\prod\limits_{j = 1}^{n - k} {P_{{i_{{k + j}} , \ldots ,i_{j} }} } } \\ \end{aligned} $$
(4)

where \(P_{i_1 } = P\left( {i_1 } \right)\).

In order to obtain the optimum order of the Markov chain model, AIC and BIC can be applied. Both criteria are based on the log-likelihood functions for the transition probabilities of the fitted Markov chain models. The maximum likelihood function for the k th order chain is given by

$$L_k \left( {X_1 , \ldots ,X_n } \right) = \prod\limits_{s_k , \ldots ,s_1 } {\hat P_{s_k , \ldots ,s_1 }^{n_{_{s_k , \ldots ,s_1 } } } } $$
(5)

where \(\hat P_{s_k , \ldots ,s_1 }\) denotes the estimated transition probability for the sequence of states going from \(s_1\) to \(s_2\), from \(s_2\) to \(s_3\), and so on up to \(s_{k-1}\) to \(s_k\), where \(s_k\) is the state of the most recent observation, and the superscript \(n_{s_k , \ldots ,s_1 } \) denotes the associated transition count. The maximum likelihood estimator of the transition probabilities of Eq. (2), as used in Eq. (5), is given by

$$\hat P_{s_k , \ldots ,s_1 } = \frac{{n_{s_k , \ldots ,s_1 } }}{{n_{ \bullet s_{k - 1} , \ldots ,s_1 } }}$$
(6)

with \(n_{ \bullet s_{k - 1} , \ldots ,s_1 } = \sum\limits_{s_k } {n_{s_k , \ldots ,s_1 } } \).
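As an illustration of Eq. (6), the following Python sketch estimates the transition probabilities of a chain of order k by dividing each transition count by the total count of its conditioning history. The function names and the short 0/1 sequence are hypothetical. Note that the code stores each history with the oldest day first, whereas the subscripts in the text list the most recent state first.

```python
from collections import Counter


def transition_counts(seq, k):
    """Count (history of the k preceding states, current state) pairs in a 0/1 sequence."""
    counts = Counter()
    for t in range(k, len(seq)):
        history = tuple(seq[t - k:t])  # oldest day first; empty tuple when k = 0
        counts[(history, seq[t])] += 1
    return counts


def transition_probabilities(seq, k):
    """Maximum likelihood estimates: transition count divided by the count of its history."""
    counts = transition_counts(seq, k)
    totals = Counter()
    for (history, _state), n in counts.items():
        totals[history] += n
    return {key: n / totals[key[0]] for key, n in counts.items()}


# Hypothetical wet (1) / dry (0) sequence, for illustration only.
seq = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1]
probs = transition_probabilities(seq, k=1)
print(probs[((0,), 1)])  # estimate of P_10 (wet after dry): 0.75
print(probs[((1,), 1)])  # estimate of P_11 (wet after wet): 0.6
```

Histories that never occur in the record simply do not appear in the returned dictionary, so no division by zero arises.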

The comparison of the two different Markov chain models to decide on the optimum order, say the Markov chain models of k th and r th order where k < r and k = 0,1,…,r–1 can be based on the maximized likelihood ratio statistics which are computed as

$$\eta _{k,r} = - 2\ln \lambda _{k,r} $$
(7)

where \(\lambda _{k,r} = \frac{{L_k \left( {X_1 , \ldots ,X_n } \right)}}{{L_r \left( {X_1 , \ldots ,X_n } \right)}}\)

The selection of the Markov chain model of the optimum order, say k th order, involves minimizing a loss function obtained from the AIC or BIC. For assessing the k th order Markov chain model, Tong (1975) proposed the loss function \({\text{AIC}}\;(k)\), which defines the risk on the basis of the AIC criterion and is given by

$$AIC\left( k \right) = \eta _{k,r} - 2\nu $$
(8)

where \(\nu = \left( {S^r - S^k } \right)\left( {S - 1} \right)\) is the number of degrees of freedom and S represents the number of states. Schwarz (1978) introduced a different form of loss function to define the risk on the basis of the BIC criterion, denoted as BIC(k), which is given by

$$BIC\left( k \right) = \eta _{k,r} - \nu \ln \left( n \right)$$
(9)

where n is the sample size. The only difference between the two criteria is the form of the penalty function. These criteria attempt to find the value of k that minimizes the loss functions. When identifying the optimum order of the Markov chain models in the daily rainfall occurrences, these criteria were also applied by many researchers (e.g., Chin 1977; Katz 1981; Wilks 1995; Jimoh and Webster 1996; Harrison and Waylen 2000; Kottegoda et al. 2004; Hui et al. 2005).
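A minimal Python sketch of the order selection in Eqs. (7) to (9) is given below. It rests on several assumptions that are not stated in the text: all candidate orders are scored on the same days (t ≥ r) so that the likelihoods are comparable, the sample size n is taken as the full record length, and k = r is included as a baseline with zero loss. The function names and the test sequence are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(seq, k, r):
    """Maximized log-likelihood of a k-th order chain, scored on days t >= r."""
    counts, totals = Counter(), Counter()
    for t in range(r, len(seq)):
        history = tuple(seq[t - k:t])  # the k preceding states (empty for k = 0)
        counts[(history, seq[t])] += 1
        totals[history] += 1
    return sum(n * math.log(n / totals[h]) for (h, _state), n in counts.items())


def loss_table(seq, r=4, n_states=2):
    """Return {k: (AIC(k), BIC(k))} for k = 0..r, following Eqs. (7)-(9)."""
    n = len(seq)  # sample size (assumption: full record length)
    ll_r = log_likelihood(seq, r, r)
    table = {}
    for k in range(r + 1):
        eta = -2.0 * (log_likelihood(seq, k, r) - ll_r)          # Eq. (7)
        nu = (n_states ** r - n_states ** k) * (n_states - 1)    # degrees of freedom
        table[k] = (eta - 2 * nu, eta - nu * math.log(n))        # Eqs. (8) and (9)
    return table


def optimum_orders(table):
    """Orders that minimize the AIC and BIC loss functions, respectively."""
    best_aic = min(table, key=lambda k: table[k][0])
    best_bic = min(table, key=lambda k: table[k][1])
    return best_aic, best_bic


# Hypothetical, strictly alternating record: a first-order chain explains it fully.
seq = [0, 1] * 50
print(optimum_orders(loss_table(seq)))  # (1, 1)
```

For S = 2 and r = 4, the degrees of freedom are 15, 14, 12 and 8 for k = 0, 1, 2 and 3, so the BIC penalty grows faster than the AIC penalty whenever ln(n) exceeds 2.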

4 Persistency of wet (dry) events

The existence of persistence in the sequences of weather events can be measured by Besson’s coefficient of persistence (BCP) which can be defined as follows:

$$BCP = \frac{{1 - P_1 }}{{1 - P_{11} }} - 1$$
(10)

where \(P_1\) is the probability of a wet day and \(P_{11}\) is the conditional probability of a wet day given that the previous day was also a wet day. A positive BCP value indicates that a wet (or dry) event tends to be followed by an event of the same type. A zero BCP value reflects that there is no persistence. The details of the derivation of the BCP were explained by Besson (1924) based on an analysis of the persistence of daily rainfall sequences in Paris. This method has been applied by many other researchers (e.g., Erikson 1965; Singh et al. 1981; Berger and Goosens 1983; Dahale et al. 1994).
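Equation (10) can be evaluated directly from a binary wet/dry sequence, as in the short Python sketch below. The function name and the example sequence are hypothetical; \(P_1\) and \(P_{11}\) are estimated here by simple relative frequencies, which is an assumption about the estimation procedure.

```python
def besson_coefficient(seq):
    """Besson's coefficient of persistence, Eq. (10), for a 0/1 wet-dry sequence."""
    p1 = sum(seq) / len(seq)                       # P_1: probability of a wet day
    wet_then_any = sum(seq[:-1])                   # wet days that have a following day
    wet_then_wet = sum(a * b for a, b in zip(seq, seq[1:]))
    if wet_then_any == 0 or wet_then_wet == wet_then_any:
        raise ValueError("P_11 is undefined or equal to 1; BCP cannot be computed")
    p11 = wet_then_wet / wet_then_any              # P_11: wet day given previous day wet
    return (1 - p1) / (1 - p11) - 1


# Hypothetical record made of 4-day wet and 4-day dry blocks.
seq = [1, 1, 1, 1, 0, 0, 0, 0] * 2
print(besson_coefficient(seq))  # 1.0
```

In this example \(P_1 = 0.5\) and \(P_{11} = 0.75\), so the BCP is positive, reflecting the blocks of consecutive wet days.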

5 Fitting wet (dry) spells using various order of Markov chain models

A wet (dry) spell of length n is defined as a run of exactly n consecutive wet (dry) days that immediately follows a dry (wet) day and ends with a return to the dry (wet) condition on the \((n + 1)^{th} \) day. The first-order Markov chain only takes into account the state, either wet or dry, of the one preceding day. Similarly, the second order considers the states of the two preceding days, and so on. Markov chain models up to order four have been applied in this study. The joint probabilities of the k th order Markov chain models are defined as

$$P\left( {i_n , \ldots ,i_2 \left| {i_1 } \right.} \right) = P_{i_2 i_1 } \ldots P_{i_{k - 1} \ldots i_1 } \prod\limits_{j = 1}^{n - k} {P_{i_{k + j} , \ldots ,i_j } } ,\quad n \geqslant k + 1;\quad k = 0,1 \ldots $$
(11)

The conditional probabilities of a spell of n wet days under the k th order Markov chain, which is denoted as \(P\left( {0,\left[ n \right]\left| 0 \right.} \right)\), are given by,

$$P\left( {0,\left[ n \right]\left| 0 \right.} \right) = \left\{ \begin{aligned} & \left( {\prod\limits_{i = 1}^n {P_{\left[ i \right]0} } } \right)P_{0\left[ n \right]0} ,\quad n \leqslant k - 1 \\ & \left( {\prod\limits_{i = 1}^k {P_{\left[ i \right]0} } } \right)P_{\left[ {k + 1} \right]}^{n - k} P_{0\left[ k \right]} ,\quad n \geqslant k \\ \end{aligned} \right.$$
(12)

where [n] denotes the state 1 (wet) repeated n times. For example, the conditional probability of a spell of two consecutive wet days can be written as \(P\left( {011\left| 0 \right.} \right)\).
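As a worked illustration of Eq. (12), the low-order cases can be written out explicitly. For the first-order chain (k = 1), every spell length satisfies \(n \geqslant k\), so

$$P\left( {0,\left[ n \right]\left| 0 \right.} \right) = P_{10} P_{11}^{n - 1} P_{01} ,\quad n \geqslant 1$$

For the second-order chain (k = 2), the two branches give

$$P\left( {0,1\left| 0 \right.} \right) = P_{10} P_{010} \quad {\text{and}}\quad P\left( {0,\left[ n \right]\left| 0 \right.} \right) = P_{10} P_{110} P_{111}^{n - 2} P_{011} ,\quad n \geqslant 2$$

As a consistency check on the first-order case, summing over all n gives \(P_{10} P_{01} /\left( {1 - P_{11} } \right) = P_{10} \), since \(P_{01} = 1 - P_{11} \); that is, the spell probabilities add up to the probability that a wet spell starts after a dry day.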

In order to compute the expected number of wet spells of each length, the conditional probabilities for the respective lengths, say 1, 2, 3, …, n days, which are obtained from Eq. (12), are multiplied by the total number of dry days. The Kolmogorov-Smirnov goodness-of-fit test is employed to compare the observed and expected distributions of the wet spells for the first few orders of the Markov chain models. The maximum absolute difference between the cumulative observed and expected values under the assumed Markov chain model is computed, and if this value is less than or equal to the critical value \(D_{0.05}\), the particular Markov chain model is considered appropriate for describing the distribution of the wet spells. A similar approach is applied to the sequences of dry days.
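The comparison of observed and expected spell distributions can be sketched in Python for the first-order case. This is only an illustration under stated assumptions: the statistic is computed on cumulative relative frequencies rather than raw counts, the expected proportion of wet spells of length n is taken as \(P_{11}^{n - 1} \left( {1 - P_{11} } \right)\) (the first-order form of Eq. (12) normalized by the spell-start probability \(P_{10}\)), and incomplete spells at the two ends of the record are discarded. The function names and the sequence are hypothetical, and the critical value \(D_{0.05}\) is left to be taken from standard tables.

```python
from itertools import groupby


def spell_lengths(seq, state=1):
    """Lengths of complete runs of `state`, bounded on both sides by the opposite state."""
    runs = [(value, len(list(group))) for value, group in groupby(seq)]
    # Drop the first and last runs: their true lengths are unknown (censored).
    return [length for value, length in runs[1:-1] if value == state]


def expected_spell_proportions(p11, max_len):
    """First-order chain: proportion of spells with length n, for n = 1..max_len."""
    return [p11 ** (n - 1) * (1 - p11) for n in range(1, max_len + 1)]


def ks_statistic(lengths, p11):
    """Maximum absolute difference between cumulative observed and expected proportions."""
    max_len, total = max(lengths), len(lengths)
    expected = expected_spell_proportions(p11, max_len)
    d = cum_obs = cum_exp = 0.0
    for n in range(1, max_len + 1):
        cum_obs += lengths.count(n) / total
        cum_exp += expected[n - 1]
        d = max(d, abs(cum_obs - cum_exp))
    return d


# Hypothetical wet (1) / dry (0) record, for illustration only.
seq = [0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1]
wet_spells = spell_lengths(seq, state=1)
print(wet_spells)                         # [2, 1, 3]
print(ks_statistic(wet_spells, p11=0.5))  # approximately 0.1667
```

Passing `state=0` to `spell_lengths` and the estimate of \(P_{00}\) in place of \(P_{11}\) gives the corresponding comparison for dry spells.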

6 Results and discussion

6.1 Characteristics of the sequence of wet (dry) days

The conditional probabilities of a wet (dry) event at each station are estimated for both levels of threshold and averaged according to region and monsoon season. Table 2 shows that, for both monsoon seasons and both levels of threshold, the conditional probability of a wet day given that the previous day was wet, \(P_{11}\), is substantially higher than the probability of a wet day, \(P_1\), for all the stations. When the persistence of wet spells is compared between the two levels of threshold, it is found to be higher for the threshold of 0.1 mm. For the threshold of 10 mm, the difference between successive conditional probabilities of wet days reduces dramatically and becomes negligible after about 2 or 3 days, whereas for the threshold of 0.1 mm this difference becomes negligible only after about 4 or 5 days. Similarly, the conditional probabilities of dry events show that \(P_{00}\) is substantially higher than \(P_0\). The difference between successive conditional probabilities of dry days reduces progressively and becomes negligible after about 5 days for the threshold of 0.1 mm and after about 2 or 3 days for the threshold of 10.0 mm. As expected, the lower threshold produces a slightly longer duration of wet (dry) periods. The analysis of intense wet spells (e.g., threshold 10 mm) shows that the conditional probabilities of the wet spells, \(P_{11}\), \(P_{111}\), \(P_{1111}\) and \(P_{11111}\), are lower than those of the dry spells, \(P_{00}\), \(P_{000}\), \(P_{0000}\) and \(P_{00000}\), during both monsoon seasons in all regions. In addition, \(P_{1111} \) and \(P_{11111} \) are found to decrease over the southwestern, eastern and central parts of Peninsular Malaysia during the Southwest monsoon and also in the northwestern areas during the Northeast monsoon. This may be because intense wet spells seldom last for more than four consecutive days in most regions of the peninsula. Thus, it is reasonable to take the fourth-order Markov chain (r = 4) as the maximum order of the Markov process. According to Chin (1977), the occurrence of wet (dry) events is highly unlikely to have more than fourth-order conditional dependence.

Table 2 Persistency characteristics of daily rainfall occurrence for both monsoons at five regions in Peninsular Malaysia during the period of records for both levels of threshold, (a) 0.1 mm and (b) 10.0 mm. NW northwest; SW southwest; E east; C central; W west

The persistency of the wet (dry) events is explored further by applying Besson’s coefficient of persistence (BCP). For the Northeast monsoon, the results in Table 2 indicate that the BCP value for the stations located in the eastern region is the highest (0.74) among the regions; however, this region shows the lowest value of 0.15 during the Southwest monsoon for both levels of threshold. For the lower threshold, the average lengths of wet and dry spells during both monsoon seasons in all regions range from 3 to 4 days and from 3 to 5 days, respectively. However, for the higher threshold, the average length of dry spells, 6–12 days, is much longer than that of the wet spells, which is 1–2 days. The results indicate that the persistence of dry sequences during the Northeast monsoon in the northwestern areas is much longer than in the other regions. These results seem to be in agreement with the findings of Dale (1960), Camerlengo (1999) and Deni et al. (2008), who found that dry spells were largely dependent on latitude, were longer and more frequent in the north than in the south, and were more frequent during the Northeast monsoon.

6.2 The optimum order of Markov chain models

In determining the optimum order of the Markov chain models for a 5-day sequence of daily rainfall occurrence, the minimum loss function obtained from the two decision criteria, namely the AIC and BIC, is applied. As an example, Table 3 illustrates that the loss functions obtained from the AIC and BIC (see Eqs. 8 and 9) consistently show the third order to be optimum at almost all the stations for the threshold of 0.1 mm. However, for the higher threshold of 10.0 mm, the optimum order based on the BIC for the stations located in the northwestern and western areas is slightly lower than that based on the AIC. Table 4 summarizes the optimum order for a 5-day sequence of wet days in each region for both levels of threshold and both monsoon seasons. It can be seen that the optimum order obtained from both decision criteria varies with season, region and threshold. Generally, for both the AIC and BIC and both thresholds, a Markov chain of higher order is found to be optimum at most stations during the Northeast monsoon. Moreover, regardless of the monsoon season, for the two levels of threshold, the results show that a Markov chain model of order higher than one is found to be optimum at all stations in the peninsula. However, for the Southwest monsoon, the zero- and first-order models are found to be optimum at most of the stations located in the northwestern and eastern areas when the higher threshold of 10 mm is considered. It is also notable that the optimum order estimated by the AIC is greater than or equal to that estimated by the BIC at most stations for thresholds 0.1 and 10 mm. These findings are supported by Katz (1981), who claimed that the optimum order obtained from the AIC was inconsistent and tended to overestimate the true optimum order, whereas the BIC produced more consistent results.

Table 3 The loss function obtained from (a) AIC and (b) BIC for the Markov chain model of k th order, with r = 4

Table 4 The optimum order of Markov chain models for the 5-day sequence at five regions in Peninsular Malaysia over the period of records. NW northwest; SW southwest; E east; C central

6.3 The most appropriate order for the wet (dry) spells

The spatial distribution of the most appropriate order of Markov chain models to explain the wet (dry) spells (see section Fitting wet (dry) spells using various order of Markov chain models) over Peninsular Malaysia is identified through the Kolmogorov-Smirnov goodness-of-fit test. Figures 2 and 3 show the observed and expected frequencies of dry and wet spells, respectively, at four selected stations for both levels of threshold, based on the most appropriate of the first four orders of the Markov chain models (denoted MC1, MC2, MC3 and MC4). The expected frequencies of dry and wet spells obtained from the most appropriate order of the Markov chain models describe the observed distributions closely. A higher order of the Markov chain model is also found to be more appropriate for describing the distribution of dry spells at the four selected stations, for both levels of threshold, as shown in Fig. 2. Meanwhile, Fig. 3 indicates that the first and second orders are the most appropriate models for describing the distribution of wet spells at each of the four selected stations.

Fig. 2 The observed (white strip) and estimated (black dot) frequencies of dry spells for thresholds (a) 0.1 mm and (b) 10 mm based on the most appropriate order of Markov chain models (MC2, MC3 and MC4) at each of the four selected rainfall stations

Fig. 3 The observed (white strip) and estimated (black dot) frequencies of wet spells for thresholds (a) 0.1 mm and (b) 10 mm based on the most appropriate order of Markov chain models (MC1 and MC2) at each of the four selected rainfall stations

Figures 4 and 5 display the results of the analyses of the most appropriate order for the distribution of the dry and wet spells, respectively, at each of the selected rainfall stations in Peninsular Malaysia. As shown in Fig. 4, the Markov chain of higher order is generally found to be appropriate for describing the distribution of dry spells during the Northeast monsoon at most stations, except for the stations at Subang and Petaling. However, during the Southwest monsoon, for both levels of threshold, the first-order model is found to be appropriate at all stations except for the four stations located in Subang, Sitiawan, Ipoh and Temerloh. Regardless of the monsoon season, the results for both levels of threshold consistently show that the Markov chain of higher order is found to be appropriate for representing the distribution of dry spells at all stations in Peninsular Malaysia.

Fig. 4 The most appropriate order of Markov chain models for the distribution of dry spells at all stations in Peninsular Malaysia over the period of records, (a–c) for threshold 0.1 mm and (d–f) for threshold 10.0 mm

Fig. 5 The most appropriate order of Markov chain models for the distribution of wet spells at all stations in Peninsular Malaysia over the period of records, (a–c) for threshold 0.1 mm and (d–f) for threshold 10.0 mm

Figure 5 displays the most appropriate of the first four orders of the Markov chain models for representing the distribution of wet spells at each of the rainfall stations considered in this study. For the higher threshold, Fig. 5d–f indicates that the first-order model is the most appropriate for representing the distribution of intense wet spells at most of the stations over the peninsula. Moreover, for the lower threshold of 0.1 mm, the Markov chain of higher order is found to be more appropriate in the eastern, western and central parts of the peninsula, regardless of the monsoon season. Generally, the first-order Markov chain model is found to be more appropriate for describing the distribution of the wet spells in most regions during the Northeast and Southwest monsoons, except in the western areas for the lower threshold of 0.1 mm. It can be concluded that the Markov chain of higher order is more appropriate for the dry spells than for the wet spells. Similar findings were reported by Dahale et al. (1994).

7 Conclusions

Information on wet (dry) behavior is vitally important to water resource management and can be used for the prediction of extreme weather events. The analysis of the persistency of wet (dry) spells provides information on the occurrence of a wet or dry event based on the previous day. The results obtained from the analysis provide a more complete description of the behavior of rainfall occurrence in Peninsular Malaysia. In this study, Markov chain models of various orders were applied to the daily rainfall data in order to identify the optimum order of the model able to represent the wet (dry) spells, considering the various geographical regions, the different levels of threshold used and the monsoon seasons.

It can be concluded from this study that the optimum order of the Markov chain models for the 5-day sequence of daily rainfall occurrence in Peninsular Malaysia varies according to the level of threshold used, the region and the monsoon season. Generally, the results show that the optimum order obtained by using the AIC is higher than that obtained by the alternative method, the BIC. The findings support the simulation results of Katz (1981), which indicated that the AIC produced inconsistent results compared to the BIC. In addition, Jimoh and Webster (1996) also claimed that the AIC tended to overestimate the optimum order. Moreover, an order greater than one is more appropriate for Peninsular Malaysia for both thresholds, regardless of the monsoon season and for both decision criteria, AIC and BIC. Generally, a lower order of the Markov chain models is found to be optimum in the northwestern and eastern areas during the Southwest monsoon season for the higher threshold of 10 mm.

However, the results show that the optimum orders for the Northeast monsoon are equal to or higher than those of the Southwest monsoon, when both levels of threshold are considered. This present study seems to be in agreement with Dahale et al. (1994) who claimed that the closer the stations are to the equator the higher the order of the models. Their findings indicated that the southern stations in Thailand and the equatorial Pacific which are closer to the equator showed a higher order of the Markov chains. Since the higher order of the Markov chain models is found to be optimum for the Northeast monsoon, it is shown in this present study that for the dry spells, the order higher than one is also found to be more appropriate in most of the stations in the peninsula for this monsoon season. This also supports the analysis of persistency which indicates that the BCP of threshold 0.1 mm during the Northeast monsoon is higher when compared to the Southwest monsoon. Thus, the results may provide evidence that the higher the order of the chain, the greater the consideration of the persistence of wet (dry) events. On the other hand, for the wet spells, the first order of the Markov chain model is found to be the most appropriate order in most of the stations over the peninsula for both monsoon seasons and levels of threshold considered.

Further analysis could be carried out by including other rainfall stations in Malaysia so that more valuable information could be obtained for the purpose of prediction. In future work, the various orders of the Markov chain models will be compared with alternative rainfall occurrence models in order to identify the most appropriate probability model for representing the distribution of wet (dry) spells over Peninsular Malaysia, taking into account the monsoon seasons and other climatic issues such as global warming or climate change.