Fitting optimum order of Markov chain models for daily rainfall occurrences in Peninsular Malaysia

Deni, Sayang Mohd; Jemain, Abdul Aziz; Ibrahim, Kamarulzaman

doi:10.1007/s00704-008-0051-3


Original Paper
Published: 23 August 2008

Volume 97, pages 109–121, (2009)

Theoretical and Applied Climatology


Sayang Mohd Deni¹,
Abdul Aziz Jemain² &
Kamarulzaman Ibrahim²


Abstract

The analysis of daily rainfall occurrence behavior is becoming more important, particularly in water-related sectors. Many studies have identified a more comprehensive pattern of daily rainfall behavior based on Markov chain models. One of the aims in fitting Markov chain models of various orders to daily rainfall occurrence is to determine the optimum order. In this study, the optimum order of the Markov chain models for a 5-day sequence is examined at each of 18 rainfall stations in Peninsular Malaysia, selected on the basis of data availability, using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The most appropriate order for describing the distribution of the wet (dry) spells at each station is identified using the Kolmogorov-Smirnov goodness-of-fit test. It is found that the optimum order varies with the threshold level used (either 0.1 or 10.0 mm), the region and the monsoon season. At most stations, Markov chain models of a higher order are found to be optimum for rainfall occurrence during the northeast monsoon season for both threshold levels. However, regardless of the monsoon season, the first-order model is generally optimum for the northwestern and eastern regions of the peninsula when the threshold level of 10.0 mm is considered. The analysis indicates that the first-order Markov chain model is most appropriate for describing the distribution of wet spells, whereas higher-order models are adequate for the dry spells at most of the rainfall stations for both threshold levels and monsoon seasons.


1 Introduction

With industrialization and rapid population growth, the management of water resources is becoming more important not only in Malaysia but throughout the world. The analysis of rainfall behavior, particularly in terms of rainfall amount and occurrence, is beneficial for managing water consumption. Moreover, such analysis informs experts about the spatial and temporal structure of rainfall occurrence, which supports more reliable prediction of extreme weather events.

The study of sequences of daily rainfall occurrence was started by Gabriel and Neumann (1962). They found that daily rainfall occurrence in the Tel Aviv data was successfully fitted by a first-order Markov chain model. Similarly, Kottegoda et al. (2004) reported that the first-order Markov chain model successfully fitted observed data in Italy. This model assumes that rainfall occurrence on a given day depends on that of the previous day. Stern and Coe (1984) stated that the two most attractive features of Markov chain models are the ease with which seasonality in daily rainfall occurrence can be identified and the flexibility they allow in building probability models for parameter estimation and model selection.

In most cases, daily rainfall occurrence can be described by a first-order Markov chain model; however, there are cases where this model fails to fit the observed data. As an alternative, a Markov chain model of higher order often remedies these inadequacies (see Wilks 1999; Hayhoe 2000). The two criteria most commonly used to determine the optimum order of Markov chain models are the Akaike information criterion (AIC; see Akaike 1974; Tong 1975) and the Bayesian information criterion (BIC; see Schwarz 1978; Katz 1981). The identification of the optimum order of Markov chain models for daily rainfall data has been discussed in many studies. Chin (1977), for example, fitted models to 100 rainfall stations in the United States and reported that the optimum order varied with season and geographical region: the first order was optimum for summer, while a higher order was more suitable for winter. Gates and Tong (1976) re-examined the Tel Aviv data and found that the second-order Markov chain model was optimum instead of the first order. Moreover, Jimoh and Webster (1996) investigated the optimum order of daily rainfall occurrence at five stations in Nigeria. Based on the AIC and BIC, they found that the optimum order depends on the threshold level used and the length of the record.

Determining the most appropriate order of the Markov chain model for describing the sequences of wet (dry) days is pertinent in explaining the rainfall behavior. According to Chin (1977), any model which is found to describe the daily rainfall occurrence successfully should also be able to represent the distribution of wet (dry) spells well. The studies on identification of the best order of the Markov chain models to describe the distribution of wet and dry spells have been carried out by a number of researchers at various locations, as in the case of Fleming (2007) and Hui et al. (2005) for Canada; Tolika and Maheras (2005) and Anagnostopoulou et al. (2003) for Greece; Harrison and Waylen (2000) for Costa Rica; Wilks (1999) for the USA; Martin-Vide and Gomez (1999) for Spain; Chapman (1997) for the western Pacific; Dahale et al. (1994) for the tropical South-East Asia and Equatorial Pacific; Moon et al. (1994) for South Korea; Singh et al. (1981) for India; Quah and Ooi (1985) for Peninsular Malaysia; Berger and Goosens (1983) for Belgium; and Basu (1971) for Calcutta.

Although much work on identifying the best order of Markov chain models has been done on rainfall data from various parts of the world, little attention has been given to Malaysian data. No study has been conducted to determine the optimum order of higher-order Markov chain models for daily rainfall in Peninsular Malaysia. Since the study of Quah and Ooi (1985), no further investigation has been done to determine the most appropriate order of the Markov chain models for representing the distribution of wet (dry) spells in this country. This study addresses that gap and is structured as follows. After a brief description of the rainfall data in the section Data and methods, the determination of the optimum order of the Markov chain for sequences of n arbitrary days is presented in the section Identifying the optimum order of Markov chain models for daily rainfall occurrences. In the section Persistency of wet (dry) events, the Markov models are used to explain the persistency of the wet (dry) events. This is followed by the section Fitting wet (dry) spells using various order of Markov chain models, where the most appropriate order of the Markov chain models is identified using the observed distribution of wet (dry) spells. The results of the analysis and the conclusions of the study are provided in the sections Results and discussion and Conclusions, respectively.

2 Data and methods

Peninsular Malaysia lies entirely in the equatorial zone, between latitudes 1° and 6°N and longitudes 100° and 103°E. The climate of Peninsular Malaysia is influenced by the Southwest monsoon, from May to August, and the Northeast monsoon, from November to February. The Southwest monsoon is a drier period for the whole country, while during the Northeast monsoon the east coast and northern areas of Peninsular Malaysia receive heavier rain than the other parts of the country.

The daily rainfall data from 18 stations in Peninsular Malaysia, listed in Table 1, were selected based on the completeness of the data and the length of the record. All the data used in this study were obtained from the database of the Malaysian Meteorological Services (MMS). The record length for these stations ranges from 21 to 35 years: 10 stations have records of 35 years, and the rest have records of 21 to 32 years. The difference in record length between stations is due to the availability of the data. Following Lim (1976), the peninsula is divided into five regions in this study in order to investigate whether the persistency of wet (dry) days varies by region. Figure 1 displays a map of Peninsular Malaysia divided into the five regions: northwest (NW), west (W), southwest (SW), east (E) and central (C).

Table 1 Geographic coordinates, altitude and period of records for the selected 18 rain gauge stations in Peninsular Malaysia. NW northwest; SW southwest; E east; C central; W west


In this study, a wet day is defined as a day on which the rainfall amount equals or exceeds a particular threshold value. With a threshold of 0.1 mm, a day with a rainfall amount of less than 0.1 mm is considered a dry day. Similarly, with a threshold of 10 mm, a dry day is a day with less than 10 mm of rainfall; otherwise, the day is considered a wet day. Two threshold levels are applied to investigate whether the optimum order of the Markov chain model depends on the threshold used.
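The wet/dry classification described above can be sketched in a few lines of Python. The function name `to_occurrence` and the sample rainfall amounts are hypothetical and serve only to illustrate the two thresholds.

```python
def to_occurrence(rain_mm, threshold=0.1):
    """Code each day as 1 (wet, amount >= threshold) or 0 (dry)."""
    return [1 if amount >= threshold else 0 for amount in rain_mm]


# Hypothetical daily rainfall amounts in mm.
daily_mm = [0.0, 0.05, 0.1, 3.2, 12.5, 10.0, 0.0]

wet_01 = to_occurrence(daily_mm, threshold=0.1)   # light-rain threshold
wet_10 = to_occurrence(daily_mm, threshold=10.0)  # intense-rain threshold
print(wet_01)
print(wet_10)
```

The same series yields fewer wet days under the 10 mm threshold, which is why the two thresholds can lead to different optimum orders.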

3 Identifying the optimum order of Markov chain models for daily rainfall occurrences

Let $X_1, X_2, \ldots, X_t, \ldots, X_n$ denote $n$ binary variables representing the sequence of wet and dry events in the daily rainfall occurrence over $n$ arbitrary days, coded as 1 and 0, respectively. The sequence is assumed to follow a first-order Markov chain when the event at time $t$, $X_t$, depends only on the previous event $X_{t-1}$. Let $P_{10}$ denote the conditional probability of a wet day following a dry day and $P_{11}$ the conditional probability of a wet day following a wet day. These two conditional probabilities are given by

$$P_{10} = P\left( X_t = 1 \mid X_{t-1} = 0 \right)\quad\text{and}\quad P_{11} = P\left( X_t = 1 \mid X_{t-1} = 1 \right)$$

(1)
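As an illustration of Eq. (1), the two conditional probabilities can be estimated from a binary occurrence series by counting day-to-day transitions. This is a minimal sketch; the function name `first_order_probs` and the sample series are assumptions made for the example.

```python
def first_order_probs(x):
    """Estimate P10 = P(wet | previous dry) and P11 = P(wet | previous wet)."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}  # key: (previous, current)
    for prev, curr in zip(x, x[1:]):
        counts[(prev, curr)] += 1
    after_dry = counts[(0, 0)] + counts[(0, 1)]
    after_wet = counts[(1, 0)] + counts[(1, 1)]
    p10 = counts[(0, 1)] / after_dry if after_dry else float("nan")
    p11 = counts[(1, 1)] / after_wet if after_wet else float("nan")
    return p10, p11


x = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]  # hypothetical series, 1 = wet, 0 = dry
p10, p11 = first_order_probs(x)
print(p10, p11)
```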

For a Markov chain model of $k$th order, the stationary transition probabilities are given by

$$P_{i_{k + 1} , \ldots ,i_1 } = P\left( {X_t = i_{k + 1} \left| {X_{t - 1} = i_k ,\; \ldots ,X_{t - k} = i_1 } \right.} \right);\;i = 0,1$$

(2)

The joint probability distribution of $X_1, X_2, \ldots, X_t, \ldots, X_n$ is given by

$$P\left( {X_n = i_n , \ldots ,X_1 = i_1 } \right) = P\left( {i_n ,i_{n - 1} , \ldots ,i_1 } \right)$$

(3)

Equation (3) can also be written as

$$\begin{aligned} & P{\left( {i_{n} ,i_{{n - 1}} , \ldots ,i_{1} } \right)} = P{\left( {\left. {i_{n} } \right|i_{{n - 1}} , \ldots ,i_{{n - k}} } \right)}P{\left( {\left. {i_{{n - 1}} } \right|i_{{n - 2}} , \ldots ,i_{{n - k - 1}} } \right)} \ldots P{\left( {\left. {i_{{k + 1}} } \right|i_{k} , \ldots ,i_{1} } \right)}P{\left( {\left. {i_{k} } \right|i_{{k - 1}} , \ldots ,i_{1} } \right)} \ldots P{\left( {\left. {i_{2} } \right|i_{1} } \right)}P{\left( {i_{1} } \right)} \\ & \quad \quad \quad \quad \quad \quad = P_{{i_{1} }} P_{{i_{2} i_{1} }} \ldots P_{{i_{k} , \ldots ,i_{1} }} {\prod\limits_{j = 1}^{n - k} {P_{{i_{{k + j}} , \ldots ,i_{j} }} } } \\ \end{aligned} $$

(4)

where $P_{i_1 } = P\left( {i_1 } \right)$.

In order to obtain the optimum order of the Markov chain model, the AIC and BIC can be applied. Both criteria are based on the log-likelihood functions for the transition probabilities of the fitted Markov chain models. The maximized likelihood function for the $k$th-order chain is given by

$$L_k\left( X_1, \ldots, X_n \right) = \prod\limits_{s_k, \ldots, s_1} \hat{P}_{s_k, \ldots, s_1}^{\,n_{s_k, \ldots, s_1}}$$

(5)

where $\hat P_{s_k , \ldots ,s_1 }$ denotes the estimated transition probability for the sequence of states going from $s_1$ to $s_2$, from $s_2$ to $s_3$, and so on up to $s_{k-1}$ to $s_k$, where $s_k$ is the state of the most recent observation, and the exponent $n_{s_k , \ldots ,s_1 }$ denotes the associated transition count. The maximum likelihood estimator of the transition probabilities of Eq. (2), as used in Eq. (5), is given by

$$\hat{P}_{s_k, \ldots, s_1} = \frac{n_{s_k, \ldots, s_1}}{n_{\bullet s_{k-1}, \ldots, s_1}}$$

(6)

with $n_{\bullet s_{k-1}, \ldots, s_1} = \sum\limits_{s_k} n_{s_k, \ldots, s_1}$.
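The counting estimator of Eq. (6) can be sketched as follows. The code stores each history as a tuple of the preceding states, oldest first, which is a presentational choice of this sketch rather than the subscript order used in the text; the function names and the sample series are likewise hypothetical.

```python
from collections import Counter


def transition_counts(x, k):
    """Count (history, next_state) pairs; history holds the k preceding states."""
    counts = Counter()
    for t in range(k, len(x)):
        counts[(tuple(x[t - k:t]), x[t])] += 1
    return counts


def mle_probs(counts):
    """Divide each transition count by the total count for its history."""
    totals = Counter()
    for (history, _), n in counts.items():
        totals[history] += n
    return {key: n / totals[key[0]] for key, n in counts.items()}


x = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]  # hypothetical series, 1 = wet, 0 = dry
counts = transition_counts(x, k=2)
probs = mle_probs(counts)
for (history, state), p in sorted(probs.items()):
    print(history, "->", state, round(p, 3))
```

Histories that never occur in the record simply have no entry, so short records leave many high-order probabilities unestimated.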

Two Markov chain models, say of $k$th and $r$th order where $k < r$ and $k = 0, 1, \ldots, r-1$, can be compared to decide on the optimum order using the maximized likelihood ratio statistic, which is computed as

$$\eta _{k,r} = - 2\ln \lambda _{k,r} $$

(7)

where $\lambda _{k,r} = \frac{{L_k \left( {X_1 , \ldots ,X_n } \right)}}{{L_r \left( {X_1 , \ldots ,X_n } \right)}}$

The selection of the Markov chain model of optimum order, say the $k$th order, involves minimizing a loss function obtained from the AIC or BIC. For assessing the $k$th-order Markov chain model, Tong (1975) proposed the loss function ${\text{AIC}}\;(k)$, which defines the risk on the basis of the AIC and is given by

$$AIC\left( k \right) = \eta _{k,r} - 2\nu $$

(8)

where $\nu = \left( {S^r - S^k } \right)\left( {S - 1} \right)$ is the number of degrees of freedom and $S$ represents the number of states. Schwarz (1978) introduced a different form of loss function to define the risk on the basis of the BIC, denoted BIC(k), which is given by

$$BIC\left( k \right) = \eta _{k,r} - \nu \ln \left( n \right)$$

(9)

where n is the sample size. The only difference between the two criteria is the form of the penalty function. These criteria attempt to find the value of k that minimizes the loss functions. When identifying the optimum order of the Markov chain models in the daily rainfall occurrences, these criteria were also applied by many researchers (e.g., Chin 1977; Katz 1981; Wilks 1995; Jimoh and Webster 1996; Harrison and Waylen 2000; Kottegoda et al. 2004; Hui et al. 2005).
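The order-selection procedure of Eqs. (7) to (9) can be sketched in Python as below. Several choices here are assumptions of the sketch and are not stated in the text: every order is scored on the same transitions (from day $r$ onward), the number of scored transitions is used as the sample size $n$, and the test series is simulated from a hypothetical first-order chain.

```python
import math
import random
from collections import Counter


def log_likelihood(x, k, r):
    """Maximized log-likelihood of an order-k chain (log of Eq. 5).

    Transitions are scored from day r onward so that every order 0..r
    is fitted to the same observations (an assumption of this sketch).
    """
    counts, totals = Counter(), Counter()
    for t in range(r, len(x)):
        history = tuple(x[t - k:t])  # the k preceding states, oldest first
        counts[(history, x[t])] += 1
        totals[history] += 1
    return sum(c * math.log(c / totals[h]) for (h, _), c in counts.items())


def loss_table(x, r=4, n_states=2):
    """Return {k: (AIC(k), BIC(k))} for k = 0..r, following Eqs. (7)-(9)."""
    n = len(x) - r  # number of scored transitions, used here as the sample size
    ll_r = log_likelihood(x, r, r)
    table = {}
    for k in range(r + 1):
        eta = 2.0 * (ll_r - log_likelihood(x, k, r))  # eta_{k,r} = -2 ln(L_k / L_r)
        nu = (n_states ** r - n_states ** k) * (n_states - 1)
        table[k] = (eta - 2.0 * nu, eta - nu * math.log(n))
    return table


def optimum_order(table, column):
    """Order minimizing the chosen loss (column 0 = AIC, column 1 = BIC)."""
    return min(table, key=lambda k: table[k][column])


# Hypothetical data: simulate a first-order chain with P11 = 0.7 and P10 = 0.3.
rng = random.Random(0)
x = [0]
for _ in range(1999):
    p_wet = 0.7 if x[-1] == 1 else 0.3
    x.append(1 if rng.random() < p_wet else 0)

table = loss_table(x, r=4)
print("AIC order:", optimum_order(table, 0), "BIC order:", optimum_order(table, 1))
```

By construction both losses equal zero at $k = r$, so a lower order is selected only when its loss is negative, that is, when the likelihood gained by the extra parameters does not cover the penalty.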

4 Persistency of wet (dry) events

The existence of persistence in the sequences of weather events can be measured by Besson’s coefficient of persistence (BCP) which can be defined as follows:

$$BCP = \frac{{1 - P_1 }}{{1 - P_{11} }} - 1$$

(10)

where $P_1$ is the probability of a wet day and $P_{11}$ is the conditional probability of a wet day given that the previous day was also wet. A positive BCP value indicates persistence, that is, a wet (or dry) event tends to be followed by an event of the same kind. A zero BCP value indicates no persistence. The derivation of the BCP was explained by Besson (1924) based on an analysis of the persistence of daily rainfall sequences in Paris. This method has been applied by many other researchers (e.g., Eriksson 1965; Singh et al. 1981; Berger and Goosens 1983; Dahale et al. 1994).
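A minimal sketch of Eq. (10) follows, estimating $P_1$ as the fraction of wet days and $P_{11}$ from the wet-to-wet transitions. The function name and both sample series are hypothetical, and the coefficient is undefined when the estimate of $P_{11}$ equals 1.

```python
def besson_coefficient(x):
    """Besson's coefficient of persistence, BCP = (1 - P1) / (1 - P11) - 1."""
    p1 = sum(x) / len(x)  # probability of a wet day
    pairs = list(zip(x, x[1:]))
    wet_prev = sum(1 for prev, _ in pairs if prev == 1)
    wet_wet = sum(1 for prev, curr in pairs if prev == 1 and curr == 1)
    p11 = wet_wet / wet_prev  # probability of wet given previous day wet
    return (1 - p1) / (1 - p11) - 1


clustered = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]  # hypothetical persistent series
alternating = [0, 1, 0, 1, 0, 1]            # hypothetical non-persistent series
print(besson_coefficient(clustered))
print(besson_coefficient(alternating))
```

The clustered series gives a positive coefficient, while the strictly alternating series gives a negative one, since a wet day is never followed by another wet day.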

5 Fitting wet (dry) spells using various order of Markov chain models

A wet (dry) spell is defined as a run of exactly $n$ consecutive wet (dry) days, say, that is immediately preceded by a dry (wet) day and returns to the dry (wet) condition on the $(n + 1)^{th}$ day. The first-order Markov chain takes into account only the state, either wet or dry, of the one preceding day. Similarly, the second order considers the states of the two preceding days, and so on. Markov chain models up to order four have been applied in the present study. The joint probabilities under the $k$th-order Markov chain model, conditional on the initial state, are defined as

$$P\left( i_n, \ldots, i_2 \mid i_1 \right) = P_{i_2 i_1} \ldots P_{i_k \ldots i_1} \prod\limits_{j = 1}^{n - k} P_{i_{k + j}, \ldots, i_j},\quad n \geqslant k + 1;\quad k = 0,1, \ldots$$

(11)

The conditional probabilities of a spell of $n$ wet days under the $k$th-order Markov chain, denoted $P\left( {0,\left[ n \right]\left| 0 \right.} \right)$, are given by

$$P\left( 0,\left[ n \right] \mid 0 \right) = \left\{ \begin{aligned} & \left( \prod\limits_{i = 1}^{n} P_{\left[ i \right]0} \right) P_{0\left[ n \right]0}, & n \leqslant k - 1 \\ & \left( \prod\limits_{i = 1}^{k} P_{\left[ i \right]0} \right) P_{\left[ k + 1 \right]}^{\,n - k}\, P_{0\left[ k \right]}, & n \geqslant k \\ \end{aligned} \right.$$

(12)

where $[n]$ denotes the wet state (1) repeated $n$ times. For example, the conditional probability of two consecutive wet days can be written as $P\left( {011\left| 0 \right.} \right)$.
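As a worked illustration of Eq. (12), consider the first-order chain ($k = 1$). Reading the subscripts with the most recent day first, as in Eq. (1), the case $n \geqslant k$ applies to every spell length and reduces to

$$P\left( 0,\left[ n \right] \mid 0 \right) = P_{10}\, P_{11}^{\,n - 1}\, P_{01},\quad P_{01} = 1 - P_{11}$$

so that, for two consecutive wet days, $P\left( 011 \mid 0 \right) = P_{10} P_{11} P_{01}$. This is a sketch derived from the definitions above rather than an expression quoted from the source.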

In order to compute the expected number of wet spells of each length, say 1, 2, 3, …, n days, the conditional probabilities obtained from Eq. (12) are multiplied by the total number of dry days. The Kolmogorov-Smirnov goodness-of-fit test is employed to compare the observed distribution with the expected distributions of the wet spells for the first few orders of the Markov chain models. The maximum absolute difference between the cumulative observed and expected frequencies under the assumed Markov chain model is computed, and if this value is less than or equal to the critical value $D_{0.05}$, the model is considered most appropriate for describing the distribution of the wet spells. A similar approach is applied to the sequences of dry days.
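The comparison described above can be sketched for the first-order case as follows. The expected counts use Eq. (12) with $k = 1$ multiplied by the number of dry days. Comparing cumulative relative frequencies (each series divided by its own total) is an assumption of this sketch, as are the function names and the sample series; spells cut off by the ends of the record are counted as they stand.

```python
from itertools import groupby


def spell_lengths(x, state=1):
    """Lengths of all runs of `state` (runs cut by the record ends are kept)."""
    return [len(list(run)) for value, run in groupby(x) if value == state]


def expected_wet_spells_mc1(x, max_len):
    """Expected wet-spell counts under a first-order chain (Eq. 12 with k = 1)."""
    pairs = list(zip(x, x[1:]))
    n_after_dry = sum(1 for prev, _ in pairs if prev == 0)
    n_after_wet = sum(1 for prev, _ in pairs if prev == 1)
    p10 = sum(1 for prev, curr in pairs if prev == 0 and curr == 1) / n_after_dry
    p11 = sum(1 for prev, curr in pairs if prev == 1 and curr == 1) / n_after_wet
    n_dry = x.count(0)
    return [n_dry * p10 * p11 ** (n - 1) * (1 - p11) for n in range(1, max_len + 1)]


def ks_statistic(observed, expected):
    """Largest absolute gap between the two cumulative relative frequencies."""
    obs_total, exp_total = sum(observed), sum(expected)
    obs_cum = exp_cum = d = 0.0
    for o, e in zip(observed, expected):
        obs_cum += o / obs_total
        exp_cum += e / exp_total
        d = max(d, abs(obs_cum - exp_cum))
    return d


x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0]  # hypothetical occurrence series
lengths = spell_lengths(x, state=1)
max_len = max(lengths)
observed = [lengths.count(n) for n in range(1, max_len + 1)]
expected = expected_wet_spells_mc1(x, max_len)
d = ks_statistic(observed, expected)
print(observed, [round(e, 2) for e in expected], round(d, 3))
```

The statistic would then be compared with the critical value $D_{0.05}$. A common large-sample approximation for the one-sample test is $1.36/\sqrt{N}$, with $N$ the number of observed spells; whether the source used this approximation or tabulated values is not stated.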

6 Results and discussion

6.1 Characteristics of the sequence of wet (dry) days

The conditional probabilities of a wet (dry) event at each station are estimated for both threshold levels and averaged by region and monsoon season. Table 2 shows that, for both monsoon seasons and both threshold levels, the conditional probability of a wet day given that the previous day was wet, $P_{11}$, is substantially higher than the probability of a wet day, $P_1$, at all the stations. When the persistence of wet spells is compared between the two thresholds, it is found to be higher for the threshold of 0.1 mm. For the 10 mm threshold, the difference between successive conditional probabilities of wet days reduces dramatically and becomes negligible after about 2 or 3 days, whereas for the 0.1 mm threshold this happens after about 4 or 5 days. Similarly, the conditional probabilities of dry events show that $P_{00}$ is substantially higher than $P_0$. The difference between successive conditional probabilities of dry days reduces progressively and becomes negligible after about 5 days for the 0.1 mm threshold and after about 2 or 3 days for the 10.0 mm threshold. As expected, the lower threshold produces slightly longer wet (dry) periods. The analysis of intense wet spells (threshold 10 mm) shows that the conditional probabilities of the wet spells, $P_{11}$, $P_{111}$, $P_{1111}$ and $P_{11111}$, are lower than those of the dry spells, $P_{00}$, $P_{000}$, $P_{0000}$ and $P_{00000}$, during both monsoon seasons in all regions. In addition, $P_{1111}$ and $P_{11111}$ are found to decrease over the southwestern, eastern and central parts of Peninsular Malaysia during the Southwest monsoon and also in the northwestern areas during the Northeast monsoon. This may be because intense wet spells seldom last for more than four consecutive days in most regions of the peninsula. It is therefore reasonable to take the fourth order ($r = 4$) as the maximum order of the Markov process. According to Chin (1977), the occurrence of wet (dry) events is highly unlikely to have more than fourth-order conditional dependence.

Table 2 Persistency characteristics of daily rainfall occurrence for both monsoons at five regions in Peninsular Malaysia during the period of records for both levels of threshold, 0.1 mm (a) and 10.0 mm (b). NW northwest; SW southwest; E east; C central; W west


The persistency of the wet (dry) events is explored further using Besson’s coefficient of persistence (BCP). For the Northeast monsoon, the results in Table 2 indicate that the BCP value for the stations in the eastern region is the highest (0.74) among the regions; however, this region shows the lowest value, 0.15, during the Southwest monsoon for both threshold levels. For the lower threshold, the average lengths of wet and dry spells during both monsoon seasons in all regions range from 3 to 4 days and from 3 to 5 days, respectively. For the higher threshold, however, the average length of dry spells, 6–12 days, is much longer than that of the wet spells, 1–2 days. The results indicate that dry sequences during the Northeast monsoon persist much longer in the northwestern areas than in the other regions. These results agree with the findings of Dale (1960), Camerlengo (1999) and Deni et al. (2008), who found that dry spells were largely dependent on latitude, were longer and more frequent in the north than in the south, and were more frequent during the Northeast monsoon.

6.2 The optimum order of Markov chain models

In determining the optimum order of the Markov chain models for a 5-day sequence of daily rainfall occurrence, the minimum loss function obtained from the two decision criteria, the AIC and BIC, is applied. As an example, Table 3 shows that the loss functions obtained from the AIC and BIC (see Eqs. 8 and 9) consistently identify the third order as optimum at almost all the stations for the 0.1 mm threshold. However, for the higher threshold of 10.0 mm, the optimum order based on the BIC for the stations in the northwestern and western areas is slightly lower than that based on the AIC. For the rest of the analysis, Table 4 summarizes the optimum order for a 5-day sequence of wet days in each region for both threshold levels and both monsoon seasons. The optimum order obtained from both decision criteria varies with season, region and threshold. Generally, for both the AIC and BIC and both thresholds, a Markov chain of higher order is found to be optimum at most stations during the Northeast monsoon. Moreover, regardless of the monsoon season, for the two thresholds, the results show that a Markov chain model of order higher than one is optimum at all stations in the peninsula. However, for the Southwest monsoon, the zero-order and first-order models are found to be optimum at most of the stations in the northwestern and eastern areas when the higher threshold of 10 mm is considered. It is also notable that the optimum order estimated by the AIC is greater than or equal to that estimated by the BIC at most stations for both thresholds. This finding is consistent with Katz (1981), who claimed that the optimum order obtained from the AIC was inconsistent and tended to overestimate the true optimum order, whereas the BIC produced more consistent results.

Table 3 The loss function obtained from AIC (a) and BIC (b) for the Markov chain model of $k$th order, with r = 4


Table 4 The optimum order of Markov chain models for the 5-day sequence at five regions in Peninsular Malaysia over the period of records. NW northwest; SW southwest; E east; C central


6.3 The most appropriate order for the wet (dry) spells

The spatial distribution of the most appropriate order of Markov chain models for explaining the wet (dry) spells (see section Fitting wet (dry) spells using various order of Markov chain models) over Peninsular Malaysia is identified through the Kolmogorov-Smirnov goodness-of-fit test. Figures 2 and 3 show the observed and expected frequencies of the dry and wet spells, respectively, at each of four selected stations for both threshold levels, based on the most appropriate of the first four orders of the Markov chain models, identified as MC1, MC2, MC3 and MC4. The expected frequencies of dry and wet spells obtained from the most appropriate order of the Markov chain models describe the observed distributions closely. Markov chain models of higher order are found to be more appropriate for describing the distribution of dry spells at the four selected stations for both threshold levels, as shown in Fig. 2. Meanwhile, Fig. 3 indicates that the first and second orders are the most appropriate for describing the distribution of wet spells at each of the four selected stations.

Figures 4 and 5 display the most appropriate order for the distribution of the dry and wet spells, respectively, at each of the selected rainfall stations in Peninsular Malaysia. As shown in Fig. 4, the Markov chain of higher order is generally appropriate for describing the distribution of dry spells during the Northeast monsoon at most stations, except for those at Subang and Petaling. However, during the Southwest monsoon, for both threshold levels, the first-order model is found appropriate at all stations except the four located in Subang, Sitiawan, Ipoh and Temerloh. Regardless of the monsoon season, the results for both threshold levels consistently show that the Markov chain of higher order is appropriate for representing the distribution of dry spells at all stations in Peninsular Malaysia.

Figure 5 displays the most appropriate of the first four orders of the Markov chain models for representing the distribution of wet spells at each of the rainfall stations considered in this study. For the higher threshold, Fig. 5d–f indicates that the first-order model is the most appropriate for representing the distribution of intense wet spells at most of the stations over the peninsula. Moreover, for the lower threshold of 0.1 mm, the Markov chain of higher order is found to be more appropriate in the eastern, western and central parts of the peninsula, regardless of the monsoon season. Generally, the first-order Markov chain model is more appropriate for describing the distribution of the wet spells in most regions during the Northeast and Southwest monsoons, except in the western areas for the lower threshold of 0.1 mm. It can be concluded that the Markov chain of higher order is more appropriate for the dry spells than for the wet spells. Similar findings were reported by Dahale et al. (1994).

7 Conclusions

Information on wet (dry) behavior is vitally important to water resource management and can be used for the prediction of extreme weather events. The analysis of the persistency of wet (dry) spells provides information on the occurrence of a wet or dry event given the state of the previous day. The results of the analysis provide a more complete description of rainfall occurrence behavior in Peninsular Malaysia. In the present study, Markov chain models of various orders were applied to the daily rainfall data in order to identify the optimum order of the model and the order best able to represent the wet (dry) spells, considering the geographical regions, threshold levels and monsoon seasons.

It can be concluded from this study that the optimum order of the Markov chain models for the 5-day sequence of daily rainfall occurrence in Peninsular Malaysia varies with the threshold level, the region and the monsoon season. Generally, the optimum order obtained using the AIC is higher than that obtained using the BIC. The findings support the simulation results of Katz (1981), which indicated that the AIC produced inconsistent results compared to the BIC. In addition, Jimoh and Webster (1996) claimed that the AIC tended to overestimate the optimum order. Moreover, an order greater than one is more appropriate for Peninsular Malaysia for both thresholds, regardless of the monsoon season and of the decision criterion used, AIC or BIC. Generally, a lower order of the Markov chain models is found to be optimum in the northwestern and eastern areas during the Southwest monsoon season for the higher threshold of 10 mm.

However, the results show that the optimum orders for the Northeast monsoon are equal to or higher than those for the Southwest monsoon for both threshold levels. The present study appears to agree with Dahale et al. (1994), who claimed that the closer a station is to the equator, the higher the order of the model. Their findings indicated that the southern stations in Thailand and the equatorial Pacific, which are closer to the equator, showed a higher order of the Markov chain. Consistent with the higher-order Markov chain models being optimum for the Northeast monsoon, the present study shows that for the dry spells an order higher than one is also more appropriate at most of the stations in the peninsula for this monsoon season. This also supports the analysis of persistency, which indicates that the BCP for the 0.1 mm threshold is higher during the Northeast monsoon than during the Southwest monsoon. Thus, the results may provide evidence that the higher the order of the chain, the more of the persistence of wet (dry) events it accounts for. On the other hand, for the wet spells, the first-order Markov chain model is the most appropriate at most of the stations over the peninsula for both monsoon seasons and threshold levels.

Further analysis could include other rainfall stations in Malaysia so that more valuable information could be obtained for the purpose of prediction. In future work, the various orders of the Markov models will be compared with alternative rainfall occurrence models to identify the most appropriate probability models for representing the distribution of wet (dry) spells over Peninsular Malaysia, taking into account the monsoon seasons and other climatic issues such as global warming or climate change.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Contr 19:716–723
Anagnostopoulou CHR, Maheras P, Karacostas T, Vafiadis M (2003) Spatial and temporal analysis of dry spells in Greece. Theor Appl Climatol 74(1–2):77–91
Basu AN (1971) Fitting of a Markov chain model for daily rainfall data at Calcutta. Indian J Meteorol Geophys 22:67–74
Berger A, Goosens C (1983) Persistence of wet and dry spells at Uccle (Belgium). J Climatol 3:21–34
Besson L (1924) On the probability of rain. Mon Wea Rev 52:308
Camerlengo AL (1999) Monthly frequency distributions of both dry spells and the number of days with precipitation greater than 25 mm over Peninsular Malaysia. GEOACTA 23:1–18
Chapman TG (1997) Stochastic models for daily rainfall in the Western Pacific. Math Comput Simul 43(3):351–358
Chin EH (1977) Modeling daily precipitation occurrence process in Markov chain. Water Resour Res 13(6):949–956
Dahale SD, Panchawagh N, Singh SV, Ranatunge ER, Brikshavana M (1994) Persistence in Rainfall Occurrence over Tropical South East Asia and Equatorial Pacific. Theor Appl Climatol 49:27–39
Dale WL (1960) The rainfall of Malaya, part II. J Trop Geogr 14:11–28
Deni SM, Jemain AA, Ibrahim K (2008) The spatial distribution of wet and dry spells over Peninsular Malaysia. Theor Appl Climatol (in press). doi:10.1007/s00704-007-0355-8
Eriksson B (1965) A climatological study of persistency and probability of precipitation in Sweden. Tellus 17:484–497
Fleming SW (2007) Climatic influences on Markovian transition matrices for Vancouver daily rainfall occurrence. Atmos Ocean 45(3):163–171
Gabriel KR, Neumann J (1962) A Markov chain model for daily rainfall occurrence at Tel Aviv. Quart J Roy Meteorol Soc 88:90–95
Gates F, Tong H (1976) On Markov chain modeling to some weather data. J Appl Meteorol 15:1145–1151
Harrison M, Waylen P (2000) A note concerning the proper choice for Markov model order for daily precipitation in the humid tropics: a case study in Costa Rica. Int J Climatol 20:1861–1872
Hayhoe HN (2000) Improvements of stochastic weather data generators for diverse climates. Clim Res 14:75–87
Hui W, Xuebin Z, Elaine MB (2005) Stochastic modeling of daily precipitation for Canada. Atmos Ocean 43(1):23–32
Jimoh OD, Webster P (1996) Optimum order of Markov chain for daily rainfall in Nigeria. J Hydrol 185:45–69
Katz RW (1981) On some criteria for estimating the order of Markov chain. Technometrics 23:243–249
Kottegoda NT, Natale L, Raiteri E (2004) Some considerations of periodicity and persistence in daily rainfalls. J Hydrol 296:23–37
Lim JT (1976) Rainfall minimum in Peninsular Malaysia during the Northeast monsoon. Mon Wea Rev 104:96–99
Martin-Vide J, Gomez L (1999) Regionalization of Peninsular Spain based on the length of dry spells. Int J Climatol 19:537–555
Moon SE, Ryoo SB, Kwon JG (1994) A Markov chain model for daily precipitation occurrence in South Korea. Int J Climatol 14(9):1009–1016
Quah LC, Ooi MK (1985) Wet and dry spell probability models in West Malaysia. Research Publication no. 14. Malaysian Meteorological Service, Petaling Jaya, Malaysia
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Singh SV, Kripalani RH, Priya S, Ismail PMM, Dahale SD (1981) Persistence in daily and 5-day summer monsoon rainfall over India. Arch Meteor Geophys Bioclim Ser A 30:261–277
Stern RD, Coe R (1984) A model fitting analysis of daily rainfall data. J R Stats Soc A 147(1):1–34
Tolika K, Maheras P (2005) Spatial and temporal characteristics of wet spells in Greece. Theor Appl Climatol 81(1–2):71–85
Tong H (1975) Determination of the order of a Markov chain by Akaike’s information criterion. J Appl Prob 12:488–497
Wilks DS (1995) Statistical methods in atmospheric sciences: an introduction. Academic, London
Wilks DS (1999) Interannual variability and extreme-value characteristics of several stochastic daily precipitation models. Agric For Meteorol 93(3):153–169

Acknowledgements

The authors are indebted to the staff of the Malaysian Meteorological Services for providing the daily rainfall data for this study. They also express their sincere appreciation to both reviewers for their valuable suggestions and remarks, which helped to improve this manuscript.

Author information

Authors and Affiliations

Center of Statistical Studies, Faculty of Information Technology and Quantitative Science, Universiti Teknologi MARA (UiTM), 40450, Shah Alam, Selangor, Malaysia
Sayang Mohd Deni
School of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia (UKM), 43600, Bangi, Selangor, Malaysia
Abdul Aziz Jemain & Kamarulzaman Ibrahim

Corresponding author

Correspondence to Sayang Mohd Deni.

About this article

Cite this article

Deni, S.M., Jemain, A.A. & Ibrahim, K. Fitting optimum order of Markov chain models for daily rainfall occurrences in Peninsular Malaysia. Theor Appl Climatol 97, 109–121 (2009). https://doi.org/10.1007/s00704-008-0051-3

Received: 06 December 2007
Accepted: 12 July 2008
Published: 23 August 2008
Issue Date: June 2009
DOI: https://doi.org/10.1007/s00704-008-0051-3

