Abstract
The interpretations of trend behaviour for dry and wet events are analysed in order to verify the dryness and wetness episodes. The fitting distribution of rainfall is computed to classify the dry and wet events by applying the standardised precipitation index (SPI). The rainfall amount for each station is categorised into seven categories, namely extremely wet, severely wet, moderately wet, near normal, moderately dry, severely dry and extremely dry. The computation of the SPI is based on the monsoon periods, which include the northeast monsoon, southwest monsoon and inter-monsoon. The trends of the dry and wet periods were then detected using the Mann–Kendall trend test and the results indicate that the major parts of Peninsular Malaysia are characterised by increasing droughts rather than wet events. The annual trends of drought and wet events of the randomly selected stations from each region also yield similar results. Hence, the northwest and southwest regions are predicted to have a higher probability of drought occurrence during a dry event and not much rain during the wet event. The east and west regions, on the other hand, are going through a significant upward trend that implies lower rainfall during the drought episodes and heavy rainfall during the wet events.
1 Introduction
Precipitation or rainfall is the primary factor which controls the formation and persistence of droughts and floods. A drought is characterised by a deficiency of water supply over an extended period of time, while a flood is an overflow of water that submerges the land. Droughts and floods are extreme events which may adversely affect social, economic, political, cultural and other functions of a region. Therefore, there have been many studies conducted on extreme events, particularly in characterising the events into dry and wet categories (Sirdas and Sen 2001; Deni et al. 2009; Zhang et al. 2009).
Meteorological droughts, defined as a lack of precipitation in a region over a period of time, are the main focus of this paper. The standardised precipitation index (SPI) (e.g. Bordi et al. 2009) is the most commonly used method to reveal a meteorological drought and is also a successful tool in the estimation of the intensity and duration of drought events. The SPI has been widely used to quantify the precipitation deficit in terms of the probability for multiple time scales, which are designed to reflect the impacts of precipitation deficits on different water resources (McKee et al. 1993). SPI is originally calculated for 3-, 6-, 12-, 24- and 48-month time scales. It is a classification system which is normalised so that drier and wetter climates can be represented in the same way (Sirdas and Sen 2001). As such, it can be used to monitor dry as well as wet periods where negative values indicate drought while positive values indicate wet conditions based on a dimensionless index of SPI (Sonmez et al. 2005).
An analysis of rainfall characteristics is an important component in managing water resources with the development of industrialisation as well as rapid growth in the population (Deni et al. 2009). Therefore, it is of scientific and practical merit to better understand the varying characteristics of dryness and wetness for predicting and preventing disasters brought about by extreme events. The trend analysis for dry and wet spells is an important element for climate change issues. The Intergovernmental Panel on Climate Change has established a series of reports that summarise the observed climate changes and project future changes, which have emphasised that global warming is a serious issue for national and international security due to the wide spectrum of consequences to the resilience of the population support system (health, energy, water and food security), human security (population dislocation and armed conflict) and political continuity (continuity of governance and economic viability) (Department of Defense 2011). The trend characteristics or persistency will contribute to the prediction of future climatic events due to the dependency on extreme weather events such as drought, flood and landslides (Deni et al. 2009). These contributions will be beneficial for dealing with the global warming issue as the trend identification for dry and wet events is helpful in predicting the future drought and flood episodes in order to have an efficient system and planning in mitigating the negative impacts and reducing the global warming influences.
Hence, this study will focus on characterising the rainfall into dry and wet events using SPI in which the percentages of dryness and wetness could also be evaluated. The objectives of this study are: (1) to obtain the best-fitted distribution in representing the rainfall of Peninsular Malaysia; (2) to detect the properties of dry and wet episodes defined by SPI; (3) to determine the percentage of dry and wet events in each state of Peninsular Malaysia; and (4) to verify the trend for dry and wet events.
2 Study region and data
Peninsular Malaysia (100°E–104°E; 1°N–7°N), also known as West Malaysia, covers an area of about 131,598 km² (Fig. 1). The climate is influenced by two monsoons, namely the northeast monsoon from November to February and the southwest monsoon from May to August. Inter-monsoon periods (March to April and September to October) usually bring more rainfall in the central region.
The daily precipitation dataset, which covers the period from November 1975 to October 2008, was obtained from 75 rain gauge stations in Peninsular Malaysia. The data used are considered to be of good quality, with no missing values throughout the 33-year period. The rain gauge locations are categorised into four regions, namely east, southwest, west and northwest, according to their geographical coordinates. The categorisation is based on the significant variation among these four regions in elements such as the trend effects and the impacts of adjoining wet days (Suhaila and Jemain 2009; Deni et al. 2009). The locations of the rain gauge stations are shown in Fig. 1 and Table 1. The eastern and western regions are separated by the main range (Banjaran Titiwangsa), which runs from the far north to the south of Peninsular Malaysia. The main range has a significant influence on the spatial rainfall pattern: during the northeast monsoon more rain falls on the east coast, while the west coast lies in the rain shadow of the range. Likewise, the southwest monsoon brings more rain to the west coast (Suhaila and Jemain 2009).
3 Methodology
3.1 Types of probability distributions
Distributions are fitted to the rainfall amounts of all stations under study, and three candidate distributions are selected based on their relevance to SPI determination. The gamma distribution is generally assumed in SPI calculation (McKee et al. 1993), the Weibull distribution has been identified as a heavy-tailed distribution for precipitation fitting (Yusof and Hui-Mean 2012), while the lognormal distribution has been verified as the best-fitted distribution for SPI computation (Zhang et al. 2009).
The gamma distribution is a two-parameter family, with the probability density function (PDF) and cumulative distribution function (CDF) written as follows:
\( f(x)=\frac{{x^{\alpha -1}}{e^{-x/\beta }}}{{\beta^{\alpha }}\varGamma \left( \alpha \right)},\quad F(x)=\frac{\gamma \left( \alpha, x/\beta \right)}{\varGamma \left( \alpha \right)} \)
for x > 0 and α, β > 0, where \( \varGamma \left( \alpha \right)=\int\limits_0^{\infty } {{x^{{\alpha -1}}}{e^{-x }}dx} \) and \( \gamma \left( {s,x} \right)=\int\nolimits_0^x {{t^{s-1 }}{e^{-t }}dt} \) is the lower incomplete gamma function, α is the shape parameter, and β is the scale parameter.
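The Weibull distribution is a continuous probability distribution with the PDF and CDF written as follows:
\( f(x)=\frac{\alpha }{\beta }{\left( \frac{x}{\beta } \right)^{\alpha -1}}{e^{-{\left( x/\beta \right)}^{\alpha }}},\quad F(x)=1-{e^{-{\left( x/\beta \right)}^{\alpha }}} \)
for x > 0 and α, β > 0, where α is the shape parameter, and β is the scale parameter.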
The lognormal distribution is the probability distribution of a random variable whose logarithm is normally distributed. The PDF and CDF are written as:
\( f(x)=\frac{1}{x\sigma \sqrt{2\pi }}\exp \left( -\frac{{\left( \ln x-\mu \right)}^2}{2{\sigma^2}} \right),\quad F(x)=\frac{1}{2}erfc\left( -\frac{\ln x-\mu }{\sigma \sqrt{2}} \right) \)
for x > 0, μ ∈ R and σ² > 0, where \( erfc(x)=\frac{2}{{\sqrt{\pi }}}\int\nolimits_x^{\infty } {{e^{{-{t^2}}}}} dt \) is the complementary error function, μ is the location parameter, and σ² is the squared scale parameter.
3.2 Parameter estimation
The parameters in this study are estimated using maximum likelihood estimation (MLE), since it is a commonly used statistical method for fitting a statistical model to data.
Suppose x is a continuous random variable with PDF \( f\left( x;{\theta_1},{\theta_2},\ldots,{\theta_k} \right) \), where \( {\theta_1},{\theta_2},\ldots,{\theta_k} \) are k unknown constant parameters which need to be estimated from an experiment yielding N independent observations, \( {x_1},{x_2},\ldots,{x_N} \). The estimates of the parameters, \( {{\widehat{\theta}}_1},{{\widehat{\theta}}_2},\ldots,{{\widehat{\theta}}_k} \), can be obtained by setting the partial derivatives of the logarithmic likelihood function to zero:
\( \varLambda =\ln L=\sum\limits_{i=1}^N {\ln f\left( {x_i};{\theta_1},{\theta_2},\ldots,{\theta_k} \right)},\quad \frac{\partial \varLambda }{\partial {\theta_j}}=0 \)
where j = 1,2,…,k
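For the lognormal case the likelihood equations have a closed-form solution, which makes a compact illustration of the procedure. The following is a minimal sketch only: the function names and the sample values are hypothetical, and it assumes strictly positive rainfall amounts (dry days with zero rainfall are skipped because their logarithm is undefined).

```python
import math

def fit_lognormal_mle(amounts):
    """Closed-form MLE of (mu, sigma) from positive rainfall amounts."""
    logs = [math.log(x) for x in amounts if x > 0]  # skip zero-rain days
    n = len(logs)
    mu = sum(logs) / n
    # The MLE of the variance divides by N, not N - 1.
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / n)
    return mu, sigma

def lognormal_loglik(amounts, mu, sigma):
    """Log-likelihood of the positive amounts under lognormal(mu, sigma)."""
    total = 0.0
    for x in amounts:
        if x <= 0:
            continue
        y = math.log(x)
        total += (-y - math.log(sigma) - 0.5 * math.log(2 * math.pi)
                  - (y - mu) ** 2 / (2 * sigma ** 2))
    return total

# Hypothetical daily rainfall amounts in mm, for illustration only.
sample = [1.2, 5.4, 0.8, 12.6, 3.3, 7.1, 2.0, 25.4]
mu_hat, sigma_hat = fit_lognormal_mle(sample)
loglik_hat = lognormal_loglik(sample, mu_hat, sigma_hat)
```

Moving either parameter away from the estimate lowers the log-likelihood, which is a quick sanity check on any fitted model. The gamma and Weibull shape parameters have no closed form and would need a numerical solver instead.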
3.3 Goodness-of-fit tests
It is generally desirable to test the compatibility of a model and data by a statistical goodness-of-fit (GOF) test. GOF tests describe how well a distribution fits a set of observations, and GOF measures typically summarise the discrepancy between observed and expected values under the model in question. The best-fitted distribution is chosen based on the minimum error produced. These errors may be measured by the following techniques.
The Akaike Information Criterion (AIC) is derived by minimising the Kullback–Leibler distance between the proposed model and the actual model. The formula of AIC is given as follows:
\( \mathrm{AIC}=2k-2\ln \left( L \right) \)
where k is the number of parameters in the statistical model, and L is the maximised value of the likelihood function for the estimated model.
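As an illustration of how AIC ranks candidate models, the sketch below compares two of the candidate distributions on a small sample. It is an assumption-laden example rather than the study's procedure: the helper names and sample values are hypothetical, the Weibull shape parameter is found by simple bisection on its likelihood equation, and the gamma fit is omitted because it needs the digamma function, which the Python standard library does not provide.

```python
import math

def aic(loglik, k):
    """AIC = 2k - 2 ln(L), where loglik = ln(L) at the MLE."""
    return 2 * k - 2 * loglik

def lognormal_loglik_at_mle(xs):
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((y - mu) ** 2 for y in logs) / n
    # Maximised lognormal log-likelihood in closed form.
    return -sum(logs) - 0.5 * n * (math.log(2 * math.pi * var) + 1)

def weibull_loglik_at_mle(xs, lo=1e-3, hi=50.0, iters=200):
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mean_log = sum(logs) / n

    def score(a):
        # Likelihood equation for the shape parameter a; increasing in a.
        w = [x ** a for x in xs]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1 / a - mean_log

    for _ in range(iters):  # bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    a = 0.5 * (lo + hi)
    b = (sum(x ** a for x in xs) / n) ** (1 / a)  # scale given the shape
    return sum(math.log(a / b) + (a - 1) * math.log(x / b) - (x / b) ** a
               for x in xs)

# Hypothetical positive rainfall amounts in mm, for illustration only.
sample = [1.2, 5.4, 0.8, 12.6, 3.3, 7.1, 2.0, 25.4]
scores = {
    "lognormal": aic(lognormal_loglik_at_mle(sample), k=2),
    "weibull": aic(weibull_loglik_at_mle(sample), k=2),
}
best = min(scores, key=scores.get)  # smaller AIC is preferred
```

Because both candidates here have two parameters, the penalty term is identical and the ranking reduces to comparing maximised likelihoods; the penalty matters only when the models differ in k.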
The Kolmogorov–Smirnov (KS) test was used to determine the maximum difference between the hypothesised distribution and the empirical distribution. The KS test statistic is defined as:
\( D=\mathop{\max }\limits_{1\le i\le N}\left[ F\left( {x_i} \right)-\frac{i-1}{N},\ \frac{i}{N}-F\left( {x_i} \right) \right] \)
where \( {x_i} \) is the increasing ordered data, F is the theoretical cumulative distribution, and N is the sample size.
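A minimal sketch of the KS statistic for ordered data is given below, using a lognormal CDF as the hypothesised distribution. The function names, parameter values and sample are hypothetical and chosen only for illustration.

```python
import math

def lognormal_cdf(x, mu, sigma):
    """Lognormal CDF written with the complementary error function."""
    return 0.5 * math.erfc(-(math.log(x) - mu) / (sigma * math.sqrt(2)))

def ks_statistic(data, cdf):
    """Largest gap between the theoretical CDF and the empirical step CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just before and just after the step at x.
        d = max(d, f - (i - 1) / n, i / n - f)
    return d

# Hypothetical positive rainfall amounts (mm) and assumed parameters.
sample = [1.2, 5.4, 0.8, 12.6, 3.3, 7.1, 2.0, 25.4]
d_stat = ks_statistic(sample, lambda x: lognormal_cdf(x, 1.4, 1.1))
```

A smaller value indicates a closer fit, so candidate distributions can be ranked by this statistic when they are evaluated on the same sample.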
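The Anderson–Darling (AD) test was applied to compare the observed CDF with the expected CDF. The AD test statistic is defined as follows:
\( {A^2}=-N-\frac{1}{N}\sum\limits_{i=1}^N {\left( 2i-1 \right)\left[ \ln F\left( {X_i} \right)+\ln \left( 1-F\left( {X_{N+1-i}} \right) \right) \right]} \)
where F is the CDF of the specified distribution, \( {X_i} \) are the ordered data and N is the sample size.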
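The AD statistic gives more weight to the tails than the KS statistic, which can be seen from the logarithms in its standard form. The sketch below assumes that standard form; the function names, parameters and sample values are hypothetical.

```python
import math

def lognormal_cdf(x, mu, sigma):
    """Lognormal CDF written with the complementary error function."""
    return 0.5 * math.erfc(-(math.log(x) - mu) / (sigma * math.sqrt(2)))

def anderson_darling(data, cdf):
    """A^2 = -N - (1/N) * sum (2i-1) [ln F(X_i) + ln(1 - F(X_{N+1-i}))]."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        low = cdf(xs[i - 1])   # F(X_i), 1-based index i
        high = cdf(xs[n - i])  # F(X_{N+1-i})
        total += (2 * i - 1) * (math.log(low) + math.log(1 - high))
    return -n - total / n

# Hypothetical positive rainfall amounts (mm) and assumed parameters.
sample = [1.2, 5.4, 0.8, 12.6, 3.3, 7.1, 2.0, 25.4]
a2 = anderson_darling(sample, lambda x: lognormal_cdf(x, 1.4, 1.1))
```

The logarithms require the CDF to lie strictly between 0 and 1 at every observation, so the hypothesised distribution must assign positive probability on both sides of each data point.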
3.4 Standardised Precipitation Index
The SPI was designed by McKee et al. (1993) to quantify the precipitation deficit on multiple time scales. For SPI calculation, a long-term precipitation record at the desired station is first fitted to a probability distribution which can be obtained by applying GOF tests and then transformed into a normal distribution where the mean SPI is zero (McKee et al. 1993). A positive SPI indicates that the observed precipitation is greater than the mean precipitation, while a negative SPI indicates the contrary. It is also reported that the SPI can be used to monitor both dry and wet conditions (Morid et al. 2006).
The alternative method of SPI calculation with the precipitation data fitted to a gamma distribution for each time-scale can be expressed as a function involving the rainfall amount (\( {x_i} \)), the mean precipitation value (\( \overline{x} \)), and the standard deviation of precipitation (s) (McKee et al. 1993; Sirdas and Sen 2001; Bacanli et al. 2008; Khan et al. 2008):
\( \mathrm{SPI}=\frac{{x_i}-\overline{x}}{s} \)
The SPI with the precipitation data fitted to a lognormal distribution can be simplified as the difference between the logarithmic transformation of the dataset (ln(x)) and the sample mean of the transformed data (\( {{\widehat{\mu}}_y} \)), divided by the sample standard deviation (\( {{\widehat{\sigma}}_y} \)) (Zhang et al. 2009):
\( \mathrm{SPI}=\frac{\ln \left( x \right)-{{\widehat{\mu}}_y}}{{{\widehat{\sigma}}_y}} \)
The drought and wetness severity applied in this study is defined in Table 2. The sample mean and standard deviation that are used to normalise the probability distribution in determining the SPI values are based on the monsoon period, since the climate of Peninsular Malaysia is strongly affected by the monsoon seasons. Hence, the sample set is defined by the monsoon periods, specifically the northeast monsoon (November–February), southwest monsoon (May–August) and inter-monsoon (March–April and September–October), in order to determine the sample mean and standard deviation for SPI computation.
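The lognormal SPI with monsoon-based normalisation can be sketched as follows. Several things here are assumptions: the class thresholds are the conventional SPI cut-offs (±1.0, ±1.5, ±2.0) and may differ from those in Table 2, the function and variable names are hypothetical, the records are invented for illustration, and only positive rainfall amounts are handled.

```python
import math

# Monsoon period of each calendar month, as described in the text.
SEASON_OF_MONTH = {
    11: "northeast", 12: "northeast", 1: "northeast", 2: "northeast",
    5: "southwest", 6: "southwest", 7: "southwest", 8: "southwest",
    3: "inter", 4: "inter", 9: "inter", 10: "inter",
}

def classify(spi):
    """Assumed conventional SPI thresholds; Table 2 may differ."""
    if spi >= 2.0:
        return "extremely wet"
    if spi >= 1.5:
        return "severely wet"
    if spi >= 1.0:
        return "moderately wet"
    if spi > -1.0:
        return "near normal"
    if spi > -1.5:
        return "moderately dry"
    if spi > -2.0:
        return "severely dry"
    return "extremely dry"

def seasonal_spi(records):
    """records: list of (month, rainfall_mm) pairs with rainfall > 0."""
    logs = {}
    for month, rain in records:
        logs.setdefault(SEASON_OF_MONTH[month], []).append(math.log(rain))
    stats = {}
    for season, ys in logs.items():
        mean = sum(ys) / len(ys)
        # Sample standard deviation of the log-transformed data.
        sd = math.sqrt(sum((y - mean) ** 2 for y in ys) / (len(ys) - 1))
        stats[season] = (mean, sd)
    result = []
    for month, rain in records:
        mean, sd = stats[SEASON_OF_MONTH[month]]
        result.append((math.log(rain) - mean) / sd)
    return result

# Hypothetical (month, rainfall in mm) records, for illustration only.
records = [(11, 30.0), (12, 55.0), (1, 12.0), (5, 8.0),
           (6, 20.0), (7, 3.5), (3, 15.0), (9, 40.0)]
spi_values = seasonal_spi(records)
labels = [classify(v) for v in spi_values]
```

Each monsoon period needs at least two records with differing amounts, otherwise the standard deviation is undefined or zero.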
3.5 Mann–Kendall trend test
The Mann–Kendall trend test (Mann 1945; Kendall 1975) is used to measure the trend for drought and wet events with respect to SPI, and is measured by the correlation between the ranks of observations and their time sequences (Hamed 2009).
For a time-series \( \left\{ {x_t}:t=1,2,\ldots,n \right\} \), the test statistic S is calculated as:
\( S=\sum\limits_{i=1}^{n-1} {\sum\limits_{j=i+1}^n {{a_{ij}}}} \)
where \( {a_{ij}}=sign\left( {x_j}-{x_i} \right)=\left\{ {\begin{array}{*{20}c} {1;} \hfill & {{x_i} < {x_j}} \hfill \\ {0;} \hfill & {{x_i}={x_j}} \hfill \\ {-1;} \hfill & {{x_i} > {x_j}} \hfill \\ \end{array}} \right. \) and n is the sample size.
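The mean and variance of S are calculated under the assumption that the data are independent and identically distributed as follows:
\( E\left( S \right)=0,\quad Var\left( S \right)=\frac{n\left( n-1 \right)\left( 2n+5 \right)}{18} \)
The variance of S is reduced when there are tied ranks or equal observations in the data:
\( Var\left( S \right)=\frac{n\left( n-1 \right)\left( 2n+5 \right)-\sum\nolimits_{j=1}^m {{t_j}\left( {t_j}-1 \right)\left( 2{t_j}+5 \right)}}{18} \)
where m is the number of groups of tied ranks, each with \( {t_j} \) tied observations.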
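The standardised statistic (Z) for the two-tailed test is formulated as follows:
\( Z=\left\{ {\begin{array}{*{20}c} {\frac{S-1}{\sqrt{Var\left( S \right)}};} \hfill & {S > 0} \hfill \\ {0;} \hfill & {S=0} \hfill \\ {\frac{S+1}{\sqrt{Var\left( S \right)}};} \hfill & {S < 0} \hfill \\ \end{array}} \right. \)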
The null hypothesis of no trend is rejected if |Z| > 1.96 at the 5 % significance level.
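The test can be sketched in a few lines. This is an illustrative implementation of the standard Mann–Kendall formulation with tie correction and continuity-corrected Z, not the code used in the study; the function name and the example series are hypothetical.

```python
import math
from collections import Counter

def mann_kendall(series):
    """Return (S, Var(S), Z) for a sequence ordered in time."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of x_j - x_i
    # Each group of t tied values reduces the variance; t = 1 adds zero.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z

# Hypothetical yearly counts of dry events, for illustration only.
counts = [12, 15, 11, 18, 17, 21, 19, 24, 23, 27]
s, var_s, z = mann_kendall(counts)
significant = abs(z) > 1.96  # 5 % significance level
```

A positive Z indicates an increasing trend and a negative Z a decreasing one. The double loop is quadratic in the series length, which is adequate for annual or seasonal series of a few decades.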
3.6 Kriging method
Kriging is a geostatistical method which assumes that the distance or direction between sample points reflects a spatial correlation that can be used to explain the variation in the surface. Kriging fits a mathematical function to a specified number of points, or to all points within a specified radius, to determine the output value for each location. It is a multistep process that involves an exploratory statistical analysis of the dataset, variogram modelling and surface creation. The kriging weights depend on the distance between the measured points and the prediction location, and on the overall spatial arrangement of the measured points.
Kriging uses the surrounding measured values to derive a prediction for an unmeasured location. Hence, the trend pattern for the entire region of Peninsular Malaysia will be determined using the kriging method.
The general formula for the interpolator is a weighted sum of the data:
\( \widehat{Z}\left( {s_0} \right)=\sum\limits_{i=1}^N {{\lambda_i}Z\left( {s_i} \right)} \)
where \( Z\left( {s_i} \right) \) is the measured value at the ith location, \( {\lambda_i} \) is an unknown weight for the measured value at the ith location, \( {s_0} \) is the prediction location, and N is the number of measured values.
A general approach to solving the kriging system of equations is to apply the semivariogram function, γ(h) (Merino et al. 2001). The semivariogram can be estimated from the following equation:
\( \widehat{\gamma}\left( h \right)=\frac{1}{2n\left( h \right)}\sum\limits_{i=1}^{n\left( h \right)} {{\left[ Z\left( {s_i} \right)-Z\left( {s_{i+h}} \right) \right]}^2} \)
where \( Z\left( {s_i} \right) \) and \( Z\left( {s_{i+h}} \right) \) are the measured values of Z at the points i and i + h, respectively, with a separation distance h, and n(h) is the number of pairs of sample points grouped with similar separation distance.
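The sample semivariogram can be sketched by grouping all station pairs into distance bins and averaging half the squared differences within each bin. The binning scheme, function name and example coordinates below are assumptions made for illustration.

```python
import math

def empirical_semivariogram(points, values, lag_width, n_lags):
    """Return (lag centre, semivariance, pair count) for each non-empty bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])  # separation distance
            k = int(h // lag_width)              # index of the distance bin
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Hypothetical station coordinates and trend values, for illustration only.
stations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
z_values = [0.0, 1.0, 2.0]
bins = empirical_semivariogram(stations, z_values, lag_width=1.0, n_lags=3)
```

Bins with few pairs give noisy estimates, so the lag width is usually chosen wide enough that every bin used in model fitting contains a reasonable number of pairs.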
Semivariogram modelling fits parametric semivariogram models to the sample semivariogram to ensure unbiased results. The most common semivariogram models employed to describe the spatial variability of the variables are the linear, spherical, exponential and Gaussian models. The parameters (nugget, sill and range) that characterise the models are illustrated in Fig. 2, and the expressions of the semivariogram models are given in Table 3.
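To make the roles of the nugget, sill and range concrete, the sketch below evaluates a spherical model in its commonly used form. The parameterisation (nugget plus partial sill) is an assumption and may be written differently in Table 3; the function and argument names are hypothetical.

```python
def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Spherical model: rises from the nugget and levels off at the range."""
    if h <= 0:
        return 0.0  # semivariance is zero at zero separation
    if h >= range_:
        return nugget + partial_sill  # the sill, reached at the range
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

# Hypothetical parameters: nugget 0.1, partial sill 0.9, range 10 distance units.
curve = [spherical_semivariogram(h, 0.1, 0.9, 10.0) for h in range(0, 15)]
```

Beyond the range the model is flat, meaning that pairs of points further apart than this distance are treated as spatially uncorrelated.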
3.6.1 Cross validation
Cross validation is an application used to compare the estimated kriged values obtained from various semivariogram models with the actual measured values. This method uses a quantification of errors based on the semivariogram models associated with kriging application. The predicted value for a selected station is obtained by discarding the corresponding measured value from the whole dataset temporarily and calculating the particular prediction result based on the remaining dataset using kriging method. The errors produced are analysed using five summary statistics as stated below.
Mean error (ME) is the average difference between the measured values and the predicted values. The best fitting model is chosen based on the ME values that are closer to 0. The expression of ME is as follows:
\( \mathrm{ME}=\frac{1}{n}\sum\limits_{i=1}^n {\left[ \widehat{Z}\left( {s_i} \right)-Z\left( {s_i} \right) \right]} \)
Root mean square error (RMSE) indicates the accuracy of a model in predicting the measured values. Hence, a smaller error indicates a more accurate model. The RMSE is defined as:
\( \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum\limits_{i=1}^n {{\left[ \widehat{Z}\left( {s_i} \right)-Z\left( {s_i} \right) \right]}^2}} \)
Average standard error (ASE) is the average of the prediction standard errors associated with the estimated results. The ASE is described as:
\( \mathrm{ASE}=\sqrt{\frac{1}{n}\sum\limits_{i=1}^n {{{\widehat{\sigma}}^2}\left( {s_i} \right)}} \)
Mean standardised error (MSE) is the mean of the prediction errors divided by their prediction standard deviations. The MSE values should be closer to ASE values for a better model. The definition of MSE is shown as:
\( \mathrm{MSE}=\frac{1}{n}\sum\limits_{i=1}^n {\frac{\widehat{Z}\left( {s_i} \right)-Z\left( {s_i} \right)}{\widehat{\sigma}\left( {s_i} \right)}} \)
Root mean square standardised error (RMSSE) is the root mean square of the standardised prediction errors, where a result closer to 1 indicates a better-fitting model. The variability in the predictions is underestimated if RMSSE is greater than 1 and overestimated if RMSSE is smaller than 1. The RMSSE is expressed as:
\( \mathrm{RMSSE}=\sqrt{\frac{1}{n}\sum\limits_{i=1}^n {{\left[ \frac{\widehat{Z}\left( {s_i} \right)-Z\left( {s_i} \right)}{\widehat{\sigma}\left( {s_i} \right)} \right]}^2}} \)
where \( \widehat{Z}\left( {{s_i}} \right) \) is the predicted value of variable Z at the point \( {s_i} \), \( Z\left( {s_i} \right) \) is the measured value at the point \( {s_i} \), \( {{\widehat{\sigma}}^2}\left( {{s_i}} \right) \) is the variance of the estimated data \( \widehat{Z}\left( {{s_i}} \right) \), \( \widehat{\sigma}\left( {{s_i}} \right) \) is the standard deviation of the estimated data \( \widehat{Z}\left( {{s_i}} \right) \), and n is the number of measured values in the dataset.
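A minimal sketch of the five summary statistics is given below, following their conventional definitions with errors taken as predicted minus measured. It assumes the leave-one-out predictions and their standard errors have already been produced by a kriging routine; the function name and the numbers are hypothetical.

```python
import math

def cross_validation_summary(measured, predicted, std_errors):
    """Summary statistics of leave-one-out kriging prediction errors."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardised = [e / s for e, s in zip(errors, std_errors)]
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),
        "MSE": sum(standardised) / n,
        "RMSSE": math.sqrt(sum(z * z for z in standardised) / n),
    }

# Hypothetical measured values, predictions and prediction standard errors.
summary = cross_validation_summary(
    measured=[1.0, 2.0, 3.0],
    predicted=[2.0, 2.0, 2.0],
    std_errors=[1.0, 1.0, 1.0],
)
```

Computing the same dictionary for each semivariogram model gives a compact table from which the candidate models can be compared side by side.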
4 Results and discussions
4.1 Parameter estimation
The parameters for each distribution are estimated using MLE and the results are summarised in Tables 4, 5, 6 and 7 based on the northwest, east, southwest and west regions, respectively. These parameters will be applied to GOF tests in order to determine the best fitting distribution for SPI computation.
4.2 Goodness-of-fit tests
The three quantitative GOF tests discussed above select the best-fitted distribution as the one with the minimum error under each criterion. The lognormal distribution was found to have the smallest errors under all GOF criteria used in this study at all stations. Hence, the lognormal distribution is the most appropriate distribution to represent the daily rainfall amount in Peninsular Malaysia and the SPI calculation is based on the lognormal distribution. The summaries of the GOF tests for the northwest, east, southwest and west regions are given in Tables 8, 9, 10 and 11, respectively.
4.3 Application of Standardised Precipitation Index
The SPI calculation is based on the lognormal expression, as the lognormal has been determined as the most appropriate distribution to represent the rainfall pattern. The SPI values obtained are classified into seven categories: extremely wet, severely wet, moderately wet, near normal, moderately dry, severely dry and extremely dry. The percentage of dry and wet events for each category is then calculated with the following formula:
\( \mathrm{Percentage}=\frac{m}{n}\times 100\,\% \)
where m is the number of days in each SPI category and n is the total number of days.
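The percentage calculation amounts to counting the days in each category and dividing by the total number of days. A small sketch, with a hypothetical function name and invented labels:

```python
from collections import Counter

def category_percentages(labels):
    """Percentage of days falling in each SPI category."""
    n = len(labels)
    return {category: 100.0 * m / n for category, m in Counter(labels).items()}

# Hypothetical daily SPI categories for one station, for illustration only.
daily_labels = ["near normal", "near normal", "moderately dry", "near normal"]
percentages = category_percentages(daily_labels)
```

The percentages over all categories sum to 100, so the wet and dry totals can be compared directly by adding the three wet and the three dry classes.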
Table 12 shows the descriptive statistics of the percentage of occurrences of dry and wet events in each category. These results indicate that the average percentages of events ranging from extremely wet to extremely dry are distributed near normal, with a higher average for the extremely wet than for the extremely dry category. This is expected, since Peninsular Malaysia is a region where rainfall is abundant and received throughout the year. Nevertheless, drought is also a phenomenon that needs attention in some parts of the region. During the El Niño/Southern Oscillation event of 1997–1998, Malaysia experienced low levels of rainfall that led to drought conditions, with some regions suffering water disruptions from April to September 1998. Peninsular Malaysia also experienced a long dry spell in 2005 (NRE 2007). From the data, the total average percentage of wet events (EW, SW and MW) is 17.43 %, which is less than the total average percentage of dry events (ED, SD and MD) at 18.36 %. Therefore, the results show that dry events are as evident and significant as wet events, even though most of the regions receive rainfall all year round.
4.4 Cross validation of kriging interpolation
Tables 13 and 14 present the results of cross validation for kriging interpolation based on linear, spherical, exponential and Gaussian models. The best-fitted model is determined by near-zero values of ME, smaller values of RMSE, closer values between ASE and MSE, and values of RMSSE near 1. Based on the results, the spherical semivariogram model is selected to describe the kriging interpolation for both the dry and wet events. Hence, the kriging prediction for the trend test is based on the spherical semivariogram model.
4.5 Trend test
Figure 3 shows the Z values for drought occurrences that have been categorised into moderately, severely and extremely dry as defined by the SPI. The darker colour represents the more positive trend while the lighter colour implies the more negative trend. It can be observed that the eastern and western regions of Peninsular Malaysia are dominated by increasing trends, the majority of which have values that are significant at a more than 95 % confidence level. Significantly, the northwest and southwest regions are occupied almost equally by both the increasing and decreasing trends at a more than 95 % confidence level, with a slight propensity toward downward trend domination. The results imply that a large part of eastern and western regions are expected to have lower precipitation during drought episodes and even drier dry events. The northwest and southwest regions, on the other hand, are expected to have more drought occurrences but not as severe as those in the eastern and western regions. Nevertheless, the drought occurrences in the northwest and southwest regions are still significant, since there are still stations that have significant upward trends.
Figure 4 shows the Z values for wet seasons that are summarised for moderately, severely and extremely wet conditions as defined by the SPI. The darker colour indicates the more positive trend while the lighter colour represents the more negative trend. Significantly, the results demonstrate that the northwest and southwest regions are characterised by decreasing trends at a more than 95 % confidence level, while the eastern and western regions are dominated by increasing trends at more than 95 % confidence level. This means that the east and west regions are expected to experience heavy rainfall during the wet periods, which may cause flooding at certain areas in the region. On the other hand, the northwest and southwest regions are expected to have a decrease in wet events.
In order to describe clearly the trends of dry and wet events in Peninsular Malaysia, the annual trend of a randomly selected station at each region is plotted to represent the trend behaviour for the corresponding region for further justification.
Figures 5, 6, 7 and 8 indicate the time-series plots for the annual trend of the northwest, east, southwest and west regions. The results imply an obvious increasing trend for the drought events and a significant decreasing trend for the wet events for northwest and southwest regions. The trends generally show a significant upward or positive trend for the drought events, which are predicted to increase with time and these regions are expected to receive less precipitation during these events. The wet event exhibits a decreasing trend that may be interpreted as having lesser rainfall amount during this event. Hence, the northwest and southwest regions are predicted to have a higher probability of drought occurrence during dry events and not much rain during wet events.
The time-series plots also indicate that the eastern and western regions are experiencing a significant increase in both dry and wet events. These results imply that the east and west regions are going through a high percentage of significant upward trends that may be translated into the expectation of receiving lower rainfall during drought episodes and heavy rainfall during the wet events.
Therefore, from the results of the annual trend, the eastern and western regions are expected to experience an upward trend during the drought events and also an increasing trend during the wet events. However, for regions that are going through a downward trend, there is also the potential for either drought or flood threats in those regions, as these two extreme events are expected to happen anywhere in Malaysia.
5 Conclusions
The categorisation of rainfall events is important in order to predict and thus prevent meteorological disasters. Based on the distribution fitting, the lognormal distribution is recognised as the best-fitted distribution to represent the daily rainfall in Peninsular Malaysia. The SPI results suggest that there is a significant upward trend in dry events, especially for the east and west regions. On the other hand, for the wet events, the SPI showed a significant downward trend, except in the eastern and western parts. The drought occurrences show a statistically significant upward trend. This indicates that less precipitation is received during the dry events (which affect most of Peninsular Malaysia), and more precipitation is received during the wet events (in certain areas), with these patterns expected to intensify over time. The time-series plots confirm that the whole of Peninsular Malaysia is predicted to have an increasing trend during drought events, while the majority of its regions are expected to experience a decreasing trend for the wet events, except in the east and west parts. These results suggest that certain regions in Peninsular Malaysia are going through drier dry events and wetter wet events, especially the eastern and western regions. This would increase the possibility of drought and flood events in Malaysia. Although these two events cannot be prevented and may negatively affect society, the loss can be reduced through mitigation and planning. The results of this study could offer some information on the regions that require attention, because these disasters illustrate the vulnerability of economic, social, political and environmental systems to a variable climate.
References
Bacanli UG, Dikbas F, Baran T (2008) Drought analysis and a sample study of Aegean Region. Sixth International Conference on Ethics and Environmental Policies, Padova, 23–25 October
Bordi I, Fraedrich K, Sutera A (2009) Observed drought and wetness trends in Europe: an update. Hydrol Earth Syst Sci 13:1519–1530
Deni SM, Suhaila J, Zin WZW, Jemain AA (2009) Trends of wet spells over Peninsular Malaysia during monsoon seasons. Sains Malaysiana 38(2):133–142
Department of Defense (2011) Trends and implications of climate change for national and international security. Office of the Under Secretary of Defense, Washington
Hamed KH (2009) Exact distribution of the Mann–Kendall trend test statistic for persistent data. J Hydrol 365:86–94
Kendall MG (1975) Rank correlation methods. Griffin, London
Khan S, Gabriel HF, Rana T (2008) Standard precipitation index to track drought and assess impact of rainfall on watertables in irrigation areas. Irrig Drain Syst 22:159–177
Mann HB (1945) Nonparametric tests against trend. Econometrica 13:245–259
McKee TB, Doesken NJ, Kleist J (1993) The relationship of drought frequency and duration on time scale. Preprints, Eighth Conf on Applied Climatology, Anaheim, CA. Am Meteor Soc, Boston pp. 179–184
Merino GG, Jones D, Stooksbury DE, Hubbard KG (2001) Determination of semivariogram models to krige hourly and daily solar irradiance in Western Nebraska. J Appl Meteorol 40:1085–1094
Ministry of Natural Resources & Environment (NRE) (2007) Flood and drought management in Malaysia. Deraf Teks Ucapan Jabatan Pengairan & Saliran Malaysia
Morid S, Smakhtin V, Moghaddasi M (2006) Comparison of seven meteorological indices for drought monitoring in Iran. Int J Climatol 26:971–985
Sirdas S, Sen Z (2001) Application of the standardized precipitation index (SPI) to the Marmara Region, Turkey. Integrated Water Resources Management. IAHS Publ. no. 272.2001
Sonmez FK, Komusu AU, Erkan A, Turgu E (2005) An analysis of spatial and temporal dimension of drought vulnerability in Turkey using the standardized precipitation index. Nat Hazards 35:243–264
Suhaila J, Jemain AA (2009) Investigating the impacts of adjoining wet days on the distribution of daily rainfall amounts in Peninsular Malaysia. J Hydrol 368:17–25
Yusof F, Hui-Mean F (2012) Use of statistical distribution for drought analysis. Appl Math Sci 6(21):1031–1051
Zhang Q, Xu CY, Zhang Z (2009) Observed changes of drought/wetness episodes in the Pearl River Basin, China, using the standardized precipitation index and aridity index. Theor Appl Climatol 98:89–99
Acknowledgments
The authors are grateful to the Malaysian Meteorological Department and Malaysian Drainage and Irrigation Department for providing the daily precipitation data. The work is financed by the MyPhD Scholarship, provided by the Ministry of Higher Education of Malaysia and Universiti Teknologi Malaysia.
Cite this article
Yusof, F., Hui-Mean, F., Suhaila, J. et al. Rainfall characterisation by application of standardised precipitation index (SPI) in Peninsular Malaysia. Theor Appl Climatol 115, 503–516 (2014). https://doi.org/10.1007/s00704-013-0918-9