Evaluation of Stochastic Daily Rainfall Data Generation Models

Jaafar, J.; Baki, A.; Abu Bakar, I. A.; Tahir, W.; Awang, H.; Ismail, F.

doi:10.1007/978-981-10-0500-8_17


Abstract

In developing countries, data are usually scarce because data collection is expensive. An analytical method is therefore required to simulate actual conditions and provide synthetic data for forecasting purposes. This paper compares several methods of synthetically generating rainfall data from available records. Three model families are used to generate long-term stochastic daily rainfall data from the records of a catchment in Australia: the lag-one Markov chain model, the two-step model, and the transition probability matrices (TPM) model. Three variations of the lag-one Markov chain model were used: untransformed, logarithmic transformation, and square root transformation. The two-step model uses a Markov chain to model rainfall occurrences and a gamma distribution to model rainfall depths. Six variations of the TPM model were used, three adopting a shifted exponential distribution and three adopting the Box–Cox power transformation to predict the high rainfall depths, with the parameters determined by the maximum-likelihood method from the available rainfall data. The models' results were tested by comparing the statistics of the generated data against those of the available data. Direct comparisons of the means, standard deviations, and skews show satisfactory results. Further comparisons of monthly means, standard deviations, skews, maxima and minima, as well as the lengths of wet and dry spells, also showed satisfactory results. In conclusion, all the models produced synthetic rainfall data that are statistically similar to the available data. In comparison, the TPM model gave the most accurate results. Therefore, this model may be utilised for synthetic rainfall data generation, which can then be used for forecasting.

1 Introduction

Long-term data are desirable to enable asset managers to simulate the many possibilities, including flooding and extreme droughts. In developing countries, data are usually scarce because data collection is expensive. Generating synthetic data is one way to enable forecasts to be made, and stochastic data generation is one of the techniques available for producing such data. Rainfall is regarded as the most basic weather variable, independent of temperature and evaporation [16]. Hence, generation of long-term synthetic rainfall data can provide a basic set of weather variables for long-term forecasting.

A hydrological time series consists of two contributing factors: random factors and persistence (a stochastically deterministic factor) [26]. Stochastic modelling uses the stochastic properties of the observed time series to generate long-term time series. The statistical and stochastic properties of the observed time series are assumed to represent the population properties, and the synthetic long-term time series is assumed to come from the same population [10].

There are many stochastic data generation models. This paper compares several of them: the lag-one Markov chain model, the two-step model, and the transition probability matrices (TPM) model.

Lag-one Markov chain models are the most popular variations of rainfall data generation models [2, 24, 31]. Higher-order Markov chain models have also been utilised satisfactorily [12]. The major problem in daily rainfall generation using a single-step model of the runoff generation type (the lag-one Markov chain model of [2]) is the large number of zero values of daily rainfall. Richardson [22] used a square root transformation and a multivariate normal distribution truncated at zero to overcome the zeros problem. Baki [4] used logarithmic and square root transformations for the same purpose. Even so, the large number of zeros in the historical data remains an inherent problem, as it introduces skew. Nevertheless, Malek and Baki [17] successfully forecasted stochastic data for the Gombak River in Malaysia using non-transformed data.

The two-step model was developed by various researchers to separate the analysis of rainfall occurrence from that of rainfall depth. Jones et al. [16] and Adam [1] modelled occurrences of daily rainfall using a Markov chain. Wet spells, which are series of consecutive rainfall occurrences, and dry spells, which are series of consecutive non-occurrences, have also been satisfactorily modelled using Markov chains [19, 21, 25]. Baki [5] generated rainfall data with the two-step model, using a Markov chain for rainfall occurrences and a gamma distribution for rainfall depth. In general, Baki [5] achieved satisfactory results, with the statistics of the generated data comparable to those of the recorded data.

Haan et al. [14] and Taewechit et al. [29] used a multistate Markov chain approach to model the distribution of rainfall. Haan et al. [14] used seven states to describe rainfall behaviour based on rainfall depths. The first state is dry (no rain), and six others are wet (with rainfall). Uniform distributions were assumed for states 2–6 and a shifted exponential distribution for the seventh state (unbounded).

A modified TPM model was developed by Srikanthan and McMahon [26] based on the TPM model of Haan et al. [14]. The difference was that the daily rainfall data in the last class were transformed using the Box–Cox power transformation [8] instead of being fitted with a shifted exponential distribution. Srikanthan and McMahon [27] used the TPM model in their development of automatic evaluation of stochastically generated rainfall data. Srikanthan et al. [28] also used the TPM model in their comparison of daily rainfall data generation models. Baki [6] found that, in general, all six variations used (three sets of matrices using the shifted exponential distribution and three sets using the Box–Cox power transformation) were equally satisfactory, as the differences between them were minimal. This was consistent with past research, as Haan et al. [14] found that the number of classes did not affect the accuracy of the TPM model to a great extent. Therefore, the selection between the six variations is not very critical.

The objective of this paper is to compare the performance of those models in generating daily rainfall. Apart from comparing the daily statistics of the generated data to those of the recorded data, further comparisons will also be carried out using monthly and annual statistics, daily maxima, and average lengths of dry and wet spells. The comparison will enable identification of the model that will give the most accurate statistical comparisons between recorded and generated rainfall statistics.

2 Data and Methods

2.1 Data

The catchment selected for this study is Kangaroo Valley, which is located about 150 km south of Sydney and about 50 km west of the east coast of New South Wales, Australia. The map is shown in Fig. 1, and catchment characteristics are as listed below [3]:

The National Index reference is 215220.
The catchment area is 330 km².
The length of the stream (Kangaroo River) is 34.5 km.
The average slope of the Kangaroo River is 1.35 % or 135 in 10,000.
The annual rainfall for Kangaroo Valley is 1629.0 mm.
The annual runoff from the catchment is 934.2 mm.
The annual pan evaporation is 1773.4 mm.
The climatic condition for this catchment is temperate.
The vegetation in the area is a mixture of rainforest, hedgeland, sedgeland, and grassland.

A total of 80 years of daily rainfall data were used. Both regionalised and single-site approaches have been satisfactorily used in rainfall data generation. Benson and Matalas [7] used regionalised parameters in stochastic runoff data generation. Solomon [23] found regionalised parameters more suitable than single-site parameters because regionalisation reduced operational bias. Baki [4] found that continuity of data could be obtained by using the average rainfall for the catchment. Hernáez and Martin-Vide [15], Mehrotra et al. [18], and Camberlin et al. [9] used the regionalised approach to model rainfall data satisfactorily, while Mhanna and Bauwens [20] satisfactorily generated rainfall data using a single-site approach. In this study, the regionalised approach was adopted, using catchment daily average rainfall. Using the catchment average rainfall instead of individual stations is expected to allow better approximation of the stochastic properties and processes of rainfall.

The location of the catchment is shown in Fig. 1, and the locations of the rainfall stations are shown in its enlarged inset. Catchment average rainfall was computed using Thiessen polygons [30] constructed from the data available for each day. For a day with data from all rainfall stations, the Thiessen polygons were computed using all 6 rainfall stations (as shown in the inset of Fig. 1). For a day with missing data, the polygons were computed using the available stations only: for example, if station 1 data is missing, stations 2, 3, 4, 5, and 6 are used, and if data from stations 1 and 2 are missing, stations 3, 4, 5, and 6 are used. There is therefore a different set of polygons for each combination of missing stations.
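The day-by-day recomputation of the catchment average can be sketched as follows. This is only a minimal illustration under stated assumptions: the station names and coordinates, the unit-square "catchment", and the grid resolution are all hypothetical, and the Thiessen weights are approximated by assigning grid points to their nearest available station rather than by constructing the polygons geometrically.

```python
import math

def thiessen_weights(stations, grid_points):
    """Approximate Thiessen weights as the share of grid points nearest each station."""
    counts = {name: 0 for name in stations}
    for gx, gy in grid_points:
        nearest = min(
            stations,
            key=lambda s: math.hypot(stations[s][0] - gx, stations[s][1] - gy),
        )
        counts[nearest] += 1
    total = len(grid_points)
    return {name: count / total for name, count in counts.items()}

def catchment_average(rain, stations, grid_points):
    """Weighted catchment rainfall for one day; rain[s] is None when station s is missing."""
    available = {s: xy for s, xy in stations.items() if rain.get(s) is not None}
    if not available:
        return None  # no station reported on this day
    weights = thiessen_weights(available, grid_points)  # polygons rebuilt for this subset
    return sum(weights[s] * rain[s] for s in available)

# Hypothetical layout: four stations on a unit square, 20 x 20 grid of cell centres.
stations = {"S1": (0.25, 0.25), "S2": (0.75, 0.25), "S3": (0.25, 0.75), "S4": (0.75, 0.75)}
grid_points = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]

all_present = catchment_average({"S1": 10.0, "S2": 20.0, "S3": 30.0, "S4": 40.0}, stations, grid_points)
s1_missing = catchment_average({"S1": None, "S2": 20.0, "S3": 30.0, "S4": 40.0}, stations, grid_points)
```

Because the weights depend only on which stations reported, a practical implementation could cache one weight set per combination of available stations instead of recomputing it every day.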

Statistics of daily rainfall for this catchment are shown in Table 1. The overall mean and standard deviation of daily rainfall are 4.4 and 15.6 mm, respectively, while the skew and the coefficient of variation (both dimensionless) are 8.1 and 3.5, respectively. The ratio of skew to coefficient of variation is 2.3. For a gamma distribution this ratio is exactly 2, so a value close to 2 indicates that the gamma distribution can be used to approximate the rainfall distribution [5].

Table 1 Recorded daily rainfall statistics (after [4])

Figure 2 shows the serial correlation coefficient ($ r_{k} $) plotted against the corresponding lag (k). The lag-one value is $ r_{1} = 0.436 $, while the other $ r_{k} $ values are less than half of $ r_{1} $ [4]. Fisher [13] suggested $ r_{k} = 0.349 $ as the conventional minimum value for stochastic analysis of time series. The lag-one serial correlation coefficient exceeds this minimum for this catchment, while the other $ r_{k} $ values are much lower than it. Lag-one correlation was therefore adopted for this paper [4].
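A correlogram such as the one described above can be computed with a short routine like the one below. It is a sketch only: the text does not state which estimator of $ r_{k} $ was used, so the common form that divides the lag-k sum of products by the total sum of squares is assumed here, and the function name is illustrative.

```python
def serial_correlation(x, k):
    """Lag-k serial correlation coefficient of the sequence x (assumed estimator)."""
    n = len(x)
    mean = sum(x) / n
    # Sum of products of deviations k steps apart.
    numerator = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    # Total sum of squared deviations.
    denominator = sum((value - mean) ** 2 for value in x)
    return numerator / denominator

def correlogram(x, max_lag):
    """Return [r_1, ..., r_max_lag] for plotting against the lag."""
    return [serial_correlation(x, k) for k in range(1, max_lag + 1)]
```

The resulting values can then be compared with a chosen threshold (0.349 in the text) to decide how many lags the generation model needs to carry.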

Figure 3 shows the plot of annual rainfall values [4]. No apparent trend can be observed in the values of annual rainfall for this catchment. Therefore, the random variations are assumed to continue in the future. The stochastic rainfall data generation is therefore assumed to be able to reproduce these random variations [4].

2.2 Lag-One Markov Chain Model

Baki [4] used lag-one Markov chain model in modelling daily rainfall. Earlier applications of lag-one Markov chain model were by Adamowski and Smith [2] and Richardson [22].

In the study by Baki [4], the daily recorded rainfall values were standardised as follows:

$$ z_{i} = \frac{{\left( {x_{i} - \overline{{x_{i} }} } \right)}}{{\sigma_{i} }} $$

(1)

where $ z_{i} $ is the standardised daily rainfall for day i, with zero mean and unit standard deviation; $ x_{i} $ is the daily rainfall (mm) for day i; $ \sigma_{i} $ is the standard deviation (mm) for day i; and $ \overline{{x_{i} }} $ is the average daily rainfall (mm) for day i, where i ranges from 1 to 366 (including leap years).

The generated rainfall data is given by:

$$ z_{i} = r_{i} z_{i - 1} + t_{i} \sqrt {\left( {1 - r_{i}^{2} } \right)} $$

(2)

which gives:

$$ x_{i} = \overline{{x_{i} }} + \sigma_{i} \left[ {r_{i} z_{i - 1} + t_{i} \sqrt {\left( {1 - r_{i}^{2} } \right)} } \right] $$

(3)

where $ x_{i} $ is the generated rainfall on day i (mm); $ \overline{{x_{i} }} $ is the mean recorded daily rainfall of day i (mm); $ \sigma_{i} $ is the standard deviation of recorded daily rainfall on day i (mm); $ r_{i} $ is the lag-one serial correlation for the whole record; $ z_{i - 1} $ is the standardised rainfall on day i − 1; and $ t_{i} $ is a normally distributed random number with zero mean and unit variance.

Baki [4] used three variations of the lag-one Markov chain model: untransformed data (referred to as QT), logarithmically transformed data (referred to as LOG), and square-root-transformed data (referred to as SQR). The results of all three are used in the comparison.
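The untransformed generation step of Eqs. (2) and (3) can be sketched as below. This is an illustration rather than the implementation used in [4]: the function name is hypothetical, the overall statistics from Table 1 are reused for every calendar day for brevity, and negative generated values are simply truncated to zero, which is an assumption the text does not confirm.

```python
import math
import random

def generate_lag_one(daily_mean, daily_sd, r1, n_days, seed=0):
    """Generate daily rainfall (mm) with a lag-one Markov chain, Eqs. (2) and (3)."""
    rng = random.Random(seed)
    z_prev = 0.0  # standardised rainfall of the previous day
    series = []
    for day in range(n_days):
        i = day % len(daily_mean)          # calendar-day index
        t = rng.gauss(0.0, 1.0)            # standard normal random number
        z = r1 * z_prev + t * math.sqrt(1.0 - r1 ** 2)   # Eq. (2)
        x = daily_mean[i] + daily_sd[i] * z              # Eq. (3)
        series.append(max(x, 0.0))         # assumption: negative depths set to zero
        z_prev = z
    return series

# Illustrative call: overall mean 4.4 mm, standard deviation 15.6 mm, r1 = 0.436.
sample = generate_lag_one([4.4] * 366, [15.6] * 366, 0.436, n_days=732, seed=1)
```

For the LOG and SQR variations, the same recursion would be applied to the transformed data and the result back-transformed, with the means and standard deviations computed on the transformed scale.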

2.3 Two-Step Model

The large number of zero values of daily rainfall causes problems when a single-step model of the runoff generation type is used to generate daily rainfall data. The two-step model was developed to separate the analysis of rainfall occurrence from that of rainfall depth. Baki [5] applied the two-step model by modelling the occurrences of rainfall with transition probabilities between two classes of events (dry days and wet days). The transition probabilities between the two classes follow a Markov chain.

The gamma distribution can be used to model rainfall depths during wet days. Table 1 shows that the ratio of daily skew coefficient to coefficient of variation ($ \gamma /C_{v} $) of the recorded data is 2.3, which is close to 2. Baki [5] adopted the gamma distribution since the data he used had a ratio ($ \gamma /C_{v} $) close to 2. This distribution is also utilised by Jones et al. [16] and Carey and Haan [11].

The gamma distribution is given by:

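$$ F(x \mid k) = \int\limits_{0}^{x} {\frac{{\lambda_{ik}^{{\eta_{ik} }} }}{{\Gamma \left( {\eta_{ik} } \right)}}} u^{{\eta_{ik} - 1}} \exp \left( { - \lambda_{ik} u} \right){\text{d}}u $$

(4)

where u is the variable of integration, and $ \lambda_{ik} $ and $ \eta_{ik} $ are the parameters of the distribution. A rainfall depth x is generated by setting $ F(x \mid k) $ equal to U, a uniformly distributed random number between 0 and 1, and solving for x.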

The parameters λ and η can be estimated by maximum likelihood, as done by Carey and Haan [11] in their study. The shape parameter is first approximated by:

$$ \eta^{*} = \frac{{0.5000876 + 0.1648852y - 0.0544274y^{2} }}{y} $$

(5)

in which

$$ y = \ln \left( {\sum\limits_{i = 1}^{n} {\frac{{v_{i} }}{n}} } \right) - \sum\limits_{i = 1}^{n} {\frac{{\ln v_{i} }}{n}} $$

(6)

where $ v_{i} $ is the ith observation from a sample of n observations.

Correction for small sample bias can then be made as follows:

$$ \eta = \frac{{(n - 3)\eta^{*} }}{n} $$

(7)

The estimate for λ can then be made:

$$ {\lambda = }\frac{\eta }{{\sum\nolimits_{i = 1}^{n} {\frac{{v_{i} }}{n}} }} $$

(8)
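The estimation steps in Eqs. (5) to (8) can be sketched as a single function. The sketch assumes the widely quoted polynomial coefficients 0.5000876, 0.1648852, and −0.0544274 for Eq. (5), an approximation usually stated for 0 ≤ y ≤ 0.5772; the function name and the synthetic test sample are hypothetical.

```python
import math
import random

def fit_gamma(wet_depths):
    """Estimate gamma shape (eta) and rate (lam) from wet-day depths, Eqs. (5)-(8)."""
    n = len(wet_depths)
    mean = sum(wet_depths) / n
    # Eq. (6): log of the arithmetic mean minus the mean of the logs.
    y = math.log(mean) - sum(math.log(v) for v in wet_depths) / n
    # Eq. (5): polynomial approximation of the maximum-likelihood shape (assumed coefficients).
    eta_star = (0.5000876 + 0.1648852 * y - 0.0544274 * y * y) / y
    # Eq. (7): correction for small-sample bias.
    eta = (n - 3) * eta_star / n
    # Eq. (8): rate parameter from the shape and the sample mean.
    lam = eta / mean
    return eta, lam

# Synthetic check: sample drawn with shape 2.0 and scale 5.0 (rate 0.2).
rng = random.Random(1)
synthetic = [rng.gammavariate(2.0, 5.0) for _ in range(5000)]
eta_hat, lam_hat = fit_gamma(synthetic)
```

Only strictly positive depths should be passed in, since Eq. (6) takes the logarithm of each observation.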

Baki [5] used the two-step model with a first-order Markov chain to model occurrences of rainfall and a gamma distribution to generate rainfall depths during wet days. The parameters of the gamma distribution were estimated from the recorded wet days. The results from that study are used in the comparison (referred to as TS).
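A minimal sketch of the two-step idea is given below: a two-state (dry/wet) first-order Markov chain decides whether rain occurs, and a gamma variate supplies the depth on wet days. The transition probabilities and gamma parameters in the example call are hypothetical, and a single parameter set is used for the whole year, whereas an actual application would likely vary them seasonally.

```python
import random

def generate_two_step(p_wet_after_dry, p_wet_after_wet, eta, lam, n_days, seed=0):
    """Two-step generation: Markov chain occurrence, then gamma depth (mm) on wet days."""
    rng = random.Random(seed)
    wet = False  # assumption: the series starts after a dry day
    series = []
    for _ in range(n_days):
        # Step 1: occurrence, conditioned on the previous day's state.
        p_wet = p_wet_after_wet if wet else p_wet_after_dry
        wet = rng.random() < p_wet
        # Step 2: depth; random.gammavariate takes shape and scale, so scale = 1 / lam.
        depth = rng.gammavariate(eta, 1.0 / lam) if wet else 0.0
        series.append(depth)
    return series

# Hypothetical parameters, for illustration only.
two_step_sample = generate_two_step(0.3, 0.6, eta=0.8, lam=0.1, n_days=20000, seed=2)
```

With these two transition probabilities the long-run fraction of wet days is p_wet_after_dry / (1 − p_wet_after_wet + p_wet_after_dry), which gives a quick sanity check on a generated series.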

2.4 Transition Probability Matrices Model

Haan et al. [14] mentioned that persistence and periodicities can be observed in daily weather patterns. The persistence is modelled by a Markov chain. Consider

$$ P\left( {E_{{n,j_{n} }} \left| {E_{{n - 1,j_{n - 1} }} , \ldots ,E_{{1,j_{1} }} } \right.} \right) = P\left( {E_{{n,j_{n} }} \left| {E_{{n - 1,j_{n - 1} }} } \right.} \right) $$

(9)

where $ x_{1} ,x_{2} , \ldots $ are the observations of daily rainfall and $ E_{n,j} $ (n = 1, 2, …, and j = 0, 1, …, c) is the event that the rainfall on day n falls in class (state) j, giving c + 1 classes in total. If $ P(E_{n,j} \mid E_{n - 1,i} ) $ does not depend on n, the Markov chain is stationary and these conditional probabilities are the transition probabilities, denoted $ P_{ij} $. The transition probability matrix (TPM) is the collection of $ P_{ij} $ between classes in a (c + 1) × (c + 1) matrix.
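Estimating a transition probability matrix from a recorded series can be sketched as follows. The class boundaries used here are hypothetical (the values of Table 2 are not reproduced in the text), state 0 is taken to be the dry state, and each wet class is assumed to include its lower boundary, in line with the closed lower bound described for the highest class. A single matrix is estimated, whereas the model in the text uses one matrix per season.

```python
import bisect

def classify(depth, boundaries):
    """Map a daily depth (mm) to a state: 0 = dry, 1..len(boundaries)+1 = wet classes."""
    if depth <= 0.0:
        return 0
    # bisect_right puts a depth equal to a boundary into the higher class.
    return 1 + bisect.bisect_right(boundaries, depth)

def transition_matrix(states, n_states):
    """Relative frequencies of moving from state i (row) to state j (column)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical record and boundaries: classes (0, 5), [5, 20), and [20, infinity).
record = [0.0, 0.0, 2.0, 12.0, 0.0, 30.0, 0.0, 0.0]
boundaries = [5.0, 20.0]
state_sequence = [classify(d, boundaries) for d in record]
P = transition_matrix(state_sequence, n_states=len(boundaries) + 2)
```

Rows for states that never occur in the record are left as zeros here; a real application would need enough data in each season for every row to be populated.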

Periodicities mean that the weather pattern undergoes a cyclical behaviour within a year. Within a season, the weather pattern can be assumed to be stationary. Therefore, the TPM can be assumed to be stationary within each season:

$$ P_{ij}^{(k)} \quad (i,j = 0,1, \ldots ,c)\quad {\text{and}}\quad (k = 1, \ldots ,s) $$

(10)

where k denotes the kth season and s is the total number of seasons.

The probability distributions had to be fitted to each class. It was assumed that the same set of distributions would model each season. Therefore, (c + 1) cumulative distribution functions are used:

$$ F_{m} (x)(m = 0, \ldots ,c) $$

(11)

where $ F_{m} (x) $ = P(rainfall < x | rainfall belongs to class m).

A uniform distribution was assumed for all wet states, except for the last one. For the highest class, a shifted exponential distribution was found by Haan et al. [14] to be the most suitable:

$$ F_{{\text{last}}} (x) = 1 - \exp \left( { - \frac{{x - {\text{ncl}}}}{\eta }} \right) $$

(12)

where ncl is the lower boundary of the last class and η is a constant found by maximum likelihood:

$$ \eta = \bar{x} - {\text{ncl}} $$

(13)

where $ \bar{x} $ is the mean daily rainfall greater than ncl.
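The generation step of the TPM model can be sketched as below, assuming the shifted exponential cumulative distribution for the highest class has the form F(x) = 1 − exp(−(x − ncl)/η), so that a depth is obtained by inverse transform as x = ncl − η ln(1 − U). The matrix, class boundaries, and η value in the example are hypothetical, and one matrix is used for the whole series rather than one per season.

```python
import math
import random

def fit_eta(depths, ncl):
    """Eq. (13): mean of the depths in the last class minus its lower boundary."""
    tail = [d for d in depths if d >= ncl]
    return sum(tail) / len(tail) - ncl

def sample_depth(state, boundaries, eta, rng):
    """Draw a depth (mm) for a state: 0 = dry, last state = shifted exponential."""
    if state == 0:
        return 0.0
    if state == len(boundaries) + 1:
        # Inverse of the assumed shifted exponential CDF above ncl = boundaries[-1].
        return boundaries[-1] - eta * math.log(1.0 - rng.random())
    lower = 0.0 if state == 1 else boundaries[state - 2]
    return rng.uniform(lower, boundaries[state - 1])  # uniform within the class

def generate_tpm(matrix, boundaries, eta, n_days, seed=0):
    """Generate a daily series by stepping through states and sampling depths."""
    rng = random.Random(seed)
    state = 0  # assumption: start from a dry day
    series = []
    for _ in range(n_days):
        state = rng.choices(range(len(matrix)), weights=matrix[state])[0]
        series.append(sample_depth(state, boundaries, eta, rng))
    return series

# Hypothetical 4 x 4 matrix (dry plus three wet classes) and boundaries in mm.
tpm = [
    [0.70, 0.20, 0.08, 0.02],
    [0.50, 0.30, 0.15, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.30, 0.30, 0.20, 0.20],
]
tpm_sample = generate_tpm(tpm, [5.0, 20.0], eta=10.0, n_days=500, seed=1)
```

The Box–Cox variations would replace only the sampling of the last class; the state transitions and the uniform sampling of the lower classes would stay the same.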

Haan et al. [14] adopted the months to be the seasons. Seasons follow an annual cycle, and by using months to represent seasons, the cyclical pattern can be satisfactorily represented. Hence, the TPM can be assumed to be stationary within a month. They also adopted 7 classes of daily rainfall states after testing up to 12 classes. These values were found to be satisfactory for the Kentucky basin. Therefore, twelve sets of (7 × 7) matrices needed to be found from the recorded data.

Baki [6] tested six variations of the TPM model. Three use a shifted exponential distribution for the last class: 6 × 6 TPM (called SE6), 7 × 7 TPM (called SE7), and 8 × 8 TPM (called SE8). The other three use the Box–Cox power transformation for the last class: 6 × 6 TPM (called BC6), 7 × 7 TPM (called BC7), and 8 × 8 TPM (called BC8). All six use a linear distribution for the other classes. The last (highest) class has a closed lower bound and an open upper bound. The class boundaries are shown in Table 2. The results from the study by Baki [6] are used in the comparison.

Table 2 Class boundaries for TPM model

3 Results and Discussion

Ten replicates of generated data were made, each with the same length as the recorded data. The average of the statistical measures of the generated data was compared to those of the recorded data.
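The replicate-averaging step can be sketched as follows. The text does not state which skew estimator was used, so the bias-adjusted sample skewness is assumed here, and the function names are illustrative.

```python
import math

def mean_sd_skew(x):
    """Sample mean, standard deviation, and (assumed) bias-adjusted skew of x."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in x)
    return mean, sd, skew

def average_over_replicates(replicates):
    """Average each statistic across replicates, e.g. ten generated series."""
    stats = [mean_sd_skew(series) for series in replicates]
    return tuple(sum(s[j] for s in stats) / len(stats) for j in range(3))
```

The averaged tuple for the generated replicates can then be set beside `mean_sd_skew` of the recorded series, which is the kind of comparison reported in Tables 3, 4, and 5.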

Table 3 shows the average of ten replicates of daily means for all models compared to the recorded data. In general, the daily means of the generated data are satisfactory for all models except the untransformed lag-one Markov chain model (QT). All six variations of the TPM model and the TS model reproduce the recorded daily means more accurately than the lag-one Markov chain models. In terms of accuracy of the daily means, the 7 × 7 TPM gave the best results (SE7 with 8 accurate daily means, followed by BC7 with 7, as highlighted in Table 3).

Table 3 Mean daily rainfall statistics’ comparison

Table 4 shows the average of ten replicates of daily standard deviations compared to the recorded data. Again, the statistics of the generated data for all six variations of the TPM model and the TS model are more accurate than those of the lag-one Markov chain model. The standard deviations for the 6 × 6 and 7 × 7 TPM (SE6, SE7, BC6, BC7) tend to be lower, indicating that the data generated by these models tend to be more normally distributed, while the 8 × 8 TPM (SE8 and BC8) can generate data that are less normally distributed than the recorded data, as some of the standard deviations exceeded those of the recorded data. Furthermore, SE8 and BC8 both have 4 accurate daily standard deviations, more than any other variation (SE7 and BC6 both have 2, as highlighted in Table 4).

Table 4 Daily standard deviations’ comparison

Table 5 shows the average of ten replicates of daily skews compared to the recorded data. Once again, the statistics of the generated data for all six variations of the TPM model and the TS model are closer to those of the recorded data than the statistics of the lag-one Markov chain model. The skews for the 6 × 6 and 7 × 7 TPM (SE6, SE7, BC6, BC7) tend to be lower, indicating that the data generated by these models tend to be more normally distributed. The 8 × 8 TPM (SE8 and BC8) can generate data that are less normally distributed than the recorded data, since some of the skews exceeded those of the recorded data. SE8 has 6 accurate daily skews, followed by BC8 with 4 (as highlighted in Table 5).

Table 5 Daily skews’ comparison

By comparing the daily statistics (means, standard deviations, and skews), the TPM models gave the most accurate results compared to the two-step (TS) model and both TPM and TS are more accurate than the lag-one Markov chain models (QT, LOG, and SQR), especially the untransformed (QT). Within the TPM models, the 7 × 7 TPM (both SE7 and BC7) gave the best estimates of daily means, but the 8 × 8 TPM (SE8 and BC8) gave best estimates of daily standard deviations and daily skews. However, the differences between the variations (SE6, SE7, SE8, BC6, BC7, and BC8) are not significant. In general, all six variations were equally satisfactory as the differences between the six variations are minimal. Thus, the findings of Baki [6] were consistent with the past research as Haan et al. [14] found that the number of classes did not affect the accuracy of the TPM model to a great extent. Therefore, the selection between the six variations is not very critical.

In all daily statistical measures (means, standard deviations, and skews), Tables 3, 4, and 5 show that the figures given by the TPM model (SE6, SE7, SE8, BC6, BC7, and BC8) follow the trend of the recorded data better than those of the other models (TS and lag-one Markov chain). Overall, the TPM is shown to be the most satisfactory model. This finding is consistent with other research, such as that by Srikanthan et al. [28].

Apart from the daily statistical measures (as compared in [4–6]), other measures also needed to be compared. As discussed above, the selection between the TPM variations is not critical, and thus SE8 and BC8 are adopted for further comparison. Since TS has no variations, it is also adopted. Of the three variations of the lag-one Markov chain model, Baki [4] found LOG to be the most satisfactory, and thus it is adopted as well. Hence, further comparisons were made between SE8, BC8, TS, and LOG.

Figure 4 shows the comparison of daily maxima between the recorded data and the four adopted models. For daily maxima, SE8 was found to be the most satisfactory, as it is capable of generating daily maxima greater than the recorded maximum daily rainfall of 423.5 mm. BC8 and TS were also satisfactory in reproducing the trend, but they tend to give values slightly lower than the recorded maximum. Nevertheless, SE8, BC8, and TS all generate a trend of daily maxima similar to that of the recorded data and are hence satisfactory in generating extreme rainfall events. LOG appears to overestimate the occurrence of daily maxima, generating maxima of higher magnitude at higher frequencies than the other models and the recorded data.

Figures 5, 6, 7, and 8 show the comparison of monthly statistics. The daily data (recorded and generated) were accumulated on monthly basis, and statistical comparisons were made between the cumulative monthly figures. Figure 5 shows that for monthly means, SE8, BC8, and TS were most satisfactory in generating monthly means. Figure 6 shows that SE8 and BC8 were most satisfactory in generating monthly standard deviations, followed by LOG, as TS tends to underestimate the monthly standard deviations. Figure 7 shows that for monthly maxima, SE8, BC8, and LOG were most satisfactory, while TS tends to generate lower maxima. Figure 8 shows that for monthly minima, SE8, BC8, and LOG were most satisfactory, while TS tends to generate higher minima during the first quarter. Thus, TS is not able to generate extreme rainfall or drought events.

Table 6 shows the comparison of annual statistics. SE8 and BC8 are satisfactory in generating annual means, standard deviations, skews, maxima, and minima. TS is satisfactory only in generating annual means; it tends to underestimate the standard deviations, skews, and maxima and to overestimate the minima. Thus, TS is unable to reproduce the variations in the recorded data. LOG generated data with satisfactory standard deviations, skews, maxima, and minima, but with lower annual means (14.4 % lower than the annual recorded rainfall), so it tends to underestimate the annual rainfall.

Table 6 Annual statistical comparison

Figure 9 shows the comparison of average length of wet spells. In terms of sequences of rainfall events, all models were generally satisfactory in reproducing the average lengths of wet spells. Figure 10 shows the comparison of average length of dry spells. LOG tends to underestimate the average lengths of dry spells, while the other three models (SE8, BC8, and TS) are satisfactory. Thus, LOG is unable to model drought events satisfactorily.
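The average lengths of wet and dry spells compared above can be computed from any daily series with a routine like the one below. It is a sketch in which a day counts as wet when its depth exceeds a threshold; the default threshold of 0 mm is an assumption, since the text does not state the definition used.

```python
from itertools import groupby

def mean_spell_lengths(series, wet_threshold=0.0):
    """Return (mean wet-spell length, mean dry-spell length) in days."""
    wet_runs, dry_runs = [], []
    # groupby yields maximal runs of consecutive wet or consecutive dry days.
    for is_wet, run in groupby(series, key=lambda depth: depth > wet_threshold):
        length = sum(1 for _ in run)
        (wet_runs if is_wet else dry_runs).append(length)
    mean_wet = sum(wet_runs) / len(wet_runs) if wet_runs else 0.0
    mean_dry = sum(dry_runs) / len(dry_runs) if dry_runs else 0.0
    return mean_wet, mean_dry

# Small illustrative series (mm): dry spells of 2, 1, 3 days; wet spells of 2, 1 days.
spells = mean_spell_lengths([0, 0, 1.0, 2.0, 0, 3.0, 0, 0, 0])
```

Applying the same function to the recorded series and to each generated replicate gives directly comparable spell statistics.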

The further comparisons are consistent with the daily statistical comparison (Tables 3, 4, and 5) and again indicate that the TPM is the most satisfactory model. This finding is also consistent with other research, such as that by Srikanthan et al. [28]. Thus, the TPM model can be used to generate stochastic daily rainfall data that are statistically similar to the recorded data.

4 Conclusions

In conclusion, except for QT, all the other models have produced synthetic rainfall data, which are statistically similar to those of the available data. The data generated have similar stochastic properties compared to the recorded data, and statistically, it can be deduced that both samples (recorded and generated sets) come from the same statistical population.

In comparison, the most accurate model for this particular case is the TPM model. It is able to generate data with statistical measures closest to those of the recorded data. As the data in this case are shown to be persistent over the whole 80-year period, the model can be assumed to be able to forecast the variation in rainfall data. Therefore, this model may be utilised for synthetic rainfall data generation. The synthetic data can then indicate possible variations of rainfall over a longer period, which would be useful for forecasting.

References

1. Adam RY (2012) Stochastic model for rainfall occurrence using Markov chain model. PhD Thesis, Sudan University of Science and Technology, Khartoum, Sudan, unpublished
2. Adamowski K, Smith AK (1972) Stochastic generation of rainfall. J Hydraul Div Am Soc Civil Eng 98(HY11):1935–1945
3. Baki ABM (1996) Objective functions in the optimisation of daily rainfall-runoff modelling. JURUTERA: Mon Bull Inst Eng Malays 9:11–15 (September)
4. Baki ABM (1997) Stochastic rainfall data generation using lag-one Markov chain model. J Inst Eng Malays 58(3):55–61
5. Baki ABM (2002) Stochastic rainfall data generation using two-step Markov chain model: a case study. In: Proceedings of the 20th conference of ASEAN federation of engineering organisations (CAFEO20), Phnom Penh, Cambodia, vol 1, pp 85–92, 2–4 Sept 2002
6. Baki ABM (2005) Stochastic rainfall data generation using transition probability model. In: Proceedings of the seventh annual IEM water resources colloquium 2005, The Institution of Engineers Malaysia, Petaling Jaya, Malaysia, pp 9-1–9-9, 18 June 2005
7. Benson MA, Matalas NC (1967) Synthetic hydrology based on regional statistical parameters. Water Resour Res 3(4):931–935
8. Box GEP, Cox DR (1964) An analysis of transformations. J Roy Stat Soc B 26(2):211–252
9. Camberlin P, Gitau W, Oettli P, Ogallo L, Bois B (2014) Spatial interpolation of daily rainfall stochastic generation parameters over East Africa. Clim Res 59(1):39–60
10. Campo MA, Lopez JJ, Rebole JP (2012) Rainfall stochastic models. EGU general assembly 2012, held 22–27 Apr 2012 in Vienna, Austria, p 13458
11. Carey DI, Haan CT (1978) Markov processes for simulating daily point rainfall. J Irrig Drai Div Am Soc Civil Eng 104(IR1):111–125
12. Dartidar AG, Gosh D, Dasgupta S, De UK (2010) Higher order Markov chain model for monsoon rain over West Bengal, India. Ind J Radio Space Phys 39:39–44 (February)
13. Fisher RA (1958) Statistical methods for research workers, 13th edn. Oliver & Boyd, London
14. Haan CT, Allen DM, Street JO (1976) A Markov chain model of daily rainfall. Water Resour Res 12(3):443–449
15. Hernáez PF-A, Martin-Vide J (2011) Regionalization of the probability of wet spells and rainfall persistence in the Basque Country (Northern Spain). Int J Climatol 32(12):1909–1920 (October 2012)
16. Jones JW, Colwick RD, Threadgill ED (1972) A simulated environmental model of temperature, evaporation, rainfall and soil moisture. Trans Am Soc Agric Eng 15(2):366–372
17. Malek MA, Baki AM (2014) Forecasting of hydrological time series data with lag-one Markov chain model. ASEAN J Sci Technol Dev 31(1):31–37. ISSN: 0217-5460
18. Mehrotra R, Westra SP, Sharma A, Srikanthan R (2012) Continuous rainfall simulation: 2 a regionalized daily rainfall generation approach. Water Resour Res 48:W01536
19. Meshram S, Bisen Y, Kant S, Singh G, Nema AK (2013) Markov chain model probability of dry wet weeks and statistical analysis of weekly rainfall for agricultural planning at Jabalpur. J Environ Ecol 31(3):1250–1254
20. Mhanna M, Bauwens W (2011) Stochastic single-site generation of daily and monthly rainfall in the Middle East. Meteorol Appl 19(1):111–117 (March 2012)
21. Nema AK, Bisen Y, Singh SR, Singh T (2013) Markov chain approach: dry and wet spell rainfall probabilities in planning rainfed rice based production system. Ind J Dryland Agric Res Dev 28(2):16–20
22. Richardson C (1978) Generation of daily precipitation over an area. Water Resour Bull 1(5):1035–1047
23. Solomon S (1976) Parameter regionalisation and network design. In: Shen HW (ed) Stochastic approaches to water resources. Colorado State University Press, Fort Collins, pp 12.1–12.37
24. Sonnadara DUJ (2012) Modeling daily rainfall using Markov chains. Annual research symposium 2012, University of Colombo, Sri Lanka
25. Sonnadara DUJ, Jayawardene DR (2014) A Markov chain probability model to describe wet and dry patterns of weather at Colombo. Theor Appl Climatol (February 2014). doi:10.1007/s00704-014-1117-z
26. Srikanthan R, McMahon TA (1983) Stochastic simulation of daily rainfall for Australian stations. Trans Am Soc Agric Eng 26:754–759
27. Srikanthan R, McMahon TA (2005) Automatic evaluation of stochastically generated rainfall data. Aust J Water Resour 8(2):195–201
28. Srikanthan R, Siriwardena L, McMahon TA (2005) Comparison of two daily rainfall data generation models. Aust J Water Resour 8(2):203–212
29. Taewechit S, Soni P, Salokhe VM, Jayasuriya HPW (2011) Optimal stochastic multi-states first-order Markov chain parameters for synthesizing daily rainfall data using multi-objective differential evolution in Thailand. Meteorol Appl 20(1):20–31 (March 2013)
30. Thiessen AH (1911) Precipitation averages for large areas. Mon Weather Rev 1082 pp
31. Yusuf AU, Adamu L, Abdullahi M (2014) Markov chain model and its application to annual rainfall distribution for crop production. Am J Theor Appl Stat 3(2):39–43

Author information

Authors and Affiliations

Faculty of Civil Engineering, Universiti Teknologi MARA, Shah Alam, Malaysia
J. Jaafar, I. A. Abu Bakar, W. Tahir, H. Awang & F. Ismail
Faculty of Engineering, Al-Madinah International University, Shah Alam, Malaysia
A. Baki
Envirab Services, P.O. Box 7866, Shah Alam, 40730, Malaysia
A. Baki

Corresponding author

Correspondence to J. Jaafar.

Editor information

Editors and Affiliations

Faculty of Civil Engineering, Universiti Teknologi MARA, Shah Alam Selangor, Malaysia
Wardah Tahir
Universiti Teknologi MARA, Selangor, Malaysia
Prof Ir Dr Sahol Hamid Abu Bakar
Universiti Teknologi MARA, Selangor, Malaysia
Marfiah Ab. Wahid
Universiti Teknologi MARA, Selangor, Malaysia
Siti Rashidah Mohd Nasir
Universiti Teknologi MARA, Selangor, Malaysia
Wei Koon Lee

Cite this paper

Jaafar, J., Baki, A., Abu Bakar, I.A., Tahir, W., Awang, H., Ismail, F. (2016). Evaluation of Stochastic Daily Rainfall Data Generation Models. In: Tahir, W., Abu Bakar, P., Wahid, M., Mohd Nasir, S., Lee, W. (eds) ISFRAM 2015. Springer, Singapore. https://doi.org/10.1007/978-981-10-0500-8_17

DOI: https://doi.org/10.1007/978-981-10-0500-8_17
Published: 10 April 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0499-5
Online ISBN: 978-981-10-0500-8
eBook Packages: Earth and Environmental Science (R0)
