1 Introduction

Stochastic modeling and simulation of daily rainfall has been a prominent subject in hydrology and water resources for several decades. The simulation of daily rainfall can be used for agricultural operations, for the design of irrigation systems, and as an input for rainfall-runoff studies. These simulation models have also been employed in climate change studies (Lee et al. 2012; Mezghani and Hingray 2009). An important criterion for stochastic modeling is reproducibility of the statistical characteristics of observed data. Target characteristics for daily rainfall would be the occurrence and amount processes. The scaling behavior and over-dispersion (tendency to underestimate the observed variance of larger time scale data) of generated data have also been considered (Burlando and Rosso 1996; Katz and Zheng 1999).

A number of models have been developed to explain these processes. Traditionally, stochastic modeling of precipitation separates daily rainfall into two processes to account for intermittency.

The occurrence process was first modeled by Gabriel and Neumann (1962) using a Markov Chain (MC) for Tel Aviv daily rainfall data. The MC model assumes that the probability of rainfall for a certain day depends on that for the previous day. The MC model has been further developed in several studies (e.g., Dennett et al. 1983; Guttorp and Minin 1995; Yoon et al. 2013; Berne et al. 2004).

Meanwhile, Burlando and Rosso (1991) applied the discrete autoregressive moving average (DARMA) model, which was originally formulated by Eagleson (1978), to the occurrence process of precipitation data. An alternative model for the occurrence process is the renewal model, in which the process is defined as a sequence of alternating wet and dry intervals (Hingray and Ben Haha 2005; Roldan and Woolhiser 1982). Several other models have been developed to explain the occurrence process, such as the non-homogeneous hidden Markov model (Hughes et al. 1999). Lall et al. (1996) used a nonparametric approach for wet- and dry-spell length that is similar to the renewal model but uses a discrete kernel estimator.

Several probability distributions have been used to model daily rainfall amount. Rainfall amount is so highly skewed that an appropriate distribution must be chosen, or the data should be transformed to fit Gaussian-based time series models. Todorovic and Woolhiser (1975) chose the exponential distribution to describe the amount process; this distribution has been applied by other authors (Richardson 1981; Wilby 1994). A two-parameter gamma distribution, with scale and shape parameters, has also been studied (e.g., Eagleson 1978; Katz 1996; Koutsoyiannis and Onof 2001). An alternative has been the use of a mixture of two single-parameter exponential distributions (Lebel et al. 1998; Wilks 1999; Woolhiser and Roldan 1982).

Transformations have been applied to skewed data for normalization, and the autoregressive moving average (ARMA) has been used in the study of successive rainfall amounts (Katz and Parlange 1993, 1995). Katz (1999) experimented with the power transformation and derived a direct relationship between the original and transformed data for moments and autocorrelation functions. This study determined that the power transformation stabilizes the variance and amplifies the autocorrelation of the amount of consecutive rainy days. Hannachi (2012) applied the autoregressive-1 (AR-1) model to daily precipitation from the Northern Ireland Armagh Observatory. Aronica and Bonaccorso (2013) applied a first-order Markov Chain and ARMA model to qualitatively assess the impact of climate change on the hydrological regime of the Alcantara River Basin in Italy.

Time-dependent models for precipitation data are often avoided because of model complexity. In the current study, we normalized the precipitation amount data with a power transformation to test a time-dependent AR model, and we applied a simple Markov Chain to precipitation occurrence (the X process). We propose a novel method that estimates the AR model parameters from the relationship between the original and transformed moments through the moment-generating function. Finally, we compared the proposed method with the traditional parameter estimation method applied in the transformed domain.

This paper is organized as follows: we present the mathematical description in Section 2, followed by data description and application methodology in Section 3. The results are shown in Section 4. Finally, the summary and conclusions are presented in Section 5.

2 Mathematical description

The model we applied to describe the daily rainfall process had the following form:

$$ Y=X\cdot Z $$
(1)

where Y represents a positive intermittent variable of the daily rainfall process, X is the occurrence variable, and Z is the amount variable. X and Z are assumed to be independent and periodic on a monthly basis.
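Equation (1) can be sketched in a few lines of code. In the sketch below, the Bernoulli occurrence probability and the exponential amount model are placeholders for illustration only, not the occurrence and amount models fitted later in the paper:

```python
import random

def generate_day(p_wet, sample_amount, rng):
    """Compose one day of the process Y = X * Z (Eq. 1).

    X is a Bernoulli occurrence indicator and Z is drawn from the
    amount model; the two draws are independent, as assumed in the text.
    """
    x = 1 if rng.random() < p_wet else 0
    z = sample_amount(rng) if x == 1 else 0.0
    return x * z

rng = random.Random(1)
# Placeholder amount model: exponential with a 5-mm mean.
days = [generate_day(0.3, lambda r: r.expovariate(1 / 5.0), rng)
        for _ in range(10000)]
wet_fraction = sum(1 for y in days if y > 0) / len(days)
```

Because X and Z are drawn independently, the fraction of wet days in a long simulation approaches the occurrence probability regardless of the amount model chosen.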

2.1 Modeling the amount of precipitation

We applied a simple first-order autoregressive (AR-1) model because daily precipitation data do not present long-term memory and involve intermittency (Hannachi 2012). We tested two parameter estimation methods, the traditional direct method (M1) and the indirect method (M2), for each month to determine how well key statistics were preserved on the daily, monthly, and yearly scales.

2.1.1 AR-1 model with direct parameter estimation (M1)

Katz and Parlange (1995) applied an ARMA model to account for the time dependency of precipitation events; because rainfall amount is highly positively skewed, the data must first be normalized, typically with a power or log transformation. Katz (1999) applied a power transformation to daily rainfall amount; this transformation enables choosing the power value that best matches the magnitude of the skewness.

The following equations describe the AR-1 model with the power transformation for the daily rainfall amount variable at time t (\( {Z}_t \)):

$$ {N}_t={Z_t}^{1/c} $$
(2)
$$ {N}_t={\mu}_N+{\phi_1}_{,N}\left({N}_{t-1}-{\mu}_N\right)+{\varepsilon}_t $$
(3)

where \( {\mu}_N \), \( {\sigma}_{\varepsilon} \), and \( {\phi}_{1,N} \) are the model parameters indicating the mean, the standard deviation of the random component, and the first-order autocorrelation coefficient, respectively, for the transformed-domain variable N. \( {\varepsilon}_t \) represents the white noise or error term with a normal distribution \( N\left(0,{\sigma}_{\varepsilon}^2\right) \), where \( {\sigma}_{\varepsilon}^2 \) is the variance left unexplained by the autoregressive model in Eq. (3). Additionally, c represents the power exponent, for which only an integer value is generally used (i.e., c = 1, 2, 3, …).

Note that the transformed variable \( {N}_t \) is assumed to be normal conditional on the occurrence of precipitation (i.e., \( {X}_t=1 \)). Therefore, the parameters are also conditional; for example, the mean \( {\mu}_N \) is \( E\left({N}_t \mid {X}_t=1\right) \) rather than \( E\left({N}_t\right) \). The condition is omitted from the notation for simplicity in the current work.

The model parameters have been estimated through the method of moments, which uses the following relationship between model moments and calculated moments from observed data (Salas 1993):

$$ {\widehat{\mu}}_N=\frac{1}{n}{\displaystyle {\sum}_{t=1}^n{N}_t} $$
(4)
$$ {\widehat{\sigma}}_N^2=\frac{1}{n-1}{\displaystyle {\sum}_{t=1}^n{\left({N}_t-{\widehat{\mu}}_N\right)}^2} $$
(5)
$$ {\widehat{\phi}}_{1,N}={\widehat{\rho}}_{1,N}=\frac{\frac{1}{n-1}{\displaystyle {\sum}_{t=2}^n\left({N}_t-{\widehat{\mu}}_N\right)\left({N}_{t-1}-{\widehat{\mu}}_N\right)}}{{\widehat{\sigma}}_N^2} $$
(6)
$$ {\widehat{\sigma}}_{\varepsilon}^2=\left(1-{\widehat{\phi}}_{1,N}^2\right){\widehat{\sigma}}_N^2 $$
(7)

Note that model parameters were estimated each month; i.e., a different parameterization was applied each month.
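The estimators of Eqs. (4)–(7) can be written directly. A minimal sketch, in which `n_vals` is assumed to hold the transformed wet-day values \( {N}_t \) for one calendar month:

```python
def ar1_moment_estimates(n_vals):
    """Method-of-moments estimates for the AR-1 model, Eqs. (4)-(7).

    Returns (mu_N, var_N, phi_1N, var_eps) for one month of
    transformed wet-day values N_t.
    """
    n = len(n_vals)
    mu = sum(n_vals) / n                                 # Eq. (4)
    var = sum((v - mu) ** 2 for v in n_vals) / (n - 1)   # Eq. (5)
    # Lag-1 sample autocovariance divided by the variance, Eq. (6)
    acov = sum((n_vals[t] - mu) * (n_vals[t - 1] - mu)
               for t in range(1, n)) / (n - 1)
    phi1 = acov / var
    var_eps = (1.0 - phi1 ** 2) * var                    # Eq. (7)
    return mu, var, phi1, var_eps

mu, var, phi1, var_eps = ar1_moment_estimates([1.0, 2.0, 3.0, 4.0, 5.0])
```

In practice this function would be applied twelve times, once per calendar month, to obtain the monthly parameterization described above.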

Parameters estimated from Eqs. (4)–(7) are used to generate \( {N}_t \) in the simulation. The generated \( {N}_t \) data are then back-transformed to obtain \( {Z}_t={N}_t^c \). Finally, \( {Z}_t \) is multiplied by \( {X}_t \) to obtain \( {Y}_t \).
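The M1 generation step can be sketched as follows. The parameter values are placeholders, and for simplicity the AR-1 recursion is advanced on every day and masked by the occurrence indicator, whereas in the text \( {N}_t \) is defined conditionally on wet days:

```python
import random

def simulate_m1(n_days, mu_N, phi_1N, sigma_eps, c, p_wet, seed=0):
    """Generate Y_t: AR-1 in the transformed domain (Eq. 3),
    back-transform Z_t = N_t ** c, then mask with the occurrence X_t."""
    rng = random.Random(seed)
    y = []
    n_prev = mu_N  # start the AR-1 recursion at its mean
    for _ in range(n_days):
        eps = rng.gauss(0.0, sigma_eps)
        n_t = mu_N + phi_1N * (n_prev - mu_N) + eps   # Eq. (3)
        n_prev = n_t
        x_t = 1 if rng.random() < p_wet else 0
        y.append((n_t ** c) * x_t)                    # Z_t = N_t^c, Y = X * Z
    return y

series = simulate_m1(365, mu_N=1.1, phi_1N=0.3, sigma_eps=0.2, c=6, p_wet=0.3)
```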

2.1.2 AR-1 model with the indirect parameter estimation method (M2)

Bias in key statistics has been reported when using M1 for the AR-1 model. In this section, we describe an indirect method to estimate model parameters from the relationship between the Z and N processes. This relationship is explicitly described by Katz (1999) as follows:

$$ E\left({Z}_t\right)=E\left({N}_t^c\right) $$
(8)
$$ E\left({Z}_t^2\right)=E\left({N}_t^{2c}\right) $$
(9)
$$ E\left({Z}_t{Z}_{t+l}\right)=E\left({N}_t^c{N}_{t+l}^c\right) $$
(10)

where l = 0, 1, 2,… is the time lag. Note that the proposed method is only valid for cases in which the exponent c in Eq. (2) is an integer.

The moment relationship between variables Z and N can be derived from the moment-generating function of a bivariate normal distribution with two dummy random variables \( {U}_1 \) and \( {U}_2 \):

$$ {M}_{U_1{U}_2}\left({u}_1,{u}_2\right)= \exp \left[{\mu}_1{u}_1+{\mu}_2{u}_2+\frac{1}{2}\left({\sigma}_1^2{u}_1^2+{\sigma}_2^2{u}_2^2+2\rho {\sigma}_1{\sigma}_2{u}_1{u}_2\right)\right] $$
(11)

The product moments are denoted as follows:

$$ \gamma \left(j,k\right)=E\left({U}_1^j{U}_2^k\right) $$
(12)

Repeated partial differentiation of Eq. (11), evaluated at \( {u}_1={u}_2=0 \), yields the following two-dimensional recursion:

$$ \gamma \left(j,0\right)={\mu}_1\gamma \left(j-1,0\right)+\left(j-1\right){\sigma}_1^2\gamma \left(j-2,0\right) $$
(13)
$$ \gamma \left(0,k\right)={\mu}_2\gamma \left(0,k-1\right)+\left(k-1\right){\sigma}_2^2\gamma \left(0,k-2\right) $$
(14)
$$ \gamma \left(1,k\right)={\mu}_2\gamma \left(1,k-1\right)+{\sigma}_1{\sigma}_2\rho \gamma \left(0,k-1\right)+\left(k-1\right){\sigma}_2^2\gamma \left(1,k-2\right) $$
(15)
$$ \gamma \left(j,k\right)={\mu}_1\gamma \left(j-1,k\right)+k{\sigma}_1{\sigma}_2\rho \gamma \left(j-1,k-1\right)+\left(j-1\right){\sigma}_1^2\gamma \left(j-2,k\right) $$
(16)

The initial conditions for this recursion are \( \gamma \left(0,0\right)=1 \), \( \gamma \left(1,0\right)={\mu}_1 \), \( \gamma \left(0,1\right)={\mu}_2 \), and \( \gamma \left(1,1\right)={\mu}_1{\mu}_2+{\sigma}_1{\sigma}_2\rho \). Equations (13) and (14) provide the univariate moments, while Eqs. (15) and (16) provide the lagged product moments. By substituting the recursions of Eqs. (13)–(16) into Eqs. (8)–(10), the higher moments of the power-transformed variable N are described as follows:
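The recursion of Eqs. (13)–(16), together with these initial conditions, is straightforward to implement; a sketch (the negative-order guard simply drops terms whose coefficient is already zero):

```python
from functools import lru_cache

def product_moments(mu1, mu2, s1, s2, rho):
    """Bivariate-normal product moments gamma(j, k) = E[U1^j U2^k]
    computed with the recursions of Eqs. (13)-(16)."""
    @lru_cache(maxsize=None)
    def g(j, k):
        if j < 0 or k < 0:
            return 0.0          # such terms always carry a zero coefficient
        if j == 0 and k == 0:
            return 1.0
        if k == 0:              # Eq. (13)
            return mu1 * g(j - 1, 0) + (j - 1) * s1 ** 2 * g(j - 2, 0)
        if j == 0:              # Eq. (14)
            return mu2 * g(0, k - 1) + (k - 1) * s2 ** 2 * g(0, k - 2)
        if j == 1:              # Eq. (15)
            return (mu2 * g(1, k - 1) + s1 * s2 * rho * g(0, k - 1)
                    + (k - 1) * s2 ** 2 * g(1, k - 2))
        # Eq. (16)
        return (mu1 * g(j - 1, k) + k * s1 * s2 * rho * g(j - 1, k - 1)
                + (j - 1) * s1 ** 2 * g(j - 2, k))
    return g

g = product_moments(1.0, 1.0, 0.5, 0.5, 0.3)
```

As a sanity check, for a standard normal the recursion reproduces the classical moments (e.g., the fourth moment equals 3), and \( \gamma(1,1) \) reproduces the initial condition above.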

$$ E\left({Z}_t\right)=E\left({N}_t^c\right)=\gamma \left(c,0\right)={\mu}_N\gamma \left(c-1,0\right)+\left(c-1\right){\sigma}_N^2\gamma \left(c-2,0\right) $$
(17)
$$ E\left({Z}_t^2\right)=E\left({N}_t^{2c}\right)=\gamma \left(2c,0\right)={\mu}_N\gamma \left(2c-1,0\right)+\left(2c-1\right){\sigma}_N^2\gamma \left(2c-2,0\right) $$
(18)
$$ E\left({Z}_t{Z}_{t+1}\right)=E\left({N}_t^c{N}_{t+1}^c\right)=\gamma \left(c,c\right)={\mu}_N\gamma \left(c-1,c\right)+c{\sigma}_N^2{\rho}_{1,N}\gamma \left(c-1,c-1\right)+\left(c-1\right){\sigma}_N^2\gamma \left(c-2,c\right) $$
(19)

Note that through this relationship between Z and N, statistics of the N variable, such as the mean, standard deviation, and autocorrelation, can be estimated from statistics of the Z variable. Completing the relationships in Eqs. (17)–(19) requires the recursion; that is, γ(c − 2, 0) and γ(c − 1, 0) are required for γ(c, 0); γ(c − 3, 0) and γ(c − 2, 0) are required for γ(c − 1, 0); and so on. The moments of the Z variable themselves are estimated as arithmetic means of the sampled data.

Instead of using moments, common statistics such as variance and autocorrelation coefficients can be used for the relationship, as follows:

$$ {\sigma}_Z^2=E{Z}^2-{(EZ)}^2=E\left({N}^{2c}\right)-{\left(E\left({N}^c\right)\right)}^2=\gamma \left(2c,0\right)-{\left(\gamma \left(c,0\right)\right)}^2 $$
(20)
$$ {\rho}_{1,Z}=\frac{\mathrm{Cov}\left({N}_t^c,{N}_{t+1}^c\right)}{\mathrm{Var}\left({N}_t^c\right)}=\frac{E\left({N}_t^c{N}_{t+1}^c\right)-{\left(E\left({N}_t^c\right)\right)}^2}{E\left({N}_t^{2c}\right)-{\left(E\left({N}_t^c\right)\right)}^2}=\frac{\gamma \left(c,c\right)-{\left(\gamma \left(c,0\right)\right)}^2}{\gamma \left(2c,0\right)-{\left(\gamma \left(c,0\right)\right)}^2} $$
(21)

From these relationships, \( {\widehat{\mu}}_N \), \( {\widehat{\sigma}}_N \), and \( {\widehat{\rho}}_{1,N} \) are estimated.

Note that the stochastic simulation of daily rainfall requires \( {\widehat{\mu}}_N,{\widehat{\phi}}_{1,N},\mathrm{and}\kern0.5em {\widehat{\sigma}}_{\varepsilon}^2 \). \( {\widehat{\mu}}_N \) and \( {\widehat{\sigma}}_N^2 \) are estimated by numerically solving Eqs. (17) and (20). \( {\widehat{\phi}}_{1,N} \) is calculated from Eq. (21). \( {\widehat{\sigma}}_{\varepsilon}^2 \) is then estimated from \( {\widehat{\sigma}}_N^2 \) and \( {\widehat{\phi}}_{1,N} \) as in Eq. (7). An example with c = 2 is presented in Appendix 1 for further clarification.
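Any two-dimensional root finder can perform the numerical solution of Eqs. (17) and (20). The sketch below uses a plain Newton iteration with a finite-difference Jacobian; the starting values and step size are arbitrary choices, and no safeguards for poor starting points are included:

```python
def gamma_j0(mu, sigma, j):
    """Univariate moment E[N^j] for N ~ Normal(mu, sigma), via Eq. (13)."""
    g = [1.0, mu]
    for i in range(2, j + 1):
        g.append(mu * g[i - 1] + (i - 1) * sigma ** 2 * g[i - 2])
    return g[j]

def solve_mu_sigma(mean_z, var_z, c, mu0=1.0, sigma0=0.5, iters=50):
    """Solve gamma(c,0) = E[Z] and gamma(2c,0) - gamma(c,0)^2 = Var(Z)
    (Eqs. 17 and 20) for (mu_N, sigma_N) by Newton iteration."""
    mu, sigma, h = mu0, sigma0, 1e-6
    def f(m, s):
        ez = gamma_j0(m, s, c)
        return ez - mean_z, gamma_j0(m, s, 2 * c) - ez ** 2 - var_z
    for _ in range(iters):
        f1, f2 = f(mu, sigma)
        # numerical Jacobian by forward differences
        d11 = (f(mu + h, sigma)[0] - f1) / h
        d12 = (f(mu, sigma + h)[0] - f1) / h
        d21 = (f(mu + h, sigma)[1] - f2) / h
        d22 = (f(mu, sigma + h)[1] - f2) / h
        det = d11 * d22 - d12 * d21
        mu -= (f1 * d22 - f2 * d12) / det
        sigma -= (f2 * d11 - f1 * d21) / det
    return mu, sigma

# Targets generated from mu_N = 1.2, sigma_N = 0.4 with c = 2:
# E[Z] = mu^2 + sigma^2 = 1.60, Var(Z) = 4 mu^2 sigma^2 + 2 sigma^4 = 0.9728.
mu_hat, sigma_hat = solve_mu_sigma(1.60, 0.9728, 2)
```

Because \( \sigma_N \) enters the moment equations only through even powers, the solver can converge to either sign; the absolute value is the usable estimate.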

2.2 Modeling precipitation occurrence

Several models have been developed to simulate the occurrence process (X) of daily rainfall. The binary sequence of wet and dry days can be modeled with the discrete ARMA family, DARMA(p,q); the simplest member, DARMA(1,0), is described as follows:

$$ {X}_t={V}_t{X}_{t-1}+\left(1-{V}_t\right){W}_t $$
(22)

where all variables represent binary processes, i.e., \( {W}_t,{V}_t,{X}_t\in \left\{0,1\right\} \), with

$$ \Pr \left[{W}_t=1\right]=\xi \kern0.5em \mathrm{and}\kern0.5em \Pr \left[{V}_t=1\right]=\lambda $$
(23)

Therefore, the process X t can be written as follows:

$$ {X}_t=\left\{\begin{array}{cc}\hfill {X}_{t-1}\hfill & \hfill \mathrm{with}\kern0.5em \mathrm{probability}\kern0.5em \lambda \hfill \\ {}\hfill {W}_t\hfill & \hfill \mathrm{with}\kern0.5em \mathrm{probability}\kern0.5em 1-\lambda \hfill \end{array}\right. $$
(24)
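Equation (24) translates directly into a simulator; a minimal sketch (under DARMA(1,0) the stationary wet probability equals ξ, which a long simulation reflects approximately):

```python
import random

def simulate_darma10(n, lam, xi, seed=0):
    """Simulate the DARMA(1,0) occurrence process of Eq. (24): keep the
    previous state with probability lam, otherwise draw a fresh
    Bernoulli(xi) innovation W_t."""
    rng = random.Random(seed)
    x_prev = 1 if rng.random() < xi else 0  # initialize from the innovation
    out = []
    for _ in range(n):
        if rng.random() < lam:
            x_t = x_prev                          # X_t = X_{t-1}
        else:
            x_t = 1 if rng.random() < xi else 0   # X_t = W_t
        out.append(x_t)
        x_prev = x_t
    return out

x = simulate_darma10(20000, lam=0.5, xi=0.3, seed=2)
wet_days = sum(x) / len(x)
```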

The DARMA(1,0) model is equivalent to the Markov Chain, which has been applied broadly to model the rainfall process since Gabriel and Neumann (1962). The Markov Chain model is expressed with the following transition probability matrix:

$$ P=\left|\begin{array}{cc}\hfill {p}_{00}\hfill & \hfill {p}_{01}\hfill \\ {}\hfill {p}_{10}\hfill & \hfill {p}_{11}\hfill \end{array}\right| $$
(25)

where \( {p}_{ab}=\Pr\left[{X}_t=b\mid {X}_{t-1}=a\right] \) and \( a,b\in \left\{0,1\right\} \), with the limiting distributions \( {p}_0=\Pr\left[{X}_t=0\right] \) and \( {p}_1=\Pr\left[{X}_t=1\right] \).

These two models (DARMA(1,0) and the Markov Chain) are equivalent under the following parametric relationship:

$$ \left|\begin{array}{cc}\hfill {p}_{00}\hfill & \hfill {p}_{01}\hfill \\ {}\hfill {p}_{10}\hfill & \hfill {p}_{11}\hfill \end{array}\right|=\left|\begin{array}{cc}\hfill \lambda +\left(1-\lambda \right)\left(1-\xi \right)\hfill & \hfill \left(1-\lambda \right)\xi \hfill \\ {}\hfill \left(1-\lambda \right)\left(1-\xi \right)\hfill & \hfill \lambda +\left(1-\lambda \right)\xi \hfill \end{array}\right| $$
(26)

The components of the transition probability matrix are estimated using maximum likelihood (Marani 2003) from precipitation occurrence data as follows:

$$ {\widehat{p}}_{ab}=\frac{n\left(a,b\right)}{n(a)} $$
(27)

where n(a, b) is the number of times that rainfall in state a transitions to state b, and n(a) = n(a, 0) + n(a, 1).
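The counting estimator of Eq. (27) can be sketched as follows; the short wet/dry sequence in the example is purely illustrative:

```python
def transition_mle(x):
    """Maximum-likelihood transition probabilities, Eq. (27):
    p_ab = n(a, b) / n(a) from a binary occurrence sequence."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for prev, cur in zip(x[:-1], x[1:]):
        counts[(prev, cur)] += 1
    p = {}
    for a in (0, 1):
        n_a = counts[(a, 0)] + counts[(a, 1)]
        for b in (0, 1):
            p[(a, b)] = counts[(a, b)] / n_a if n_a else float("nan")
    return p

# Example on a short wet/dry sequence:
p = transition_mle([0, 0, 1, 1, 1, 0, 1, 0, 0, 1])
```

Each row of the estimated matrix sums to one by construction, matching the transition matrix of Eq. (25).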

3 Data description and application methodology

3.1 Observational data

We used daily precipitation data measured between 1950 and 2003 from the Denver International Airport (DIA) station (39.83° N, 104.66° W); we did not consider more recent data because the station had been moved. A year-long daily precipitation time series is presented in the top panel of Fig. 1; monthly and annual precipitation for all the data are shown in the middle and bottom panels, respectively. Daily precipitation indicates higher occurrence and amount during the summer (days 150–250). Therefore, we used different parameters each month to account for this annual cycle while applying the same power transformation exponent (c) for Eq. (2). Monthly precipitation data show a similar annual cycle, whereas yearly precipitation shows stationary variation.

Fig. 1 Time series of daily (top panel) and monthly (middle panel) precipitation for 1999 and yearly (bottom panel) precipitation for all data (1950–2003)

3.2 Employed climate scenario

A climate scenario was used to illustrate how our parameter estimation method applies to climate change analysis. We selected RCP 8.5 from the Coupled Model Intercomparison Project phase 5 (CMIP5) from NOAA’s Geophysical Fluid Dynamics Laboratory (GFDL) to validate the performance of our method. The Bias-Correction Constructed Analogue (BCCA) method (Maurer et al. 2010) was applied for the downscaled precipitation data of RCP 8.5 (Brekke et al. 2013) (http://gdo-dcp.ucllnl.org/downscaled_cmip_projections/).

The variation in mean and standard deviation statistics was estimated from the climate scenario as follows:

$$ \varDelta \theta \left(\%\right)=\frac{\theta_{\mathrm{future}}-{\theta}_{\mathrm{ref}}}{\theta_{\mathrm{ref}}}\times 100 $$
(28)

where \( \theta_{\mathrm{future}} \) and \( \theta_{\mathrm{ref}} \) are the values of the statistic for the future and reference periods, respectively. The reference period is the portion of the observed climate with which climate change information is combined to create a climate scenario; we selected 1950–1999 as our reference period. The future period, 2010–2099, was separated into three parts: period 1, 2010–2039; period 2, 2040–2069; and period 3, 2070–2099.
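Equation (28) is a simple relative change; for example, a future-period statistic of 1.25 mm against a 1.0-mm reference gives Δθ = 25 %:

```python
def percent_change(theta_future, theta_ref):
    """Relative change (%) of a statistic between a future period
    and the reference period, Eq. (28)."""
    return (theta_future - theta_ref) / theta_ref * 100.0

delta = percent_change(1.25, 1.0)  # → 25.0
```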

3.3 Application methodology

To validate the performance of our model, we simulated 100 sets of daily precipitation data with the same record length as the historical data. We compared the mean, standard deviation, skewness, lag-1 correlation, and marginal and transition probabilities of the occurrence process for historical and simulated daily precipitation data. We also compared the monthly and yearly aggregated levels of historical and simulated data for validation.

The statistics from the simulated data are presented with boxplots, in which the box displays the interquartile range (IQR) and the whiskers extend to the maximum and minimum of the statistics estimated from the 100 generated sequences. The horizontal line inside the box indicates the median. The value of the statistic corresponding to the historical data is represented with a cross marker.

4 Results

4.1 Validation with observed data

To apply our method, we needed to choose an exponent for Eq. (2). The power transformation is applied to remove the skewed distributional behavior of daily precipitation data. We tested several exponents; c = 6 most reliably reproduced the key daily statistics of the observed data, including skewness, so all results below were produced with c = 6.

We used two different parameter estimation methods (i.e., the direct method (M1) and the indirect method (M2)). The estimated parameters for M1 and M2 are presented in Table 1. Key statistics for the amount variable (Z) of the historical and simulated data are presented in Figs. 2 and 3 for M1 and M2, respectively. Figure 2 shows that, with M1, the mean, standard deviation, and lag-1 correlation of the simulated data are underestimated in almost all months. In contrast, Fig. 3 shows that M2 reproduces the mean and standard deviation well. Skewness is not well preserved by either method; we tested several other exponents in an unsuccessful attempt to improve the modeling of skewness, and there may be no parameterization that resolves this bias.

Table 1 Estimated parameters from the AR-1 model for method 1 (M1) and method 2 (M2)
Fig. 2 Key statistics of precipitation amount for historical (DIA station, dotted line with cross mark) and M1 simulated (boxplot) data. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Fig. 3 Key statistics of precipitation amount for historical (DIA station, dotted line with cross mark) and M2 simulated (boxplot) data. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Figure 4 shows the transition and limiting probabilities for the occurrence variable; the occurrence process is well explained by the model. Note that using different parameter estimation methods for the amount process does not affect the occurrence process because the occurrence (X) and amount (Z) processes are modeled independently (see Eq. (1)).

Fig. 4 Transition and limiting probabilities of precipitation occurrence for historical (DIA station, dotted line with cross mark) and M1 and M2 simulated (boxplot) data. The occurrence process (X) model was the same for M1 and M2, making the results the same. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

As shown in Fig. 5, the underestimation of daily statistics from M1 propagates to the monthly time scale: the mean and standard deviation of monthly data simulated with M1 are significantly underestimated in most months. In contrast, M2 reproduces the mean and standard deviation well even at the aggregated (monthly) time scale, as presented in Fig. 6. Note that neither method reproduces the lag-1 correlation of the historical monthly data.

Fig. 5 Key statistics of monthly precipitation for historical (dotted line with cross mark) and M1 simulated (boxplot) data. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Fig. 6 Key statistics of monthly precipitation for historical (dotted line with cross mark) and M2 simulated (boxplot) data. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Figure 7 shows the cumulative distribution functions (CDFs) estimated for annual maximum daily precipitation for historical and simulated data. We used the Weibull plotting position formula to estimate the corresponding CDFs. M2 reproduces the shape of the historical CDF, especially the tail, better than M1, whose CDF is shifted to the left; we believe the significant underestimation of the daily mean and standard deviation causes this shift. Figure 8 shows the CDF of the annual maximum number of consecutive dry days for historical and simulated data, indicating that the simple occurrence model properly reproduces extreme statistics at the aggregated (annual) time scale.
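The Weibull plotting position assigns probability i/(n + 1) to the i-th smallest value of the sample; a sketch with hypothetical annual maxima:

```python
def weibull_cdf(sample):
    """Empirical CDF points via the Weibull plotting position
    p_i = i / (n + 1) for the i-th smallest sample value."""
    xs = sorted(sample)
    n = len(xs)
    return [(x, (i + 1) / (n + 1)) for i, x in enumerate(xs)]

# Hypothetical annual maximum daily precipitation values (mm):
pts = weibull_cdf([30.0, 12.0, 25.0, 18.0])
```

The n + 1 denominator keeps all plotting positions strictly inside (0, 1), so no observation is assigned an exceedance probability of zero.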

Fig. 7 Cumulative distribution function (CDF) of annual maximum daily precipitation for historical (dashed red line) and M1 (solid black line) and M2 (dotted gray line) simulated data. Note that longer series were used for the simulation (i.e., 500 years rather than the 51-year historical record length) to evaluate the overall CDF behavior. Simulation CDFs are averaged over 100 simulated CDFs

Fig. 8 Annual maximum dry days from historical (solid line) and simulated (dashed line) data

4.2 Downscaled climate precipitation data simulation

As mentioned, the changes in the mean and standard deviation in the future downscaled climate scenario (RCP 8.5) were estimated relative to the reference period of 1950–1999. Table 2 shows the differences in the mean and standard deviation between the reference and future periods. The mean and standard deviation increase in the future periods relative to the reference period, except in the summer months (months 6 and 7) and month 2. The variations of the mean and standard deviation are similar across the future periods, although mean differences increase significantly during the winter months (months 11, 12, 1, and 2) in the later future periods.

Table 2 Percent differences in the mean and standard deviation of precipitation data between the reference period (1950–1999) and three future periods (period 1, 2010–2039; period 2, 2040–2069; and period 3, 2070–2099)

Figures 9, 10, and 11 show the historical, future, and simulated (M2) statistics of the mean and standard deviation for periods 1, 2, and 3, respectively. The targeted future statistics are estimated by applying the changes derived from the climate scenario to the observed statistics as follows:

Fig. 9 Key statistics of monthly precipitation for historical (DIA station, dotted line with cross), climate change scenario (RCP 8.5, solid line with circle), and M2 simulated (boxplot) data for 2010–2039 (period 1). The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Fig. 10 Key statistics of monthly precipitation for historical (DIA station, dotted line with cross), climate change scenario (RCP 8.5, solid line with circle), and M2 simulated (boxplot) data for 2040–2069 (period 2). The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

Fig. 11 Key statistics of monthly precipitation for historical (DIA station, dotted line with cross), climate change scenario (RCP 8.5, solid line with circle), and M2 simulated (boxplot) data for 2070–2099 (period 3). The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

$$ {\theta}_{\mathrm{future}}=\varDelta \theta /100\times {\theta}_{\mathrm{obs}}+{\theta}_{\mathrm{obs}} $$
(29)

Note that the traditional parameter estimation method (M1) cannot reproduce the targeted future climate statistics because its parameters are estimated from the transformed data. When Eq. (29) is applied to the transformed-domain parameters, the back-transformed data cannot be used because they produce inadequate values, such as values too large to be physically plausible precipitation (data not shown).

These figures indicate that our parameter estimation method adequately models key statistics (especially the mean and standard deviation) of the climate change scenario. The lag-1 correlation, which should not differ from that of the reference period, is reproduced well in all future periods. Skewness is still not modeled correctly. A significant increase in the mean is seen in winter months, especially for period 3 (see Fig. 11).

Figure 12 shows the mean and standard deviation of monthly precipitation data for future periods. There is a significant increase in the mean and standard deviation, compared to historical statistics, for month 5 of periods 1 and 3; there is no difference for period 2. The increase seen during winter months for daily precipitation (see Table 2) is not noticeable.

Fig. 12 Mean (a-1, b-1, and c-1) and standard deviation (a-2, b-2, and c-2) of monthly precipitation for historical (DIA station, dotted line with cross) and M2 simulated (boxplot) data for (a) period 1, 2010–2039; (b) period 2, 2040–2069; and (c) period 3, 2070–2099. The box represents the interquartile range (IQR), whiskers are maximum and minimum, and the horizontal line is the median for 100 simulations

5 Summary and conclusions

We compared rainfall simulation models using DIA data. To adequately capture the intermittency of daily rainfall, the daily rainfall (Y) process is separated into the occurrence (X) and amount (Z) processes, which are modeled independently. A simple Markov Chain is applied to the X process, and we focus mainly on the amount process Z, using an AR-1 model for each month of the study with a power transformation to normalize the data. We compared the traditional parameter estimation method (M1), applied in the transformed domain, with the indirect parameter estimation method (M2), which uses the relationship between the original and transformed moments.

We modeled historical precipitation data from DIA, CO, to test the performance of M1 and M2. Data simulated with M1 underestimate the mean, standard deviation, and lag-1 correlation, whereas the M2 simulated data model those statistics fairly well.

Further investigation of the applicability of M1 and M2 to climate change analysis was performed using downscaled RCP 8.5 data. The proposed M2 parameter estimation method models future climate statistics fairly well. M1 cannot be applied in the climate change study because it estimates parameters in the transformed domain, and applying the climate changes there produces highly inadequate values.

The M2 method is comparable to M1 in preserving key historical statistics and surpasses M1 in the climate change analysis. Additionally, M2 can be applied to higher-order ARMA models as well as to seasonal ARMA models. Although M2 was applied here to daily precipitation data, it can easily be extended to other applications that use ARMA models with the power transformation.