Stochastic simulation of daily air temperature and precipitation from monthly normals in North America north of Mexico

Régnière, Jacques; St-Amant, Rémi

doi:10.1007/s00484-006-0078-z

Stochastic simulation of daily air temperature and precipitation from monthly normals in North America north of Mexico

Original Article
Published: 16 January 2007

Volume 51, pages 415–430, (2007)
Cite this article

Download PDF

Access provided by CONRICYT-eBooks

International Journal of Biometeorology Aims and scope Submit manuscript

Stochastic simulation of daily air temperature and precipitation from monthly normals in North America north of Mexico

Download PDF

Jacques Régnière¹ &
Rémi St-Amant¹

586 Accesses
102 Citations
Explore all metrics

Abstract

A simple, stochastic daily temperature and precipitation generator (TEMPGEN) was developed to generate inputs for the study of the effects of climate change on models driven by daily weather information when climate data are available as monthly summaries. The model uses as input only 11 sets of monthly normal statistics from individual weather stations. It needs no calibration, and was parameterized and validated for use in Canada and the continental United States. Monthly normals needed are: mean and standard deviation of daily minimum and maximum temperature, first and second order autoregressive terms for daily deviations of minimum and maximum temperatures from their daily means, correlation of deviations of daily minimum and maximum temperatures, total precipitation, and the interannual variance of total precipitation. The statistical properties and distributions of daily temperature and precipitation data produced by this generator compared quite favorably with observations from 708 stations throughout North America (north of Mexico). The algorithm generates realistic seasonal patterns, variability and extremes of temperature, precipitation, frost-free periods and hot spells. However, it predicts less accurately the daily probability of precipitation, extreme precipitation events and the duration of extreme droughts.

Stochastic investigation of daily air temperature extremes from a global ground station network

Article 12 April 2021

Estimating daily climatological normals in a changing climate

Article 15 December 2018

Stochastic generators of multi-site daily temperature: comparison of performances in various applications

Article 16 February 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Progress in forecasting the impacts of climate change on forest ecosystems (stand growth, succession sequences) and disturbance regimes (insect and disease outbreaks, fire frequencies) involves the ability to simulate extreme weather events. Extreme events are often sequences ranging from a few to several successive days of unusually hot, cold, dry or wet weather. Many researchers are developing climate-sensitive models of ecological processes that require as input daily or even hourly time series of temperature and precipitation (Prentice et al. 1993; Kimball et al. 1997; Friend 1998; Lasch et al. 2002; Régnière and Nealis 2002; Logan et al. 2003; Gray 2004). Typically, however, climate change scenarios are generated from single runs of Global Circulation Models (CGMs) with daily output at coarse spatial resolution. Because this output represents a single realization from a large set of possible outcomes, the possibility of statistical treatment (replication) is severely limited (Oelschlägel 1995). However, the output of CGMs can be summarized as monthly statistics (normals) or as “anomalies” (deviations from base-line normals) and disaggregated through stochastic daily weather generators (Bürger 1997; Hansen and Ines 2005). The low spatial resolution of GCMs can be increased through regional models or other down-scaling approaches, and monthly statistics can be input into ecological models to take into account climate change scenarios (Fowler et al. 2005; Qian et al. 2005).

The major issue in the development of stochastic daily weather generators is to obtain a sequence of daily weather that conforms not only to observed (or expected) mean values, but also to natural variability, extremes, and time series characteristics (Wallis and Griffiths 1995). Several approaches to generation of daily weather information have been developed, and most are based on the simulation of alternating dry- and wet-day series using Markov-chain transition probabilities (Richardon 1981; Racsko et al. 1991; Hutchinson 1995a; Katz 1996; Wilks 1999). Many of these models, however, require considerable amounts of input information and often must be calibrated for application in specific geographical areas (Richardson and Wright 1984; Racsko et al. 1991; Semenov and Barrow 1997; Bürger 1997; Semenov et al. 1998). There has also been considerable attention focused on adding interannual variation in these generators for ecological applications (Mearns et al. 1997; Hansen and Mavromatis 2001; Mavromatis and Hansen 2001; Dubrovský et al. 2004).

Régnière and Bolstad (1994) developed an algorithm (TEMPGEN) for simulation of daily minimum and maximum air temperature based on normals (30-year monthly average and extreme minimum and maximum temperatures). This algorithm generates realistic distributions, autocorrelation and cross-correlation patterns of daily minimum, maximum temperatures and ranges. It has been quite useful in studying the responses of models driven by daily weather information under “normal” weather conditions and has been used in many applications through the BioSIM system (Régnière 1996). However, this generator has three weaknesses: (1) it does not take into account the interannual variation of monthly average temperatures, and therefore generates temperature traces that are not as variable as they are in nature; (2) it does not generate daily precipitation; and (3) it was parameterized from data from the Atlantic coastal area of the United States, and as such is not applicable in other areas.

The objective of the work reported here was to remedy these three weaknesses by generalizing and validating the Régnière and Bolstad (1994) algorithm for the entire North American continent (north of Mexico) and adding daily precipitation to its output. We sought a model that would require minimal input (in the form of local monthly normal statistics) would need no for calibration, and would generate realistic daily times series of minimum and maximum air temperature and precipitation, exhibiting levels of variability and statistical distributions close to those of observed time series.

Methods

Algorithm structure

Throughout this document, temperatures are expressed in °C and precipitation in millimeters of water. The symbol ∼ above a term designates a 30-year normal while ¯ represents an average of some other type. Inputs of this model are 11 sets of monthly statistics extracted over a standard normal period (SNP) of 30 years of daily observations from a given location. The first two sets of normals are mean daily minimum ($ \widetilde{\mu }_{n} $) and maximum ($ \widetilde{\mu }_{x} $) temperatures. From these, normal daily minimum and maximum temperatures are obtained by linear interpolation between the means of successive months, adjusted to correspond to the values occurring at mid-month. Seven additional monthly normals are calculated from the differences between observed daily minimum and maximum temperatures and these daily normals: their standard deviations ($ \widetilde{\sigma }_{\delta } $ for minimum and $ \widetilde{\sigma }_{\varepsilon } $ for maximum temperature); 2nd order autoregressive terms describing their time-series characteristics ($ \widetilde{A}_{\delta } $, $ \widetilde{B}_{\delta } $ for minimum and $ \widetilde{A}_{\varepsilon } $, $ \widetilde{B}_{\varepsilon } $ for maximum temperatures); and their cross-correlation ($ \widetilde{r}_{{\delta \varepsilon }} $). Two additional monthly statistics describe the characteristics of precipitation: average monthly total ($\widetilde{P}$); and the standard deviation ($ \widetilde{\sigma }_{P} $) of the ratio of observed monthly total to normal precipitation, given by:

$$\widetilde{\sigma }_{P} = {\sqrt {{{\sum\limits_i {{\left( {\frac{{P_{m} }}{{\widetilde{P}}} - 1} \right)}^{2} } }} \mathord{\left/ {\vphantom {{{\sum\limits_i {{\left( {\frac{{P_{m} }}{{\widetilde{P}}} - 1} \right)}^{2} } }} {{\left( {n - 1} \right)}}}} \right. \kern-\nulldelimiterspace} {{\left( {n - 1} \right)}}} }$$

(1)

where P _m is a month’s total precipitation in year m and n=30 years.

Daily minimum and maximum temperatures are generated with the algorithm of Régnière and Bolstad (1994):

$$ \begin{array}{*{20}c} {T_{n} {\left( t \right)} = \widetilde{T}_{n} {\left( t \right)} + \delta {\left( t \right)}} \\ {T_{x} {\left( t \right)} = \widetilde{T}_{x} {\left( t \right)} + \varepsilon {\left( t \right)}} \\ \end{array} $$

(2)

where $\widetilde{{T_{n} }}$(t) and $\widetilde{{T_{x} }}$(t) are daily normal minimum and maximum temperatures (obtained by linear interpolation as described above), and δ(t) and ɛ(t) are auto-correlated (2nd order) and cross-correlated random numbers with means of 0 and variances $ \widetilde{\sigma }^{2}_{\delta } $ and $ \widetilde{\sigma }^{2}_{\varepsilon } $. The autocorrelation and cross-correlation are generated through general linear stochastic processes expressed as:

$$\begin{array}{*{20}c} {\delta {\left( t \right)} = {\sum\limits_i {\alpha _{i} \gamma {\left( {t - i} \right)}} }} \\ {\varepsilon {\left( t \right)} = \frac{{\widetilde{\sigma }_{\varepsilon } }}{{\widetilde{\sigma }_{\delta } }}{\sum\limits_i {\beta _{i} } }{\left[\,\, {\rho \gamma {\left( {t - i} \right)} + {\left( {1 - {\left| \rho \right|}} \right)}\zeta {\left( {t - i} \right)}} \right]}} \\ \end{array} $$

(3)

where α _i and β _i are infinite series of coefficients such that:

$$ \begin{array}{*{20}c} {\alpha _{1} = 1,\alpha _{2} = \widetilde{A}_{\delta } ,{\text{and }}\alpha _{i} = \widetilde{A}_{\delta } \alpha _{{i - 1}} + \widetilde{B}_{\delta } \alpha _{{i - 2}} {\text{for i $>$ 2}}} \\ {\beta _{1} = 1,\beta _{2} = \widetilde{A}_{\varepsilon } ,{\text{and }}\beta _{i} = \widetilde{A}_{\varepsilon } \beta _{{i - 1}} + \widetilde{B}_{\varepsilon } \beta _{{i - 2}} {\text{for i $>$ 2}}} \\ \end{array} $$

(4)

and $ \gamma \in {\text{Nor}}{\left( {0,\sigma ^{2}_{\gamma } } \right)} $, $ \zeta \in {\text{Nor}}{\left( {0,\sigma ^{2}_{\zeta } } \right)} $. The formulation of Eq. 3 is a generalization of Régnière and Bolstad’s (1994) model 2, where ρ can be in the range [−1,1] and ɛ can have any variance regardless of the values of $ \widetilde{A}_{\delta } $, $ \widetilde{B}_{\delta } $, $ \widetilde{A}_{\varepsilon } $, $ \widetilde{B}_{\varepsilon } $ and ρ. The variances $ \widetilde{\sigma }^{2}_{\gamma } $ and $ \widetilde{\sigma }^{2}_{\zeta } $ are given by:

$$ \sigma ^{2}_{\gamma } = \frac{{\widetilde{\sigma }^{2}_{\delta } }} {{{\sum\limits_i {\alpha ^{2}_{i} } }}} $$

(5)

and

$$ \sigma ^{2}_{\zeta } = \frac{{\widetilde{\sigma }^{2}_{\delta } }} {{{\left( {1 - {\left| \rho \right|}} \right)}^{2} }}{\left( {\frac{1} {{{\sum\limits_i {\beta ^{2}_{i} } }}} - \frac{{\rho ^{2} }} {{{\sum\limits_i {\alpha ^{2}_{i} } }}}} \right)} $$

(6)

and the value of ρ is:

$$ \rho = \widetilde{r}_{{\delta \varepsilon }} \frac{{{\sum\limits_i {\alpha ^{2}_{i} } }}} {{{\sum\limits_i {{\left( {\alpha _{i} \beta _{i} } \right)}} }}} $$

(7)

(see Régnière and Bolstad 1994, Appendix 2, for the derivation of Eqs. 5–7). The sums $ {\sum {\alpha ^{2} } } $, $ {\sum \beta }^{2} $ and $ {\sum {{\left( {\alpha \beta } \right)}} } $ converge within the first 20 terms. In the original model of Régnière and Bolstad (1994), $ \widetilde{A}_{\delta } $, $ \widetilde{B}_{\delta } $, $ \widetilde{A}_{\varepsilon } $, $ \widetilde{B}_{\varepsilon } $ and $ r_{{\delta \varepsilon }} $ were calculated from station latitude and time of year. Observed extreme minimum and maximum observations were also needed to calculate $ \widetilde{\sigma }_{\delta } $ and $ \widetilde{\sigma }_{\varepsilon } $, based on the analysis of data from the Atlantic area of the United States. In the present model, they are input station normals.

Contrary to most daily weather generators where daily precipitation is the basis for simulation of temperature values (Katz 1996), in this model air temperature is generated first, and precipitation is computed on the basis of daily temperatures, with the underlying assumption that precipitation is more likely on days with smaller temperature ranges. Total monthly precipitation (P) is generated by drawing a multiplicative random deviation ΔP ≥ 0 from a family of Weibull probability distribution functions entirely parameterized from $ \widetilde{\sigma }_{P} $:

$$ P = \Delta P \cdot \widetilde{P} $$

(8)

The probability of daily precipitation p(t) is generated from total monthly precipitation P, its variance $ \sigma ^{2}_{P} $ and the daily temperature range $ R{\left( t \right)} = T_{x} {\left( t \right)} - T_{n} {\left( t \right)} $. The value of p(t) is then compared with a uniformly distributed random number $ \psi \in \mathcal{U}{\left( {0,1} \right)} $. Whenever p(t) < ψ, p(t) is reset to 0. For days where p(t) > 0, the amount of daily precipitation P(t) is calculated as follows. The proportion x(t) of total monthly precipitation falling on day t is obtained from a Beta distribution specified entirely from the month’s $ \widetilde{\sigma }_{P} $ and that day’s precipitation is given by:

$$ P(t) = x(t)P $$

(9)

where $ {\sum\limits_t {x{\left( t \right)}} } = 1 $ within a month.

Data sources

Monthly normals $ {\left( {\widetilde{\mu }_{n} ,\widetilde{\mu }_{x} ,\widetilde{\sigma }_{\delta } ,\widetilde{\sigma }_{\varepsilon } ,\widetilde{A}_{\delta } ,\widetilde{B}_{\delta } ,\widetilde{A}_{\varepsilon } ,\widetilde{B}_{\varepsilon } ,\widehat{r}_{{\delta \varepsilon }} ,\widetilde{P}\;{\text{and}}\;\widetilde{\sigma }_{P} } \right)} $ for stations in North America were calculated over the 1961–1990 SNP from daily records contained in two databases. For Canada, data from Atmospheric Environment Service Daily Climate Data CD-ROMs http://www.msc-smc.ec.gc.ca/) were used. For the United States (excluding Alaska, Atlantic and Pacific Island states), the NCDC COOP Summary of the Day TD3200 database http://www.ncdc.noaa.gov/) was used. Only stations with at least 20 years of data for each month during the SNP were retained. For these stations, a month’s data were retained only if it contained at least 20 days of non-missing temperature or precipitation data. Total precipitation in months with missing data was adjusted (P=P _unadjusted×days in month/non-missing days in months).

Variation of monthly mean temperature and total precipitation

Annual variation of monthly mean temperature and precipitation was analyzed using data from four clusters of weather stations representing a wide range of North American climates: 156 in Quebec, 129 in British Columbia, 118 in California and 108 in Tennessee and North Carolina. Deviations of monthly mean temperature (μ) and total precipitation (P) from 30-year normals $\left[ {\widetilde\mu = {{\left( {\widetilde\mu _n + \widetilde\mu _x } \right)} \mathord{\left/ {\vphantom {{\left( {\widetilde\mu _n + \widetilde\mu _x } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}} \right]\;\text{and}\;\widetilde P$ were calculated:

$$ \begin{array}{*{20}l} {{\Delta \mu = \mu - \widetilde{\mu }} \hfill} \\ {{{\Delta P = P} \mathord{\left/ {\vphantom {{\Delta P = P} {\widetilde{P}}}} \right. \kern-\nulldelimiterspace} {\widetilde{P}}} \hfill} \\ \end{array} $$

(10)

To detect month-to-month autocorrelation, values of Δμ and ΔP were averaged by year and month over all stations in each cluster. This provided eight 360-month time series: four of Δμ and four of ΔP. Autocorrelation in each of these time series was calculated for lags of 1–12 months. Cross-correlations (without lag) between deviations in temperature and precipitation were also tested for significance.

The distribution of monthly precipitation deviations ΔP changed consistently from month to month, and the resulting patterns were different between station clusters. The Weibull probability distribution function (see Hastings and Peacock 1975) was chosen as a suitably flexible model to describe these patterns:

$$ p{\left( {\Delta P \leqslant X} \right)} = 1 - e^{{ - {\left[ {{{\left( {X + 0.05} \right)}} \mathord{\left/ {\vphantom {{{\left( {X + 0.05} \right)}} \lambda }} \right. \kern-\nulldelimiterspace} \lambda } \right]}^{\eta } }} $$

(11)

where X ≥ 0. The value 0.05 was added to X to ensure that months with no precipitation could have a non-zero probability of occurring (p > 0 when X = 0).

Equation 11 was fitted to the cumulative distribution of ΔP for each month, pooled separately over stations from each cluster. Parameters η and λ were estimated for the 48 distributions (12 months, for stations in Quebec, British Columbia, California and Tennessee/North Carolina) by non-linear least squares regression (SAS release 8.04, PROC NLIN). Parameter η was highly correlated to the monthly values of $ \overline{\sigma } _{P} = \tfrac{1} {n}{\sum {\widetilde{\sigma }_{P} } } $ averaged over all stations (by cluster and month). The relationship was expressed as:

$$\widehat{\eta } = a_{\eta } + b_{\eta } /\overline{\sigma } _{P} $$

(12)

where $\widehat{\eta }$ is an estimate of η given $ \overline{\sigma } _{P} $. Parameters a _η and b _η were estimated by linear regression. There was also a strong relationship between the values of η and λ, which we modeled with:

$$ \widehat{\lambda } = a_{\lambda } {\left( {1 - e^{{ - b_{\lambda } \widehat{\eta }}} } \right)}^{{c_{\lambda } }} $$

(13)

where $ \widehat{\lambda } $ is an estimate of λ; a _λ, b _λ and c _λ are parameters that were estimated by non-linear least squares regression. The probability distribution of total monthly precipitation for a given station can be determined from $\widetilde{P}$ and $ \widetilde{\sigma }_{P} $ with Eqs. 11, 12 and 13.

Generation of daily precipitation

For this analysis, daily air temperature and precipitation records were compiled from the four clusters of weather stations described above. Daily probabilities of precipitation were related through a purely empirical model to three variables: total monthly precipitation P, its interannual variability $ \widetilde{\sigma }_{P} $, and daily temperature range R(t). The distinction between wet and dry days is made on the basis of these probabilities. Total monthly precipitation is partitioned among wet days through a family of Beta distributions.

The probability of daily precipitation p(t) on day t was modeled with:

$$ g{\left( p \right)} = k_{0} + k_{1} \log _{{10}} {\left( {P + 1} \right)} + k_{2} \log _{{10}} W + k_{3} \widetilde{\sigma }_{P} $$

(14)

where

$$ g{\left( p \right)} = \ln {\left[ {p \mathord{\left/ {\vphantom {p {{\left( {1 - p} \right)}}}} \right. \kern-\nulldelimiterspace} {{\left( {1 - p} \right)}}} \right]} $$

(15)

is the logistic transform of p, and W is a Weibull probability density function of daily temperature range R(t):

$$ W = \frac{{\omega _{2} }} {{\omega _{1} }}{\left( {\frac{{R{\left( t \right)} + 1}} {{\omega _{1} }}} \right)}^{{\omega _{2} - 1}} e^{{ - {\left( {\frac{{R(t) + 1}} {{\omega _{1} }}} \right)}^{{\omega _{2} }} }} $$

(16)

with four parameters:

$$ \omega _1 = k_4 + k_5 \,\widetilde\sigma _p \;\text{and}\;\omega _2 = k_6 + k_7 \;\widetilde\sigma _P . $$

(17)

Equations 14–17 were fitted to observations in the following manner. Days with precipitation exceeding 0.1 mm were assigned a precipitation index of 1 (otherwise 0). Precipitation indices were averaged, providing values of p(t) for classes of P, R(t) and $ \widetilde{\sigma }_{P} $. P values were grouped into 5 mm/month classes from 0 to ≥250 mm/month. R(t) values were grouped into 1°C classes from ≤3 to ≥25°C. $ \widetilde{\sigma }_{P} $ values were grouped into 14 classes: 0–0.3, 0.3–0.4, ..., 0.9–1.0, 1.0–1.5, ..., 3.0–3.5, >3.5. Classes containing fewer than 10 observations were dropped, leaving a total of 9,821 distinct combinations of the independent variables. Parameters k ₀–k ₇ were estimated simultaneously by non-linear logistic regression analysis.

The distribution of the proportion of total monthly precipitation falling on a given rainy day t, the ratio P(t)/P where P(t) > 0.1 mm, was approximated by a family of Beta distributions Beta(v,w) where shape parameters v and w are estimated from the mean and uncorrected variance of the P(t)/P ratio (see Hastings and Peacock 1975). The data set was subdivided into 14 groups according to the values of $ \widetilde{\sigma }_{P} $ (classes listed above), v and w were calculated by month within each class and estimates of w were related to $ \widetilde{\sigma }_{P} $ (the average $ \widetilde{\sigma }_{P} $ within each group) with:

$$ w = m_{0} + \frac{{m_{1} }} {{{\left( {\overline{\sigma } _{P} + 1} \right)}^{{m_{2} }} }} $$

(18)

Parameters m ₀–m ₂ were estimated by non-linear least-squares regression. There was also a curvi-linear relationship between v and w expressed with:

$$ v = {m_{3} + m_{4} w + m_{5} } \mathord{\left/ {\vphantom {{m_{3} + m_{4} w + m_{5} } w}} \right. \kern-\nulldelimiterspace} w $$

(19)

where m ₃–m ₅ are parameters estimated by linear regression. From relationships in Eqs. 18 and 19, a Beta distribution of the P(t)/P ratio can be assigned for any value of $ \widetilde{\sigma }_{P} $.

Model validation

Data from one station in each of the four clusters used in model parameterization were used for a detailed comparison of observed and simulated patterns of monthly variation and distributions. These stations were selected because they had the least missing data: Summerland BC (49°34′N, 119°39′W), Quebec QC (45°48′N, 71°23′W), Button Willow CA (35°24’N, 119°28W) and Knoxville TN (35°49′N, 83°59′W) (Fig. 1). For each station, ten observed and simulated monthly statistics were compared: extreme and mean minimum and maximum temperature, the standard deviations of mean temperature and total precipitation, average daily precipitation and probability of precipitation, and the duration of droughts and hot spells. Droughts were defined as periods of successive days with precipitation < 0.1 mm. Hot spells were defined as periods of successive days with maximum temperature ≥30°C. Seven observed and simulated frequency distributions were compared: daily minimum and maximum temperatures, daily temperature range, frost-free period, duration of droughts and hot spells, and amount of daily precipitation.

Normals and daily data over the entire 1961–1990 SNP were obtained for a set of 708 North American weather stations. This set was selected at random by a thinning algorithm eliminating neighboring stations within 100 km of each other, so that selected stations were distributed as evenly as possible across Canada and the continental United States, conditional on station density. Station density was very low in northern Canada (Fig. 1). From the monthly normals of each station ($\widetilde{\mu }_{n} ,\widetilde{\mu }_{x} ,\widetilde{\sigma }_{\delta } ,\widetilde{\sigma }_{\varepsilon } ,\widetilde{A}_{\delta } ,\widetilde{B}_{\delta } ,\widetilde{A}_{\varepsilon } ,\widetilde{B}_{\varepsilon } ,\widetilde{r}_{{\delta \varepsilon }} ,\widetilde{P}\;{\text{and}}\;\widetilde{\sigma }_{P} $), daily temperature and precipitation data (300 years) were generated with the daily weather generator and simulator-generated monthly normals for each station were compiled. Six additional observed and simulated monthly statistics were calculated by station: mean extreme minimum and maximum temperatures, average daily temperature range, average daily precipitation and probability of precipitation, and the standard deviation of mean temperature, designated $ \sigma _{\mu } $. Observed and simulated annual frost-free periods were compiled. Five extreme events were extracted from observed and simulated frequency distributions compiled by station over the entire observation (30 years) or simulation (300 years) period: the coldest and hottest temperatures (5th percentile lowest minimum and 95th percentile highest maximum temperatures), the heaviest daily precipitation events (95th percentile), the longest droughts (95th percentile) and the longest hot spells (95th percentile).

Results

Data analysis

Autocorrelation

There was a weak but significant 1-month lag autocorrelation in the monthly mean deviations from normals temperature (Δμ) among stations from British Columbia (r = 0.229, P < 0.001), California (r = 0.159, P = 0.003), Tennessee/North Carolina (r = 0.131, P = 0.014), but not from Quebec (r = 0.067, P = 0.201). There was no significant autocorrelation in deviations of total monthly precipitation from normals (ΔP) among stations of any cluster normals at any time lag. Thus, the simulator was designed without a built-in autocorrelation structure in monthly deviations from monthly mean temperature or total precipitation. There was a weak correlation between deviations from monthly mean temperatures and deviations from monthly total precipitation among time series from British Columbia (r = −0.149, P < 0.001), Quebec (r = 0.045, P < 0.001) and California (r = −0.130, P < 0.001), but none from Tennessee/North Carolina (r = −0.004, P = 0.454). Because of the weakness and ambiguity of this association, no attempt was made to correlate average monthly temperature and total precipitation in the simulator.

Precipitation

Equations 12 and 13 described precisely the relationships between η and $ \overline{\sigma } _{P} $ (Fig. 2a), and λ and η (Table 1; Fig. 2b). The family of Weibull distribution functions (Eq. 11) generated by these relationships approximated quite well the patterns of monthly deviations of total precipitation from normals, ΔP, among data from stations from the four clusters: British Columbia (Fig. 3a), Quebec (Fig. 3b), California (Fig. 3c) and Tennessee/North Carolina (Fig. 3d).

Table 1 List of parameter estimates for Eqs. 12–19 of the simulator

Full size table

There was a clear diminishing-return relationship between the daily probability of precipitation p(t) and total monthly precipitation P (Fig. 4a). There was also a strong relationship between p(t) and daily temperature range R(t) (Fig. 4b), confirming the initial underlying assumption that the likelihood of precipitation is inversely proportional to R(t). These relationships were modulated strongly by variability of monthly precipitation ($ \widetilde{\sigma }_{P} $). Equation 14 provided a good description of these relationships (Table 1).

The Beta distribution captured well the changes in distribution of the ratio P(t)/P, the proportion of a month’s total precipitation falling on a given day, with respect to $ \widetilde{\sigma }_{P} $. Equations 18 and 19 accurately described the relationships between the v and w parameters of the Beta(v, w) distribution and $ \widetilde{\sigma }_{P} $ (Table 1; Fig. 5).

Model validation

Simulated and observed monthly temperature fluctuations ($ \widetilde{\mu }_{n} ,\widetilde{\mu }_{x} $ and mean extreme minimum and maximum) were nearly identical at the four stations selected for detailed examination (Summerland BC, Quebec QC, Knoxville TN and Button Willow CA; Fig. 6a). Seasonal fluctuations of the variance of mean monthly temperature and total precipitation ($ \sigma _{\mu } $ and $ \widetilde{\sigma }_{P} $) were also well mimicked by the model (Fig. 6b). However, the simulated values of $ \sigma _{\mu } $ tended to be lower than observed (closed circles, Fig. 6b). Simulated and observed daily precipitation averages were nearly identical (Fig. 6c). Monthly variations of the probability of simulated daily precipitation varied in good general agreement with observations (Fig. 6d). The simulated seasonal patterns of drought and hot-spell duration were very similar to observations, although simulated droughts tended to be shorter than observed in Button Willow CA (Fig. 6e,f).

The distributions of observed and simulated daily minimum and maximum temperatures and of daily ranges were nearly identical (Fig. 7a,b). The distribution of the frost-free period was well simulated by the model in three of the four stations (Fig. 7c). In Button Willow CA, observed frost-free periods were about 10 days longer than simulated. The distribution of hot spell duration was well mimicked by the model in all four locations (Fig. 7d). While the distributions of simulated and observed single-day precipitation amounts (Fig. 7e) and droughts (Fig. 7f) were quite similar in three of the four stations, both simulated distributions tended toward higher values than observed in Button Willow CA.

Over the 708 stations in the validation data set, 9 of the 11 monthly normals calculated from model output were for all practical purposes identical to observed normals in terms both of means and variances ($\widetilde{\mu }_{n} ,\widetilde{\mu }_{x} ,\widetilde{\sigma }_{\delta } ,\widetilde{\sigma }_{\varepsilon } ,\widetilde{A}_{\delta } ,\widetilde{B}_{\delta } ,\widetilde{A}_{\varepsilon } ,\widetilde{B}_{\varepsilon } ,\widetilde{r}_{{\delta \varepsilon }} $; Table 2). Simulated total monthly precipitation $\widetilde{P}$ was slightly lower than observed, with a mean error (ME) of 0.3 ± 2.7 mm/month and a mean absolute error (MAE) of 1.7 ± 1.9 mm/month. However, these differences were very small (0.5% on average) in comparison to the observed average monthly precipitation of 63.6 ± 47.8 mm/month and the correlation between observed and average total monthly precipitation was very high (r = 0.998). While the ME and MAE between observed and simulated variation of monthly total precipitation $ \widetilde{\sigma }_{P} $ were very small (Table 2), the correlation between them was not as high as for other normal statistics (r = 0.984). This indicates that the model realistically mimics this variation, but does not do this exactly, especially for higher values of $ \widetilde{\sigma }_{P} $ (Fig. 8a).

Table 2 Comparison of observed and simulated statistics from 708 randomly-selected weather stations

Full size table

The model generated accurate average monthly mean extreme minimum temperatures (Table 2) that were nearly unbiased (ME:−0.3 ± 0.9°C; MAE: 0.7 ± 0.7°C) and were highly correlated with observations (r = 0.998). Simulated average monthly maximum temperature was clearly biased (ME:1.1 ± 1.0°C, MAE 1.2 ± 1.2°C), despite a high correlation between observed and simulated values (r = 0.996; Fig. 8b). To explain this bias, the distributions of observed and simulated deviations of maximum temperature from daily averages, ɛ(t) in equation [2], were compared for the four stations selected for detailed analysis. While the distributions of simulated deviations were nearly normal, observed deviations had lower than expected frequencies at the higher extremes (> 98th percentile), in all four stations and throughout the year (Fig. 9). This difference in distributions between simulated and observed ɛ(t) is the cause of the bias in mean monthly extreme maximum temperatures output by the model. Limiting the values of ɛ(t) to the range $ \pm 1.96\,\widetilde{\sigma }_{\varepsilon } $ in the calculation of T _x(t) (Eq. 2) corrects this bias very efficiently without impacting other output characteristics.

The standard deviations of mean monthly temperature $ \sigma _{\mu } $ generated by the model were smaller than observed (ME:−0.3 ± 0.3; MAE:0.3 ± 0.3; Table 2), although the correlation between observed and simulated values was high (r = 0.966). This bias represents a 15% difference on average (Fig. 8c) and was most pronounced at higher values of $ \sigma _{\mu } $ (Fig. 8c). In the model, this variation results entirely from the variances $ {\left( {\widetilde{\sigma }_{\varepsilon } ,\widetilde{\sigma }_{\delta } } \right)} $, autocorrelation $ {\left( {\widetilde{A}_{\delta } ,\widetilde{B}_{\delta } ,\widetilde{A}_{\varepsilon } ,\widetilde{B}_{\varepsilon } } \right)} $ and cross-correlation $ {\left( {\widetilde{r}_{{\delta \varepsilon }} } \right)} $ structure of the stochastic processes δ and ɛ in Eq. 3. There was a strong spatial pattern in observed values of $ \sigma _{\mu } $ over the North American continent, with the higher values occurring in the Prairie Provinces and around Hudson’s Bay in Canada (Fig. 10a). The differences between model output and observations being proportional to $ \sigma _{\mu } $, they had a very similar spatial pattern (Fig. 10b).

The average daily ranges and daily precipitation generated by the model were nearly identical to observations, with no bias and very high correlation coefficients (Table 2). However, average simulated daily probability of precipitation was lower than observed (ME: 0.020 ± 0.083; MAE: 0.067 ± 0.052; Table 2). Because of the high variability of daily precipitation probability averaged by month and station, the correlation between simulated and observed means was only 0.84 (Fig. 8d). The daily probability of precipitation in North America is strongly patterned: it is highest in the Pacific Northwest and the Northeast (especially northern Ontario and Quebec), and lowest in the Southwest (Fig. 10c). Differences between model output and observations were also patterned spatially (Fig. 10d). The largest negative differences (simulated < observed) were located in the mountains of the West, the Canadian Prairies and at the latitude of James Bay in eastern Canada. Positive differences were concentrated in the eastern half of the continent, south of the Great Lakes, and along the Southern half of Canada’s Atlantic Coast.

The model generated highly realistic distributions of extreme temperatures (5th percentile lowest daily minimum and 95th percentile highest daily maximum). The values of these two statistics generated by the model were highly correlated with observations, with low ME and MAE (Table 2). Average frost-free periods were also quite well predicted, with no bias and a small MAE of 4.9 ± 4.5 days (Fig. 8e). The model predicted reasonably well the duration of extreme hot spells (95th percentile longest) with an MAE of only 2.6 ± 3.5 days, relative to the observed average of 15.0 ± 18.8 days. The correlation between simulated and observed extreme hot spells was high (r = 0.979; Fig. 8f). However, the model did not perform so well in predicting extreme daily precipitation (95th percentile highest values) or extreme drought durations (95th percentile longest). The model generated extreme precipitation events that were only slightly larger than observed on average (ME: 2.7 ± 9.5 mm/day), but the the MAE was quite high (6.9 ± 7.0 mm/day) relative to the observed average of 25.0 ± 12.0 mm/day (Fig. 8g). Extreme precipitation events have a strong spatial pattern in North America, and are largest in the Pacific Northwest and around the Gulf of Mexico (Fig. 10e). The model tended to predict heavier extreme precipitation than observed in the western half of the United States, and lighter than observed in the Southeast (Fig. 10f). Average predicted extreme drought duration was unbiased (ME: 0.1 ± 7.5 days) but absolute differences between model output and observations were high (MAE: 4.8 ± 5.8 days) compared to the observed average (18.1 ± 13.5 days; Fig. 8h). The model underestimated the extreme summer droughts observed in the southwestern corner of the continent, but overestimated those over the high-elevation terrain of much of the western United States (Fig. 10g,h).

Discussion

Two of the objectives of this modification of Régnière and Bolstad’s (1994) TEMPGEN algorithm were to generalize it to the entire North American continent, north of Mexico, and to generate realistic interannual variability in monthly means. These objectives were achieved in two ways: (1) by replacing empirical relationships between station latitude and the variance, autocorrelation and cross correlation of daily minimum and maximum temperature time series by input station-specific monthly statistics ${\left( {\widetilde{\sigma }_{\delta } ,\widetilde{\sigma }_{\varepsilon } ,\widetilde{A}_{\delta } ,\widetilde{B}_{\delta } ,\widetilde{A}_{\varepsilon } ,\widetilde{B}_{\varepsilon } ,\,\widetilde{r}_{{\delta \varepsilon }} } \right)}$; and (2) by generalizing the stochastic processes (Eq. 3) that generate the daily fluctuations around normal monthly (mean) temperatures $ {\left( {\widetilde{\mu }_{n} ,\widetilde{\mu }_{x} } \right)} $. Comparison of the characteristics of observed and simulated daily temperature time series indicates that these objectives have been met successfully in terms of both seasonality and accuracy. The model also produces highly realistic frost-free periods and extreme events such as very low or very high temperatures and hot spells.

The model-generated interannual variation of monthly mean temperatures, which results entirely from the stochastic processes generating daily minimum and maximum temperature time series, is 15% smaller than observed variability. This suggests that other sources of variability are at play in observed temperature fluctuations, such as low-frequency fluctuations including the El Niño/Southern Oscillation (Philander 1990), the North Atlantic Oscillation (Watanabe and Nitta 1999) and the Pacific Decadal Oscillation (Latif and Barnett 1994). No attempt has been made at this point to include such synoptic low-frequency sources of temperature variability in TEMPGEN. Yet low-frequency interannual variability can have important consequences in ecological systems and needs to be taken into consideration in the generation of daily weather data (Hansen and Mavromatis 2001; Mavromatis and Hansen 2001; Dubrovský et al. 2004). One simple way to achieve this without recourse to models of those complex climate oscillation processes would be to apply a suitably small amount of stochastic annual variation simultaneously to monthly mean minimum and maximum temperatures $ {\left( {\widetilde{\mu }_{n} ,\widetilde{\mu }_{x} } \right)} $.

The third objective of this work was to add daily precipitation to TEMPGEN’s output. Many of the daily weather generators developed since the 1950s (Bruhn 1980; Richardon 1981; Racsko et al. 1991; Johnson et al. 1996; Kimball et al. 1997; Friend 1998; Skiles and Richardson 1998; Jones and Thornton 2000; Yu 2003; Hansen and Ines 2005) are based on the simulation of precipitation using Markov-chain transitions between wet and dry states (Chin 1977), and Alpha distributions of the daily amount of precipitation. The TEMPGEN generator here is not based on a distinction between wet and dry days. It approaches precipitation on the basis of an empirical relationship between probability of precipitation and daily temperature range. Rather than describing transitions between wet and dry periods, TEMPGEN’s wet and dry spells are the result of the autocorrelation and cross-correlation structure of minimum and maximum daily temperatures. The distribution of dry spell durations is an emergent property, conditioned by expected total monthly rainfall and its interannual variability, both station-specific model inputs $ {\left( {\widetilde{P},\widetilde{\sigma }_{P} } \right)} $. The variability $ \widetilde{\sigma }_{P} $ is used to generate stochastic values of total monthly precipitation. From these and daily temperature ranges, the daily probability of precipitation is calculated. The proportion of total monthly precipitation falling on any given day is a Beta-distributed random variable with parameters entirely specified by the input value of $ \widetilde{\sigma }_{P} $. In very dry climates (high $ \widetilde{\sigma }_{P} $), such as in the southwestern United States in summer, most monthly precipitation falls within a few days. In less variable climates (low $ \widetilde{\sigma }_{P} $), precipitation is more evenly distributed. TEMPGEN generates realistic amounts and seasonal patterns of monthly precipitation. The average daily amount of precipitation is also accurately simulated. However, the daily probability of precipitation generated by the model is slightly lower than observed (by about 2%). The model generates fairly accurate drought durations, except in the driest climates where it underestimates extreme drought durations. The extreme daily precipitation events generated by the model are unbiased, but their correlation with observations is low (r = 0.0677). It must be noted, however, that these extreme precipitation events are rare and that observations used for comparison with model output were based on only 30 years of data per station.

Many authors have used daily weather generators as a means of down-scaling the outputs of Global Circulation Models (CGMs) for climate change impact studies (Oelschlägel 1995; Bürger 1997; Dubrovský 1997; Semenov and Barrow 1997). This involves summarizing CGM outputs and generating daily weather from the resulting distributions (Qian et al. 2005), or altering monthly statistics by fixed amounts (Mearns et al. 1997). One of the problems with these approaches is that CGM outputs often do not reflect many of the statistical characteristics of actual daily weather fluctuations, especially precipitation patterns at regional scales. This is a particularly problematic issue with respect to dry-wet day models, as discussed by Katz (1996). Many daily weather generators require calibration for use at specific locations (Soltani and Hoogenboom 2003; Fowler et al. 2005). The TEMPGEN algorithm has the main advantage of requiring no input data other than the 11 sets of station-specific monthly normals. It can be used to generate daily minimum and maximum temperature and precipitation time series for any location in Canada or the continental United States when provided with local climate normals (whether actual or as a synthesis of CGM output). To provide such normals for locations not close to a weather station or at a finer spatial scale than CGM output, several spatial disaggregation methods are available, such as application of regional climate models or spatial interpolation on the basis of a network of weather station records and topography (Hutchinson 1991, 1995b; Gignac 2000; Price et al. 2000).

References

Bruhn JA (1980) A stochastic model for the simulation of daily weather. Prot Ecol 2:199–208
Google Scholar
Bürger G (1997) On the disaggregation of climatological means and anomalies. Clim Res 8:183–194
Google Scholar
Chin EH (1977) Modelling daily precipitation occurrence process with Markov chain. Water Resour Res 13:949–956
Google Scholar
Dubrovský M (1997) Creating daily weather series with use of the weather generator. Environmetrics 8:409–424
Article Google Scholar
Dubrovský M, Buchtele J, Žalud Z (2004) High-frequency and low-frequency variability in stochastic daily weather generator and its effect on agricultural and hydrologic modelling. Clim Change 63:145–179
Article Google Scholar
Fowler HJ, Kilsby CG, O’Connell PE, Burton A (2005) A weather-type conditioned multi-site stochastic rainfall model for the generation of scenarios of climatic variability and change. J Hydrol 308:50–66
Article Google Scholar
Friend AD (1998) Parameterisation of a global daily weather generator for terrestrial ecosystem modelling. Ecol Model 109:121–140
Article Google Scholar
Comparaison de la régression spatiale et du krigeage avec dérive pour interpoler les extrants de modèles de simulation de développement d’insectes au Québec en fonction de l’échelle, de la topographie et de l’influence maritime. M.Sc. Thesis, Faculté de foresterie et géomatique, Université Laval, Québec, Québec, Canada
Gray DR (2004) The gypsy moth life stage model: landscape-wide estimates of gypsy moth establishment using a multi-generational phenology model. Ecol Model 176:155–171
Article Google Scholar
Hansen JW, Ines AVM (2005) Stochastic disaggregation of monthly rainfall data for crop simulation studies. Agric For Meteorol 131:233–246
Article Google Scholar
Hansen JW, Mavromatis T (2001) Correcting low-frequency variability bias in stochastic weather generators. Agric For Meteorol 109:297–310
Article Google Scholar
Hastings NAJ, Peacock JB (1975) Statistical distributions: A handbook for students and practitioners. Butterworth, London
Google Scholar
Hutchinson MF (1991) The application of thin-plate smoothing splines to continent-wide data assimilation. In: Jasper JD (ed) Data assimilation systems. BMRC Res Rep No. 27, Melbourne, pp 104–113
Google Scholar
Hutchinson MF (1995a) Stochastic space-time weather models from ground-based data. Agric For Meteorol 73:237–264
Article Google Scholar
Hutchinson MF (1995b) Interpolating mean rainfall using thin-plate smoothing splines. Int J GIS 9:385–403
Google Scholar
Johnson GL, Hanson CL, Hardegree SP, Ballard EB (1996) Stochastic weather simulation: overview and analysis of two commonly used models. J Appl Meteorol 35:1878–1896
Article Google Scholar
Jones PG, Thornton PK (2000) MarkSim: software to generate daily weather data for Latin America and Africa. Agron J 92:445–453
Article Google Scholar
Katz RW (1996) The use of stochastic models to generate climate change scenarios. Clim Change 32:237–255
Article Google Scholar
Kimball JS, Thornton PE, White MA, Running SW (1997) Simulating forest productivity and surface-atmosphere carbon exchange in the BOREAS study region. Tree Physiol 17:589–599
PubMed CAS Google Scholar
Lasch P, Lindner M, Erhard M, Suckow F, Wenzel A (2002) Regional impact assessment on forest structure and functions under climate change - the Brandenburg case study. For Ecol Manag 162:73–86
Article Google Scholar
Latif M, Barnett TP (1994) Causes of decadal climate variability over the North Pacific and North America. Science 266:634–637
Article CAS PubMed Google Scholar
Logan JA, Régnière J, Powell JA (2003) Assessing the impacts of global warming on forest pest dynamics. Front Ecol Environ 1:130–137
Article Google Scholar
Mavromatis T, Hansen JW (2001) Interannual variability characteristics and simulated crop response of four stochastic weather generators. Agric For Meteorol 109:283–296
Article Google Scholar
Mearns LO, Rosenzweig C, Goldberg R (1997) Mean and variance change in climate scenarios: methods, agricultural applications, and measures of uncertainty. Clim Change 35:367–396
Article Google Scholar
Oelschlägel B (1995) A method for downscaling global climate model calculations by a statistical weather generator. Ecol Model 82:199–204
Article Google Scholar
Philander SGH (1990) El Niño, La Niña and the Southern Oscillation. Academic, San Diego CA
Google Scholar
Prentice IC, Sykes MT, Cramer W (1993) A simulation model for the transient effects of climate change on forest landscapes. Ecol Model 65:51–70
Article Google Scholar
Price DT, McKenney DW, Nalder IA, Hutchinson MF, Kesteven JL (2000) A comparison of two statistical methods for spatial interpolation of Canadian monthly mean climate data. Agric For Meteorol 101:81–94
Article Google Scholar
Qian B, Hayhoe H, Gameda S (2005) Evaluation of the stochastic weather generators LARS-WG and AAFC-WG for climate change impact studies. Clim Res 29:3–21
Google Scholar
Racsko P, Szeidl L, Semonov M (1991) A serial approach to local stochastic weather models. Ecol Model 57:27–41
Article Google Scholar
Régnière J (1996) A generalized approach to landscape-wide seasonal forecasting with temperature-driven simulation models. Environ Entomol 25:869–881
Google Scholar
Régnière J, Bolstad P (1994) Statistical simulation of daily air temperature patterns in eastern North America to forecast seasonal events in insect pest management. Environ Entomol 23:1368–1380
Google Scholar
Régnière J, Nealis V (2002) Modelling seasonality of gypsy moth, Lymantria dispar (Lepidoptera: Lymantriidae), to evaluate probability of its persistence in novel environments. Can Entomol 134:805–824
Article Google Scholar
Richardon CW (1981) Stochastic simulation of daily precipitation, temperature and solar radiation. Water Resour Res 17:182–190
Article Google Scholar
Richardson CW, Wright DA (1984) WGEN: A model for generating daily weather variables. United States Department of Agriculture, Washington, ARS–8
Semenov MA, Barrow EM (1997) Use of a stochastic weather generator in the development of climate change scenarios. Clim Change 35:397–414
Article Google Scholar
Semenov MA, Brooks RJ, Barrow EM, Richardson CW (1998) Comparison of the WGEN and LARS-WG stochastic weather generators for diverse climates. Clim Res 10:95–107
Google Scholar
Skiles JW, Richardson CW (1998) A stochastic weather generation model for Alaska. Ecol Model 110:211–232
Article Google Scholar
Soltani A, Hoogenboom G (2003) Minimum data requirements for parameter estimation of stochastic weather generators. Clim Res 25:109–119
Google Scholar
Wallis TWR, Griffiths JF (1995) An assessment of the weather generator (WXGEN) used in the erosion /productivity impact calculator (EPIC). Agric For Meteorol 73:115–133
Article Google Scholar
Watanabe M, Nitta T (1999) Decadal changes in the atmospheric circulation and associated surface climate variations in the Northern Hemisphere winter. J Clim 12:494–510
Article Google Scholar
Wilks DS (1999) Simultaneous stochastic simulation of daily precipitation, temperature and solar radiation at multiple sites in complex terrain. Agric For Meteorol 96:85–101
Article Google Scholar
Yu B (2003) An assessment of uncalibrated CLIGEN in Australia. Agric For Meteorol 119:131–148
Article Google Scholar

Download references

Acknowledgements

Thanks to M. Lindner for ideas in the development of this model. Thanks also to D. Houle, D.W. McKenney and two anonymous reviewers for very helpful comments on an earlier draft. This work was funded in part by members of SERG-International, including Forest Protection Limited, the Quebec SOPFIM, the Province of Ontario, and the USDA Forest Service.

Author information

Authors and Affiliations

Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, 1055 PEPS Street, P.O. Box 10380, Stn. Sainte-Foy, Quebec, QC, G1V 4C7, Canada
Jacques Régnière & Rémi St-Amant

Authors

Jacques Régnière
View author publications
You can also search for this author in PubMed Google Scholar
Rémi St-Amant
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jacques Régnière.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Régnière, J., St-Amant, R. Stochastic simulation of daily air temperature and precipitation from monthly normals in North America north of Mexico. Int J Biometeorol 51, 415–430 (2007). https://doi.org/10.1007/s00484-006-0078-z

Download citation

Received: 24 May 2006
Revised: 01 November 2006
Accepted: 05 December 2006
Published: 16 January 2007
Issue Date: May 2007
DOI: https://doi.org/10.1007/s00484-006-0078-z

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Stochastic simulation of daily air temperature and precipitation from monthly normals in North America north of Mexico

Abstract

Similar content being viewed by others

Stochastic investigation of daily air temperature extremes from a global ground station network

Estimating daily climatological normals in a changing climate

Stochastic generators of multi-site daily temperature: comparison of performances in various applications

Introduction