1 Introduction

Research on air temperature has gained considerable importance in recent years because of its influence on natural systems and human activities, such as crop growth (Verdoodt et al. 2004; Bechini et al. 2006), agro-ecological zoning (Caldiz et al. 2001; Ye et al. 2008), and food security assessment (Ye and Van Ranst 2002, 2009; Ye et al. 2013). Moreover, high temperatures can cause an increase in death rates, especially when values rise above critical thresholds (Kunst et al. 1993; Curriero et al. 2002; Hajat et al. 2002; Keellings and Waylen 2012).

Stochastic modeling and simulation of daily meteorological data has been a prominent subject in the literature for several decades. A classical approach to the analysis of temperature data, based on the procedures presented by Yevjevich (1972), was proposed by Richardson (1981), who treated maximum and minimum temperatures as continuous multivariate stochastic processes. The stochastic models generally used to reproduce climatic and hydrological series are the autoregressive (AR), moving average (MA), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models (Box and Jenkins 1976; Grimaldi et al. 2005). A limitation of these models is that they capture only short-range dependence and thus lack the flexibility to reproduce the combined effect of short and long memory (Box and Jenkins 1976). Ever since Hurst (1951) detected long-term persistence while studying Nile River levels, the need for long memory in time series modeling has been recognized; indeed, long-range dependence has been encountered in various hydrological data (Lye and Lin 1994; Pelletier and Turcotte 1997; Koscielny-Bunde et al. 2006; Ehsanzadeh and Adamowski 2010). Doukhan et al. (2003) showed that long-range dependent processes are characterized by a hyperbolic decay of the autocorrelation function and are closely related to self-similarity. More recently, Prass et al. (2012) found that long-range dependence may affect the performance of time series models with a short time step. Incorporating long-range dependence into time series modeling is also conceptually important, since a model should capture the behavior of the data as realistically as possible. To this aim, Granger and Joyeux (1980) and Hosking (1981) proposed the fractionally differenced ARIMA models (FARIMA or ARFIMA) as an extension of the ARIMA models. The differencing order of a FARIMA model can be fractional, so these models can capture both short- and long-memory behavior with few parameters by varying the autoregressive and moving average components (Montanari et al. 1997; Lohre et al. 2003; Prass et al. 2012). This is an advantage of the FARIMA framework, because other long-memory models capable of reproducing the Hurst phenomenon, such as fractional Gaussian noise, offer no flexibility in the choice of the short-memory autocorrelation structure (Koutsoyiannis 2002). Indeed, fractionally integrated time series models have been shown to be much more accurate than traditional autoregressive models employing a similar number of parameters (Caballero et al. 2002).

Many studies on FARIMA models have appeared in the literature. Bisaglia and Grigoletto (2001) proposed a bootstrap-based method to construct prediction intervals for FARIMA processes. Rupasinghe and Samaranayake (2012) and Rupasinghe et al. (2014) introduced a simpler alternative method, based on the sieve bootstrap approach of Alonso et al. (2002, 2003). Several contributions have been proposed in hydrology (e.g., Hosking 1984), climatology (e.g., Baillie and Chung 2002), and temperature analysis (e.g., Smith 1993). Regarding hydrological studies in particular, Montanari et al. (1997) proposed a FARIMA model for the analysis of monthly and daily inflows of Lake Maggiore (Italy). Montanari et al. (2000) applied a special form of the generalized FARIMA process to the Nile River monthly flows at Aswan, combining the generalized FARIMA approach with a multiplicative ARIMA approach, which allowed them to model seasonal and non-seasonal long and short memory. Sheng and Chen (2011) developed a new model, based on FARIMA with stable innovations, to analyze the data and predict future elevation levels of the Great Salt Lake. Yang and Bowling (2014) used a FARIMA model to estimate the long memory in daily stream flow for basins in the Upper Great Lakes region.

In this paper, a stochastic model adopting the FARIMA approach is proposed for the analysis of daily temperature. The model comprises three steps: deseasonalization of the data by means of a truncated Fourier series expansion; normalization of the data through the transformations introduced by Johnson (1949); and analysis of the correlation structure through a FARIMA model. The model was applied to a maximum temperature series and verified on four other southern Italian gauges. As an example of possible applications, some features of the maximum daily temperature have been evaluated by means of a Monte Carlo simulation procedure.

2 Stochastic modeling of daily temperature

Let us denote by i = 0, 1, 2, ... the days from a generic starting point, and define T(i) as the temperature characterizing the generic day i. The T(i) values can be the maximum, \( T_{\max}(i) \), the minimum, \( T_{\min}(i) \), or the mean temperature, \( T_{\mathrm{mean}}(i) \), of the day. A sequence of T(i) observations can be regarded as a realization of a discrete-parameter stochastic process exhibiting cyclostationarity, with period D equal to one year (D = 365.25 days).

2.1 Deseasonalization

Temperature is not a stationary variable (Jewson and Caballero 2003; Campbell and Diebold 2005); its non-stationarity can be caused by a deterministic seasonal component or by a monotonic trend. Assuming the trend component to be zero, the T(i) process can be reduced to a weakly stationary standardized process, Y(i), through the transformation (Grimaldi 2004; Montanari et al. 1997, 2000; Prass et al. 2012)

$$ Y(i)=\frac{T(i)-{\mu}_T(i)}{\sigma_T(i)} $$
(1)

where \( \mu_T(i) \) and \( \sigma_T(i) \) are the mean and the standard deviation functions of the T(i) process, respectively. The mean and the variance functions can be described by means of the truncated Fourier series

$$ {\mu}_T(i)=\frac{1}{2}{a}_{\mu, 0}+{\displaystyle \sum_{j=1}^{n_{h,\mu }}\left[{a}_{\mu, j} \cos \left(\frac{2\pi\;j}{D}i\right)+{b}_{\mu, j} \sin \left(\frac{2\pi\;j}{D}i\right)\right]} $$
(2)
$$ {\sigma}_T^2(i)=\frac{1}{2}{a}_{\sigma^2,0}+{\displaystyle \sum_{j=1}^{n_{h,{\sigma}^2}}\left[{a}_{\sigma^2,j} \cos \left(\frac{2\pi\;j}{D}i\right)+{b}_{\sigma^2,j} \sin \left(\frac{2\pi\;j}{D}i\right)\right]} $$
(3)

in which \( n_{h,\mu} \) and \( n_{h,\sigma^2} \) are the numbers of harmonics, while \( a_{\mu,0}, a_{\mu,j}, b_{\mu,j} \) and \( {a}_{\sigma^2,0},{a}_{\sigma^2,j},{b}_{\sigma^2,j} \) are the coefficients of the Fourier expansions of the mean and the variance functions, respectively.

Given a sample of observed temperatures, \( t(i_k) \), with k = 1, 2, ..., K, corresponding to the days \( i_1, i_2, \ldots, i_K \), the functions \( \mu_T(i) \) and \( {\sigma}_T^2(i) \) can be estimated through their sample analogues, \( m_T(i) \) and \( {s}_T^2(i) \).

By using the least squares method, the Fourier coefficients of the mean function can be estimated by minimizing the following function:

$$ {S}_{\mu}^2\left({a}_{\mu, 0},{a}_{\mu, 1},{b}_{\mu, 1},\dots \right)={\displaystyle \sum_{k=1}^K{\left\{t\left({i}_k\right)-\frac{1}{2}{a}_{\mu, 0}-{\displaystyle \sum_{j=1}^{n_{h,\mu }}{f}_{\mu}\left({i}_kj;{a}_{\mu, j},{b}_{\mu, j}\right)}\right\}}^2} $$
(4)

where \( f_{\mu}\left({i}_k j; a_{\mu,j}, b_{\mu,j}\right) = a_{\mu,j}\cos\left(2\pi\,{i}_k j/D\right) + b_{\mu,j}\sin\left(2\pi\,{i}_k j/D\right) \).

Analogously, the coefficients of the Fourier expansion of the variance function can be estimated using the same sample, by minimizing the following function:

$$ {S}_{\sigma^2}^2\left({a}_{\sigma^2,0},{a}_{\sigma^2,1},{b}_{\sigma^2,1},\cdots \right)={\displaystyle \sum_{k=1}^K{\left\{{\left[t\left({i}_k\right)-{\mu}_T\left({i}_k\right)\right]}^2-\frac{1}{2}{a}_{\sigma^2,0}-{\displaystyle \sum_{j=1}^{n_{h,{\sigma}^2}}{f}_{\sigma^2}\left({i}_kj;{a}_{\sigma^2,j},{b}_{\sigma^2,j}\right)}\right\}}^2} $$
(5)

where \( {f}_{\sigma^2}\left({i}_kj;{a}_{\sigma^2,j},{b}_{\sigma^2,j}\right)={a}_{\sigma^2,j} \cos \left(2\pi\;{i}_kj/D\right)+{b}_{\sigma^2,j} \sin \left(2\pi\;{i}_kj/D\right) \).
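As an illustration of this step, the following minimal Python sketch fits the truncated Fourier series of Eqs. (2)-(3) by least squares, as in Eqs. (4)-(5); it is not part of the original formulation, the function and array names are hypothetical, and it assumes the observed days and temperatures are available as NumPy arrays (missing days simply being absent from the arrays).

```python
import numpy as np

def fit_fourier(day_index, values, n_harmonics, period=365.25):
    """Least squares estimate of the coefficients of Eqs. (2)-(3),
    obtained by minimizing Eq. (4) (or Eq. (5) for the variance)."""
    cols = [0.5 * np.ones(len(day_index))]          # carries the (1/2) a_0 term
    for j in range(1, n_harmonics + 1):
        cols.append(np.cos(2.0 * np.pi * j * day_index / period))
        cols.append(np.sin(2.0 * np.pi * j * day_index / period))
    A = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
    return coeffs, A @ coeffs                       # [a0, a1, b1, ...], fitted values

# mean function: fit on the raw temperatures;
# variance function: fit on the squared residuals, as in Eq. (5)
# coeffs_mu, mu_fit = fit_fourier(days, t_obs, n_harmonics=2)
# coeffs_s2, s2_fit = fit_fourier(days, (t_obs - mu_fit) ** 2, n_harmonics=2)
```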

If the temporal span of the sample is a multiple of the period D and the series has no missing days, trigonometric interpolation theory provides the coefficient estimates in explicit form (Prass et al. 2012). The numbers of harmonics (\( {\widehat{n}}_{h,\mu } \) and \( {\widehat{n}}_{h,{\sigma}^2} \)) must be estimated by ensuring the absence of periodicity in the process Y(i), while respecting the criterion of parameter parsimony. The Y(i) data obtained through the deseasonalization of \( t(i_k) \) are generally correlated, but the tests employed for the estimation of the number of harmonics require a random sample. Therefore, for each prefixed pair \( n_{h,\mu} \) and \( n_{h,\sigma^2} \), a transformed random subsample of \( y(i_k) \), namely, \( y\left({i}_{k_{*}}\right)=\left[t\left({i}_{k_*}\right)-{\mu}_T\left({i}_{k_*}\right)\right]/{\sigma}_T\left({i}_{k_*}\right) \) with \( k_* = 1, 2, \ldots, K_* \), can be created. This subsample is obtained by extracting values sampled every δ days (temporal span), with δ long enough to limit the effect of stochastic dependence while still ensuring a reliable sample length. The subsample is subdivided into M classes, assigning to the generic mth class the \( N_m \) values of \( y\left({i}_{k_{*}}\right) \) for which it holds

$$ \left(m-1\right)/M\le \left({i}_{k_*} \mod D\right)/D<m/M\kern0.5em m=1,2,\ldots, M $$
(6)

Denoting by \( \mu_{Y,m} \) and \( {\sigma}_{Y,m}^2 \) the mean and the variance of each class, it is possible to test the hypotheses \( H_{0,\mu}: \mu_{Y,1} = \mu_{Y,2} = \ldots = \mu_{Y,M} = \mu_Y \) and \( {H}_{0,{\sigma}^2}:{\sigma}_{Y,1}^2={\sigma}_{Y,2}^2=\ldots ={\sigma}_{Y,M}^2={\sigma}_Y^2 \). The hypothesis \( H_{0,\mu} \) is tested through the statistic \( {S}_V^2 \), approximately distributed according to a Fisher variance-ratio law \( v^2(f_1, f_2) \). The hypothesis \( {H}_{0,{\sigma}^2} \) is tested through Bartlett's test (Snedecor and Cochran 1989), based on the statistic \( {S}_B^2 \), approximately distributed according to a \( \chi^2(f_1) \) law. Both procedures are detailed in Appendix 1.
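A hedged sketch of this testing procedure is given below; it uses SciPy's one-way F test and Bartlett test as stand-ins for the \( {S}_V^2 \) and \( {S}_B^2 \) statistics detailed in Appendix 1, and the function and variable names are hypothetical.

```python
import numpy as np
from scipy import stats

def periodicity_tests(y, day_index, delta=8, M=12, period=365.25):
    """Subsample every `delta` days, split into M within-year classes
    (Eq. (6)), then test constancy of class means and variances."""
    y_s, i_s = y[::delta], day_index[::delta]
    phase = np.mod(i_s, period) / period          # position within the year, in [0, 1)
    classes = [y_s[(phase >= (m - 1) / M) & (phase < m / M)] for m in range(1, M + 1)]
    classes = [c for c in classes if len(c) > 1]
    _, p_mean = stats.f_oneway(*classes)          # H0: equal class means
    _, p_var = stats.bartlett(*classes)           # H0: equal class variances
    return p_mean, p_var  # both > 0.05 -> the chosen harmonics removed the periodicity
```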

2.2 Gaussianization procedure

The Gaussianization procedure (e.g., Chen and Gopinath 2000; Hólm et al. 2002; Servidio et al. 2011) is necessary to ensure the coherence of the linear stochastic model. Here, the sample values \( y(i_k) \) of the random variable Y have a null mean, \( m_Y = 0 \), and a unit variance, \( {s}_Y^2=1 \), but generally show skewness (\( g_{1,Y} \)) and kurtosis (\( g_{2,Y} \)) coefficients that differ significantly from the theoretical values expected for a Gaussian variable (\( \gamma_{1,Y} = 0 \) and \( \gamma_{2,Y} = 3 \), respectively). In this case, it is possible to transform the original variable, Y, into a standardized Gaussian variable, Z = f(Y). For this purpose, the transformation functions introduced by Johnson (1949) are well suited,

$$ Z=\eta +\theta \ln \left[{f}_Y^{\left(\cdot\right)}\left(y;\alpha, \beta \right)\right] $$
(7)

where − ∞  < η <  + ∞, θ > 0, − ∞  < α <  + ∞, and β > 0 are the parameters of the transformation and \( {f}_Y^{\left(\cdot \right)}\left(y;\alpha, \beta \right) \) can take one of the following forms:

$$ \begin{array}{cc}\hfill {f}_Y^{(U)}\left(y;\alpha, \beta \right)=\left(y-\alpha \right)/\beta +\sqrt{1+{\left(y-\alpha \right)}^2/{\beta}^2}\hfill & \hfill -\infty <y<+\infty \hfill \end{array} $$
(8)
$$ \begin{array}{cc}\hfill {f}_Y^{(L)}\left(y;\alpha, \beta \right)=\left(y-\alpha \right)/\beta \hfill & \hfill y>\alpha \hfill \end{array} $$
(9)
$$ \begin{array}{cc}\hfill {f}_Y^{(B)}\left(y;\alpha, \beta \right)=\left(y-\alpha \right)/\left(\alpha +\beta -y\right)\hfill & \hfill \alpha <y<\alpha +\beta \hfill \end{array} $$
(10)

Specifically, Eqs. (8) and (10) are known as the unbounded and bounded Johnson transformations, respectively, while Eq. (9) implies that the random variable Y is distributed according to a three-parameter log-normal law. The choice of the function \( {f}_Y^{\left(\cdot \right)}\left(y;\alpha, \beta \right) \) to be adopted depends on the sample values of the skewness coefficient, \( g_{1,Y} \), and the kurtosis coefficient, \( g_{2,Y} \). In fact, having defined

$$ {R}_1=\sqrt[3]{1+{g}_{1,Y}^2/2+\sqrt{g_{1,Y}^2\left(1+{g}_{1,Y}^2/4\right)}} $$
(11)
$$ {R}_2=\sqrt[3]{1+{g}_{1,Y}^2/2-\sqrt{g_{1,Y}^2\left(1+{g}_{1,Y}^2/4\right)}} $$
(12)
$$ {G}_2=6\left({R}_1^2+{R}_2^2\right)+\left({g}_{1,Y}^2-8\right)\left({R}_1+{R}_2\right)-2{g}_{1,Y}^2+7 $$
(13)

Equation (8) has to be used if \( g_{2,Y} > G_2 \), Eq. (10) if \( g_{2,Y} < G_2 \), while Eq. (9) concerns only the special case \( g_{2,Y} = G_2 \).
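The following short sketch encodes this selection rule (Eqs. (11)-(13)); in practice, sample estimates rarely hit the boundary case exactly, so a tolerance may be preferable to strict equality.

```python
import numpy as np

def johnson_family(g1, g2):
    """Select the Johnson family from sample skewness g1 and kurtosis g2,
    following Eqs. (11)-(13)."""
    s = g1**2
    r1 = np.cbrt(1.0 + s / 2.0 + np.sqrt(s * (1.0 + s / 4.0)))   # Eq. (11)
    r2 = np.cbrt(1.0 + s / 2.0 - np.sqrt(s * (1.0 + s / 4.0)))   # Eq. (12)
    G2 = 6.0 * (r1**2 + r2**2) + (s - 8.0) * (r1 + r2) - 2.0 * s + 7.0  # Eq. (13)
    if g2 > G2:
        return "unbounded"     # Eq. (8)
    if g2 < G2:
        return "bounded"       # Eq. (10)
    return "log-normal"        # boundary case, Eq. (9)
```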

The techniques used for the estimation of the transformation parameters are based on the method of moments. In the proposed model, only the parameters η and θ have to be estimated effectively, since α and β are linked to the former through analytical expressions, given that \( \mu_Y = 0 \) and \( {\sigma}_Y^2=1 \).

If the unbounded Johnson transformation is to be adopted, the parameters η and θ are estimated by numerically solving the following equation in the variable \( \omega = \exp\left(1/\theta^2\right) \) (Tuenter 2001):

$$ \left[\omega -{\omega}_{*}\right]{\left[2\omega +{\omega}_{*}+3\right]}^{\;2}-4{g}_{1,Y}^2=0 $$
(14)

where

$$ {\omega}_{*}=2\left[\sqrt{1+\frac{1}{2}\left({\omega}^2-\frac{g_{2,Y}+3}{\omega^2+2\omega +3}\right)}-1\right] $$
(15)
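A minimal numerical sketch of this estimation step, solving Eqs. (14)-(15) exactly as written above via a scalar root search, is given below; the starting values are hypothetical, and θ then follows from \( \omega = \exp\left(1/\theta^2\right) \).

```python
import numpy as np
from scipy.optimize import root_scalar

def solve_su_omega(g1, g2):
    """Solve Eq. (14), with omega_* given by Eq. (15), for omega;
    then theta = 1/sqrt(ln omega), since omega = exp(1/theta^2)."""
    def omega_star(w):                                           # Eq. (15)
        return 2.0 * (np.sqrt(1.0 + 0.5 * (w**2 - (g2 + 3.0)
                      / (w**2 + 2.0 * w + 3.0))) - 1.0)
    def residual(w):                                             # Eq. (14)
        ws = omega_star(w)
        return (w - ws) * (2.0 * w + ws + 3.0) ** 2 - 4.0 * g1**2
    omega = root_scalar(residual, x0=1.05, x1=1.5, method="secant").root
    theta = 1.0 / np.sqrt(np.log(omega))   # meaningful only for omega > 1
    return omega, theta
```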

Otherwise, if the bounded Johnson transformation must be applied, parameter estimation is less straightforward. In this case, the following non-linear system must be solved numerically for η and θ:

$$ \left\{\begin{array}{l}{\gamma}_{1,Y}\left(\eta, \theta \right)={g}_{1,Y}\hfill \\ {}{\gamma}_{2,Y}\left(\eta, \theta \right)={g}_{2,Y}\hfill \end{array}\right. $$
(16)

in which the values of the functions \( \gamma_{1,Y}(\eta, \theta) \) and \( \gamma_{2,Y}(\eta, \theta) \) must themselves be evaluated numerically.
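One possible numerical scheme, assumed here rather than taken from the paper, evaluates the skewness and kurtosis functions by Gauss-Hermite quadrature (they do not depend on α and β, which only shift and scale Y) and then solves the system with a root finder:

```python
import numpy as np
from scipy.optimize import fsolve

nodes, wq = np.polynomial.hermite.hermgauss(80)
zq = np.sqrt(2.0) * nodes
wq = wq / np.sqrt(np.pi)               # E[f(Z)] = sum(wq * f(zq)) for Z ~ N(0,1)

def sb_moments(eta, theta):
    """Skewness and kurtosis of Y under Eq. (10): inverting Z = eta +
    theta*ln[(y-alpha)/(alpha+beta-y)] gives y = alpha + beta*s with
    s = 1/(1 + exp(-(z-eta)/theta)), so the standardized moments
    depend on (eta, theta) only."""
    s = 1.0 / (1.0 + np.exp(-(zq - eta) / theta))
    c = s - np.sum(wq * s)
    m2 = np.sum(wq * c**2)
    return np.sum(wq * c**3) / m2**1.5, np.sum(wq * c**4) / m2**2

def solve_sb(g1, g2, guess=(0.5, 1.0)):
    """Solve the system (16) for (eta, theta); alpha and beta then follow
    from the zero-mean, unit-variance constraints on Y."""
    eta, theta = fsolve(lambda p: np.subtract(sb_moments(*p), (g1, g2)), guess)
    s = 1.0 / (1.0 + np.exp(-(zq - eta) / theta))
    m1 = np.sum(wq * s)
    beta = 1.0 / np.sqrt(np.sum(wq * (s - m1) ** 2))
    return eta, theta, -beta * m1, beta    # eta, theta, alpha, beta
```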

2.3 Correlation structure

The sample data series \( z(i_k) \) of the random variable Z, obtained by applying the Johnson transformations to the sample \( y(i_k) \), usually shows a correlation structure characterized by marked persistence, with autocorrelation coefficients decreasing slowly as the lag grows. Assuming that the zero-mean process z(i) can be described by a FARIMA (p,d,q) model, the following relationship holds:

$$ {\varPhi}_p(B){\left(1-B\right)}^dz(i)={\varPsi}_q(B)\varepsilon (i) $$
(17)

where B is the backward operator, \( \Phi_p(B) \) is the p-order polynomial of the autoregressive component, \( \Psi_q(B) \) is the q-order polynomial of the moving average component, ε(i) is a sequence of i.i.d. random variables with zero mean, and d is the fractional order of differentiation. In other terms, the FARIMA (p,d,q) model can be considered as the cascade of a fractional filter of order d and an ARMA (p,q) process

$$ {\left(1-B\right)}^dz(i)=u(i) $$
(18)
$$ {\varPhi}_p(B)u(i)={\varPsi}_q(B)\varepsilon (i) $$
(19)

Equation (19) shows that the intermediate process u(i) is an ARMA (p,q) process, which, for \( \psi_0 = 1 \), is

$$ u(i)={\displaystyle \sum_{k_p=1}^p{\varphi}_{k_p}}u\left(i-{k}_p\right)+{\displaystyle \sum_{k_q=0}^q{\psi}_{k_q}}\varepsilon \left(i-{k}_q\right) $$
(20)

By employing the series expansion of \( (1-B)^d \), Eq. (18) can be written as

$$ u(i)=\varGamma \left(d+1\right){\displaystyle \sum_{s=0}^{\infty}\frac{{\left(-1\right)}^s}{\varGamma \left(d-s+1\right)s!}}z\left(i-s\right) $$
(21)
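As an illustrative sketch, a truncated version of this expansion can be applied as a convolution; the weights are built with the recursion \( \pi_s = \pi_{s-1}(s-1-d)/s \), which is equivalent to the Gamma-function form of Eq. (21) but avoids evaluating Gamma at awkward arguments. The function name and the truncation length are hypothetical.

```python
import numpy as np

def frac_diff(z, d, n_terms=1000):
    """Apply the truncated fractional filter (1 - B)^d of Eq. (21)."""
    weights = np.empty(n_terms)
    weights[0] = 1.0
    for s in range(1, n_terms):
        weights[s] = weights[s - 1] * (s - 1 - d) / s   # binomial weights of (1-B)^d
    # u(i) = sum_s weights[s] * z(i - s), with out-of-range terms treated as zero
    return np.convolve(z, weights, mode="full")[: len(z)]

# u = frac_diff(z, d=0.3) yields the intermediate ARMA(p,q) series of Eq. (19)
```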

It is well known that the maximum likelihood estimate of the variance-covariance matrix of a multivariate normal distribution is the sample variance-covariance matrix (Anderson and Olkin 1985). In a FARIMA model, the correlation structure depends on the model parameters; thus, a weighted least squares method is used to find the best fit of the model to the maximum likelihood estimate of the variance-covariance matrix.

The estimation of the parameters d, \( {\varphi}_{k_p} \), and \( {\psi}_{k_q} \) can be obtained by a trial-and-error procedure. Through Eq. (21), for each assigned value of the parameter d, it is possible to transform the sample \( z(i_k) \) into a sample \( u(i_k) \), which can be considered a realization of the ARMA (p,q) process. Thus, once the sample autocorrelation values \( r_{U,\lambda}(d) \) of \( u(i_k) \) for λ = 1, ..., p + q are evaluated, the estimated values \( {\widehat{\varphi}}_1(d),\ldots,{\widehat{\varphi}}_p(d) \), \( {\widehat{\psi}}_1(d),\dots, {\widehat{\psi}}_q(d) \) can be obtained by imposing

$$ \begin{array}{cc}\hfill {\rho}_{U,\lambda}\left[{\widehat{\varphi}}_1(d),\dots, {\widehat{\varphi}}_p(d),{\widehat{\psi}}_1(d),\dots, {\widehat{\psi}}_q(d)\right]={r}_{U,\lambda }(d)\hfill & \hfill \lambda =1,\dots, p+q\hfill \end{array} $$
(22)

where \( \rho_{U,\lambda} \) are the theoretical autocorrelation values of the ARMA (p,q) process.

The actual value \( \widehat{d} \) of d can be estimated by minimizing the weighted mean square deviation

$$ {S}_{\rho}^2={\displaystyle \sum_{l=1}^{N_r}{\omega}_l{\left\{{r}_{Z,l}-{\rho}_{Z,l}\left[{\widehat{\varphi}}_1(d),\dots, {\widehat{\varphi}}_p(d),{\widehat{\psi}}_1(d),\dots, {\widehat{\psi}}_q(d),d\right]\right\}}^2} $$
(23)

where \( r_{Z,l} \) are the sample correlogram values derived from the sample \( z(i_k) \); \( \rho_{Z,l} \) are the theoretical autocorrelation values of the FARIMA (p,d,q) model; \( N_r \) is the maximum lag to which the calculation of the deviation is extended; and \( \omega_l \), with l = 1, ..., \( N_r \) and \( {\displaystyle {\sum}_{l=1}^{N_r}{\omega}_l}=1 \), are weighting coefficients that distribute the weight of the fit across the lags l.
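A simplified sketch of this minimization is given below for the pure fractional case, FARIMA (0,d,0), whose theoretical autocorrelation has the known closed-form recursion \( \rho_l = \rho_{l-1}(l-1+d)/(l-d) \) with \( \rho_0 = 1 \); for p, q > 0, \( \rho_{Z,l} \) would instead be computed from the full model. The weights anticipate those adopted in the application (Sect. 3.2), and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def farima_0d0_acf(d, n_lags):
    """Theoretical ACF of FARIMA(0,d,0): rho_l = rho_{l-1} (l-1+d)/(l-d)."""
    rho = np.empty(n_lags + 1)
    rho[0] = 1.0
    for l in range(1, n_lags + 1):
        rho[l] = rho[l - 1] * (l - 1 + d) / (l - d)
    return rho

def estimate_d(sample_acf, n_lags=50, c=0.5):
    """Minimize the weighted deviation of Eq. (23) over d; sample_acf must
    contain the sample correlogram up to at least lag n_lags."""
    l = np.arange(1, n_lags + 1)
    w = 1.0 - ((l - 1) / n_lags) ** c
    w = w / w.sum()                     # normalized weights omega_l
    def s_rho2(d):
        return np.sum(w * (sample_acf[1 : n_lags + 1]
                           - farima_0d0_acf(d, n_lags)[1:]) ** 2)
    return minimize_scalar(s_rho2, bounds=(0.01, 0.49), method="bounded").x
```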

The estimated orders \( \widehat{p} \) and \( \widehat{q} \) must be fixed while respecting the principle of parametric parsimony (Box and Jenkins 1976). Akaike (1974) suggested a mathematical formulation of this parsimony criterion, known as the Akaike information criterion (AIC), for selecting the model that best fits given data. In this work, the Akaike information criterion in its corrected form (AICc) has been used (Sugiura 1978; Burnham and Anderson 2002). The model showing the minimum value of the AICc can be considered the one that best balances fit to the observed data and parametric parsimony. Before applying the criterion, it was verified that the residuals were white noise through the Anderson-Darling test (Anderson and Darling 1952).
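For reference, the corrected criterion adds a small-sample penalty to the standard AIC; a direct transcription, with hypothetical argument names, is:

```python
def aicc(log_likelihood, k, n):
    """AICc = AIC + 2k(k+1)/(n-k-1), with k parameters and n observations
    (Sugiura 1978; Burnham and Anderson 2002)."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
```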

3 Application

3.1 Database

The proposed stochastic model has been applied to the maximum daily temperatures, \( T(i) = T_{\max}(i) \), of the Cosenza gauge and verified on four other southern Italian gauges, managed by the Centro Funzionale Multirischi of the Calabria region (Fig. 1). The gauges are located in an area characterized by high climatic variability owing to its geographic location and mountainous nature (Coscarelli and Caloiero 2012). Summers are typically dry, denoting a subtropical Mediterranean climate. The coastal areas present mild winters and hot summers with little precipitation. The Ionian side, influenced by air masses coming from Africa, records high temperatures and intense precipitation; the Tyrrhenian side, influenced by western air currents, shows milder temperatures and considerable orographic precipitation. The internal areas are characterized by colder winters, sometimes snowy, and cooler summers with some precipitation (Ferrari et al. 2013; Buttafuoco et al. 2015; Caloiero et al. 2015a).

Fig. 1

Location of the temperature gauges on a DEM

The main features of the selected gauges are presented in Table 1. The gauges are located at different altitudes, from a few meters to about 800 m above sea level (a.s.l.). Moreover, over an observation period spanning from 1951 to 2010, the temperature gauges have less than 9 % missing days. Previous regional studies on the temperature data series of the Calabria region (Caloiero et al. 2015b, c) did not show significant trends in maximum temperature data for the gauges used in this study. As a further check, the Mann-Kendall test (Mann 1945; Kendall 1962) has been applied to investigate the existence of monotonic trends in maximum daily temperature, as also suggested by Prass et al. (2012). The results confirmed that no statistically significant trends exist for the selected gauges.

Table 1 Gauges used for the application

3.2 Parameter estimation

The absence of trends in the data series considered in this study allows the use of Eqs. (2) and (3). Since the series of maximum daily temperature, \( t_{\max}(i_k) \), show some missing data, the coefficients of the truncated Fourier expansions of the mean, \( \mu_T(i) \), and variance, \( {\sigma}_T^2(i) \), functions have been estimated by minimizing Eqs. (4) and (5).

With reference to the Cosenza gauge, Table 2 shows the results of the tests used for the identification of the minimum number of harmonics of the Fourier expansion for which both the hypotheses \( {H}_{0,{\sigma}^2} \) and \( H_{0,\mu} \) cannot be rejected (significance level = 0.05). The tests employed data sampled every 8 days (δ = 8) and indicated that two harmonics are required for both the mean and the variance functions in order to remove the periodicity in the first- and second-order statistics of the observed series. The results of the same tests applied to the other gauges confirm that two harmonics are needed for all the stations to remove the periodicity in the mean function (Table 3). Moreover, unlike the Cosenza gauge, for three series the periodicity of the variance function can be removed with only one harmonic. Table 3 also reports, for each station, the estimated coefficients of the truncated Fourier expansions for the mean and the variance functions. Figure 2 shows, for the Cosenza gauge, the comparisons between sample and estimated values of the mean and the standard deviation functions, respectively; the estimated values have been obtained with zero, one, and two harmonics for each day of the year. The same comparisons are shown for the Catanzaro gauge in Fig. 3. Figure 4 shows the comparison among the ACFs of the observed and the deseasonalized values, for a lag range centered on 365 days.

Table 2 Identification of the number of harmonics for the Cosenza gauge
Table 3 Coefficients of the truncated Fourier expansions for the mean and the variance functions
Fig. 2

Cosenza gauge: a comparison between the sample \( m_T(i) \) and the estimated \( \mu_T(i) \) mean functions, averaged for each day of the year, and b comparison between the sample \( s_T(i) \) and the estimated \( \sigma_T(i) \) standard deviation functions, averaged for each day of the year

Fig. 3

Catanzaro gauge: a comparison between the sample \( m_T(i) \) and the estimated \( \mu_T(i) \) mean functions, averaged for each day of the year, and b comparison between the sample \( s_T(i) \) and the estimated \( \sigma_T(i) \) standard deviation functions, averaged for each day of the year

Fig. 4

Comparison among the autocorrelation values of the observed and the deseasonalized data (one and two harmonics) for a lag range centered on 365 days

The Gaussianization of the deseasonalized data series of Cosenza, \( y\left({i}_k\right)=\left[{t}_{\max}\left({i}_k\right)-{\widehat{\mu}}_T\left({i}_k\right)\right]/{\widehat{\sigma}}_T\left({i}_k\right) \), was performed through the unbounded version of the Johnson transformations. Operatively, the parameters η and θ of the Johnson transformation have been estimated by numerically solving Eq. (14), thus also allowing the estimation of the derived parameters α and β (Table 4). The application to the other gauges showed that only the data series of the Potenza gauge had to be transformed into a Gaussian process by means of the bounded function, thus requiring the numerical solution of the non-linear system (Eq. (16)). For all the gauges, Table 4 presents the sample values of the skewness and kurtosis coefficients of the \( y(i_k) \) series (\( g_{1,Y} \), \( g_{2,Y} \)) and the set of estimated Johnson transformation parameters \( \left(\widehat{\alpha},\widehat{\beta},\widehat{\eta},\widehat{\theta}\right) \).

Table 4 Parameter estimation of the Johnson transformation

The Gaussianity achieved by the series \( z(i_k) \) is evidenced by comparing the sample values with the theoretical values of the standard normal variable on a probability plot, as shown for the Cosenza and Villapiana Scalo gauges in Fig. 5.

Fig. 5

Comparison between observed values and theoretical cumulative distribution function of the Z variable for two gauges (Cosenza and Villapiana Scalo)

The identification of the FARIMA (p,d,q) process aimed at describing the correlation structure of the Gaussian series. Specifically, the procedure, based on the AICc index, identified the orders \( \widehat{p} \) and \( \widehat{q} \) that best fitted the observed data while preserving the parametric parsimony criterion. The results of this procedure for the Cosenza gauge are presented in Table 5. Namely, for increasing values of the orders p and q, the fractional differencing order, d, was identified by minimizing the weighted mean square function \( {S}_{\rho}^2 \) (Eq. (23)). In this equation, a maximum lag \( N_r = 50 \) has been fixed, and the weighting coefficients \( {\omega}_j={\omega}_j^{\ast }/{\displaystyle \sum {\omega}_j^{\ast }} \), j = 1, 2, ..., \( N_r \), with \( {\omega}_j^{\ast }=1-{\left[\left(j-1\right)/{N}_r\right]}^c \) and c = 1/2, have been adopted. The final results for all the gauges are summarized in Table 6. Overall, for three stations a FARIMA (1, \( \widehat{d} \), 0) model was identified, while for the other two stations a FARIMA (1, \( \widehat{d} \), 1) model was detected. The comparisons between sample and theoretical correlograms evidence the ability of the proposed FARIMA model to reproduce the long-term memory (Fig. 6).

Table 5 Identification of the FARIMA model for the Cosenza gauge
Table 6 Identification of the FARIMA models for all the gauges used in the application
Fig. 6

Comparison between sample and theoretical autocorrelation values for the Cosenza gauge

3.3 Analysis of maximum daily temperatures

The proposed model, through the application of a Monte Carlo simulation procedure (Appendix 2), can be useful for assessing various features of the temperature database. Specifically, the synthetic-world simulation covered \( 10^5 \) years, corresponding to a total generation of about \( L_s = 4 \times 10^7 \) values.
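The exact generation procedure is given in Appendix 2; the following is only a hedged sketch of the natural inverse chain implied by the model (white noise, ARMA filtering, inverse fractional filter, inverse Johnson SU transform, re-seasonalization), with all argument names hypothetical and mu, sigma standing for the fitted Fourier functions of Eqs. (2)-(3).

```python
import numpy as np
from scipy.signal import lfilter

def simulate_tmax(n_days, d, phi, psi, eta, theta, alpha, beta, mu, sigma,
                  n_terms=1000, seed=None):
    """Hedged sketch of one synthetic-world generation (cf. Appendix 2)."""
    eps = np.random.default_rng(seed).standard_normal(n_days + n_terms)
    # ARMA(p,q) filtering, Eq. (20): b = [1, psi_1..psi_q], a = [1, -phi_1..-phi_p]
    u = lfilter(np.r_[1.0, psi], np.r_[1.0, -np.asarray(phi)], eps)
    # inverse fractional filter (1 - B)^(-d): w_s = w_{s-1} (s - 1 + d) / s
    w = np.empty(n_terms)
    w[0] = 1.0
    for s in range(1, n_terms):
        w[s] = w[s - 1] * (s - 1 + d) / s
    z = np.convolve(u, w, mode="full")[n_terms : n_terms + n_days]  # drop warm-up
    y = alpha + beta * np.sinh((z - eta) / theta)  # inverse of Eqs. (7)-(8), SU case
    i = np.arange(n_days)
    return mu(i) + sigma(i) * y                    # restore seasonality, Eq. (1)
```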

In this study, the proposed model first permitted the estimation of the probability \( F_{W(\tau)}[w(\tau)] = P[W(\tau) \le w(\tau)] \) of the annual maximum length of sequences of days, W(τ), with maximum daily temperature above a threshold, τ. Figure 7 shows, on a Gumbel probability graph, the return periods \( T_{W(\tau)} \) of the random variable W(τ), defined as \( T_{W(\tau)} = 1/\left\{1 - F_{W(\tau)}[w(\tau)]\right\} \), corresponding to different threshold values for the Cosenza station (altitude 242 m a.s.l.). In particular, for increasing threshold values, high return periods can be reached even for short sequences of days. As an example, for the threshold τ = 40 °C, the values of W(τ) shift from 2 to 5 days as the return period ranges from 10 to 100 years. By contrast, for higher-elevation gauges such as Potenza (811 m a.s.l.), the same range of values is observed for a lower temperature threshold (about 35 °C). This behavior can clearly be connected to the influence of the gauge altitude on the variable W(τ).
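For concreteness, the empirical estimate of \( T_{W(\tau)} \) from a simulated series can be sketched as follows; this is an assumed implementation, not the authors' code, and it treats years as fixed-length blocks of days.

```python
import numpy as np

def return_periods_w(tmax_sim, threshold, days_per_year=365):
    """Empirical return periods of the annual maximum run length W(tau)
    of days with simulated T_max above `threshold`."""
    n_years = len(tmax_sim) // days_per_year
    annual_max_run = np.zeros(n_years, dtype=int)
    for yr in range(n_years):
        above = tmax_sim[yr * days_per_year : (yr + 1) * days_per_year] > threshold
        run = best = 0
        for flag in above:
            run = run + 1 if flag else 0
            best = max(best, run)
        annual_max_run[yr] = best
    w = np.arange(1, annual_max_run.max() + 1)
    F = np.array([(annual_max_run <= wi).mean() for wi in w])   # F_W(w)
    with np.errstate(divide="ignore"):
        T = 1.0 / (1.0 - F)                                     # T_W = 1/(1 - F_W)
    return w, T
```

The analogous statistic W(κ) discussed below is obtained in the same way, with the day-dependent threshold \( \mu_T(i) + \kappa\,\sigma_T(i) \) in place of the fixed τ.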

Fig. 7

Return periods \( T_{W(\tau)} \) of the variable W(τ) for different values of the threshold τ for the Cosenza and Potenza gauges (red linear tendency lines)

As a further application, the probability values \( {p}_{I_{W\left(\tau \right)}}\left({i}_{W\left(\tau \right)}\right)=P\left[{I}_{W\left(\tau \right)}={i}_{W\left(\tau \right)}\right] \) that such exceedances start on a specified day, \( i_{W(\tau)} \), were also evaluated. The probability values of the yearly temporal occurrence \( I_{W(\tau)} \) of W(τ) are shown in Fig. 8 for various threshold values, again for the Cosenza gauge. The maximum probability values occur in summer, around the 210th day of the year.

Fig. 8

Probability values \( p_{I_{W(\tau)}} \) of the temporal occurrence in the year \( I_{W(\tau)} \) for different temperature thresholds for the Cosenza gauge

Finally, the probability values \( F_{W(\kappa)}[w(\kappa)] = P[W(\kappa) \le w(\kappa)] \), associated with the annual maximum length of sequences of days, W(κ), with maximum daily temperature greater than the expected value \( \mu_T(i) \) plus κ times the standard deviation, have been estimated. The return periods \( T_{W(\kappa)} \) of the variable W(κ), for different values of the parameter κ, are presented in Fig. 9 for the Cosenza and Crotone gauges. For fixed values of the return period and of the parameter κ, the annual maxima W(κ) were greater for the Crotone gauge than for the Cosenza gauge; this behavior was more evident for the lower threshold values. For example, the W(κ) values obtained for κ = 0 and a return period \( T_{W(\kappa)} = 100 \) years were 53 days for Cosenza and 80 days for Crotone.

Fig. 9

Return periods \( T_{W(\kappa)} \) of the variable W(κ) for different values of κ for the Cosenza and Crotone gauges (red linear tendency lines)

4 Conclusion

In this paper, a stochastic model developed to simulate daily maximum temperature series coherently with observed time series is proposed. The model is based on three steps. The first step is data deseasonalization, obtained by means of a truncated Fourier series expansion. Subsequently, a normalization is performed through the Johnson transformation. Finally, a FARIMA model is applied for the analysis of the correlation structure of the normalized data, which is characterized by marked persistence. The procedure was first applied to the Cosenza gauge and then tested on a set of maximum temperature series recorded at four gauges located in southern Italy. The model satisfactorily reproduced the long-term memory of the temperature series while also respecting the parametric parsimony criterion.

Moreover, through the application of a Monte Carlo simulation procedure, the proposed model allowed the evaluation of various features of the temperature database. First, the empirical probability distribution of the annual maximum length of sequences of days with maximum daily temperature over fixed thresholds has been obtained. The results showed high return periods even for short sequences of days at increasing threshold values. Subsequently, the probability that such sequences start on a specified day has also been evaluated, showing that the highest occurrence probabilities fall in the summer period. Finally, the return periods associated with the annual maximum length of sequences of days characterized by maximum daily temperature greater than the expected value plus κ times the standard deviation have been obtained.

An important criterion for stochastic modeling is the reproducibility of the statistical characteristics of observed data (Lee 2015). Indeed, the proposed model allowed the prediction of the statistical properties of temperatures with few input data. Moreover, the stochastic model, being independent of station altitude and climatic zone, has the advantage of being applicable across a given area, to estimate occurrence probabilities and return periods associated with high-temperature events on any day of the year and/or at any gauge. This ability to extrapolate findings is particularly important when seeking to determine the risks of extremely rare events. For these reasons, the model is an attractive tool for management decision-making, and its basic structure can easily be applied to larger areas with spatially differentiated data.