12.1 Some Methods for Detecting Changes

The detection of changes in hydrological series plays an important role in the modern management and planning of water resources systems, since it enables a better understanding of the temporal behavior of the underlying hydrological phenomenon. The detection of trends, periodicities, change points or other deterministic components of the hydrological series weakens the assumption of stationarity of the random variable to which the series refers, which, in turn, affects the applicability of the standard technical procedures most commonly used in Statistical Hydrology.

There are several kinds of tests and other methods for change detection, each having its own advantages and limitations. These methods should not be applied without considering the specific characteristics of the hydrological variable under analysis or without taking into account their underlying theoretical assumptions. It should be noted that some techniques presented elsewhere in this book are commonly used for the detection of change, such as the Spearman nonparametric test presented in Sect. 7.4.4. Another such technique is simple linear regression: the methods introduced in Chap. 9, when applied to a hydrological random variable with time as the independent variable, describe the linear behavior of that variable over time and thereby detect linear trends.

In this section, two methods for detecting changes are presented: the Mann–Kendall test for monotonic trend and the Pettitt test for change points. These methods, together with the above-mentioned techniques presented elsewhere in this book, cover most of the common situations faced by researchers and practitioners in water resources engineering when dealing with the nonstationary behavior of hydrological variables.

12.1.1 Hypothesis Test for a Monotonic Trend

Originally proposed by Mann (1945) and further studied by Kendall (1975), the nonparametric Mann–Kendall test for trend constitutes a widely used approach for detecting monotonic trends in hydrologic time series. This test has the advantage of not assuming a particular form for the distribution of the observed data, while its performance is comparable to that of the hypothesis test of the linear regression slope parameter, presented in Sect. 9.2.4.

Consider the hydrologic series X t with \( t=1,2,\dots, N \). The Mann–Kendall test statistic is given by

$$ S={\displaystyle \sum_{i=1}^{N-1}{\displaystyle \sum_{j=i+1}^Nsgn\left({X}_j-{X}_i\right)}} $$
(12.1)

where the sign function \( \operatorname{sgn}(y)=1 \) if \( y>0 \), \( \operatorname{sgn}(y)=-1 \) if \( y<0 \), and \( \operatorname{sgn}(y)=0 \) if \( y=0 \). S has a null expected value \( E\left[S\right]=0 \) and its variance is given by

$$ Var\left[S\right]=\frac{1}{18}\left[N\left(N-1\right)\left(2N+5\right)-{\displaystyle \sum_{m=1}^M{t}_m\left({t}_m-1\right)\left(2{t}_m+5\right)}\right] $$
(12.2)

where M is the number of tied groups and t m is the size of the mth tied group. The standardized test statistic Z, which, under the null hypothesis, approximately follows a standard Normal distribution, is computed as

$$ Z=\left\{\begin{array}{cc}\hfill \frac{S-1}{\sqrt{Var\left[S\right]}}\hfill & \hfill, \kern0.5em S>0\hfill \\ {}\hfill 0\hfill & \hfill, \kern0.5em S=0\hfill \\ {}\hfill \frac{S+1}{\sqrt{Var\left[S\right]}}\hfill & \hfill, \kern0.5em S<0\hfill \end{array}\right. $$
(12.3)

The Mann–Kendall test has the null hypothesis H 0: {no trend in hydrologic series X t }. The region of rejection of the standardized test statistic, Z, is dependent on the specification of the significance level α, as well as on the formulation of the alternative hypothesis H 1. If H 1 is {increasing monotonic trend in X t }, it is a right-tailed hypothesis test, with H 0 being rejected if \( Z>{z}_{1-\alpha } \). If H 1 is {decreasing monotonic trend in X t }, it is a left-tailed hypothesis test, with H 0 being rejected if \( Z<{z}_{\alpha } \). Alternatively, if H 1 is {monotonic trend in X t } without specification of the sign of the trend, the hypothesis test is two-tailed and the decision is to reject H 0 if \( \left|Z\right|>{z}_{1-\alpha /2} \).
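A minimal sketch of this test in R follows, assuming the hydrologic series is stored in a numeric vector x; the function name mann_kendall is merely illustrative.

mann_kendall <- function(x, alpha = 0.05) {
  N <- length(x)
  # S statistic, Eq. (12.1): sum of signs over all pairs (i, j) with i < j
  S <- sum(sapply(1:(N - 1), function(i) sum(sign(x[(i + 1):N] - x[i]))))
  # Variance of S with tie correction, Eq. (12.2); t_m is the size of each tied group
  tm <- table(x)
  tm <- tm[tm > 1]
  varS <- (N * (N - 1) * (2 * N + 5) - sum(tm * (tm - 1) * (2 * tm + 5))) / 18
  # Standardized statistic Z, Eq. (12.3)
  Z <- if (S > 0) (S - 1) / sqrt(varS) else if (S < 0) (S + 1) / sqrt(varS) else 0
  list(S = S, Var_S = varS, Z = Z,
       reject_two_tailed = abs(Z) > qnorm(1 - alpha / 2))
}

Applied to the annual peak flows of Example 12.1, this sketch should reproduce the values S = 64 and Z ≈ 0.300 reported there.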

12.1.2 Hypothesis Test for a Change Point

The Pettitt test (Pettitt 1979) considers that a time series of a random variable X t with \( t=1,2,\dots, N \), has a change point at time step τ if the values of X t for \( t=1,2,\dots, \tau \) have the CDF F 1(x), and the values of X t for \( t=\tau +1,\tau +2,\dots, N \) have the CDF F 2(x) and \( {F}_1(x)\ne {F}_2(x) \). The application of the Pettitt test does not require the prior specification of the time step at which the change is supposed to have occurred.

The Pettitt test has the null hypothesis H 0: {no change point in hydrologic series X t }. The procedures to test the hypothesis are as follows. First, consider the matrix elements

$$ {D}_{i,j}=\mathrm{s}\mathrm{g}\mathrm{n}\left({X}_i-{X}_j\right) $$
(12.4)

where \( \mathrm{sgn}\left(\cdot \right) \) is the sign function. Summing the elements of specific submatrices of the matrix D results in the following statistic:

$$ {U}_{t,N}={\displaystyle \sum_{i=1}^t{\displaystyle \sum_{j=t+1}^N{D}_{i,j}}} $$
(12.5)

The U t,N statistic should be computed for values of t ranging from 1 to N, which can be done using the following iterative formula (Rybski and Neumann 2011):

$$ {U}_{t,N}={U}_{t-1,N}+{\displaystyle \sum_{j=1}^N\mathrm{s}\mathrm{g}\mathrm{n}\left({X}_t-{X}_j\right)} $$
(12.6)

In order to test H 0 against alternative hypothesis H 1: {change point in X t }, the test is two-tailed and the Pettitt test statistic is given by

$$ {K}_N=\underset{1\le t<N}{ \max}\left|{U}_{t,N}\right| $$
(12.7)

It is also possible to perform one-tailed tests of H 0 against the alternative hypotheses H 1: {upward change in X t } and H 1: {downward change in X t }, using, respectively,

$$ {K}_N^{-}=-\underset{1\le t<N}{ \min }{U}_{t,N} $$
(12.8)
$$ {K}_N^{+}=\underset{1\le t<N}{ \max }{U}_{t,N} $$
(12.9)

such that \( {K}_N= \max \left({K}_N^{-},{K}_N^{+}\right) \). Under H 0, \( E\left[{D}_{i,j}\right]=0 \) and the distribution of U t,N for each t is symmetrical about zero. In one-tailed tests, a high value of \( {K}_N^{+} \) is expected when the series exhibits a downward change. In such a situation, the inequality \( {F}_1(x)\le {F}_2(x) \) is verified for at least some values of x. Analogously, high values of \( {K}_N^{-} \) are expected if the series shows an upward change, i.e., \( {F}_1(x)\ge {F}_2(x) \).

P-values of the Pettitt test can be approximated by

$$ p\approx \exp \left(-\frac{6{K_N^{+}}^2}{N^3+{N}^2}\right) $$
(12.10)

for \( {K}_N^{+} \) or for \( {K}_N^{-} \), and

$$ p\approx 2 \exp \left(-\frac{6{K_N}^2}{N^3+{N}^2}\right) $$
(12.11)

for K N . These approximations hold for \( p\le 0.5 \).

The null hypothesis H 0 should be rejected if \( p<\alpha \), at the test significance level α. The supposed change point is located at the time step corresponding to the maximum (or minimum, depending on which test statistic is used) of U t,N .
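A minimal sketch of the two-tailed Pettitt test in R is given below, assuming the series is stored in a numeric vector x; the function name pettitt is merely illustrative.

pettitt <- function(x) {
  N <- length(x)
  # Row sums of sgn(X_t - X_j), used in the iterative formula of Eq. (12.6)
  r <- sapply(1:N, function(t) sum(sign(x[t] - x)))
  U <- cumsum(r)                          # U_{t,N} for t = 1, ..., N, Eq. (12.5)
  t_hat <- which.max(abs(U[1:(N - 1)]))   # time step of the supposed change
  KN <- abs(U[t_hat])                     # two-tailed statistic, Eq. (12.7)
  p <- 2 * exp(-6 * KN^2 / (N^3 + N^2))   # approximate p-value, Eq. (12.11)
  list(KN = KN, change_point = t_hat, p_value = p)
}

Applied to the July flows of Example 12.2, this sketch should reproduce K_N = 658 at t = 27.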

Example 12.1

Consider the series of annual peak flows of the Lehigh River at Stoddartsville (Table 7.1). (a) Test the hypothesis that this series exhibits a monotonic trend using the Mann–Kendall test. (b) Use the two-tailed Pettitt test to assess whether the series shows any evidence of a change point. Consider the significance level \( \alpha =0.05 \).

Solution

  1. (a)

    The series has length \( N=73 \). Equations (12.1), (12.2), and (12.3) provide, respectively, \( S=64 \), \( \mathrm{Var}\left[S\right]=44084 \), and \( Z=0.300 \). At the significance level \( \alpha =0.05 \), \( {z}_{1-\alpha /2}=1.96 \). Since \( \left|Z\right|<{z}_{1-\alpha /2} \), the null hypothesis (no trend) is not rejected.

  2. (b)

    The test is two-tailed, so the test statistic is given by Eq. (12.7). Hence \( {K}_N=285 \), which corresponds to the value of U t,N at \( t=60 \). To calculate the p-value, Eq. (12.11) is used, thus resulting in \( p=0.581 \). Since \( p>\alpha \), the null hypothesis (no change point) is not rejected.

Example 12.2

The Três Marias reservoir, in the upper São Francisco River catchment (drainage area of 50,600 km2), in southeastern Brazil, has a total storage volume of 21 km3 \( \left(21\times {10}^9\;{\mathrm{m}}^3\right) \) and started operation in 1961. Downstream of the dam, at the location of Pirapora, a study is carried out on whether the operation of the reservoir imposes a significant regulation effect on the streamflow. Consider the hydrological series of July mean monthly flows observed at the gauging station of Pirapora, from 1938 to 1992, displayed in Table 12.1. The São Francisco River catchment at Pirapora has a drainage area of 114,000 km2 and is located circa 90 km downstream (to the north) of the Três Marias dam. July is usually the driest month in this region of Brazil and the July monthly discharges are expected to be strongly affected by reservoir flow regulation. (a) Apply the nonparametric Mann–Whitney test (see Sect. 7.4.3) to assess whether the subsamples from 1938 to 1961 and from 1962 to 1992 are homogeneous at the 5 % significance level. (b) Apply the two-tailed Pettitt test at the 5 % significance level. Comment on the result of the Pettitt test, particularly in relation to the time position of the change point.

Table 12.1 July mean monthly flows of the São Francisco River at Pirapora, in Brazil, and auxiliary measures for implementing the Mann–Whitney and Pettitt change-point tests

Solution

  1. (a)

    Homogeneity hypothesis test. The fourth column of Table 12.1 shows the order ranks of the July mean monthly flows, giving \( {N}_1=24 \), \( {N}_2=31 \), and \( {R}_1=346 \). Equations (7.14) and (7.15) give \( {V}_1=674 \) and \( {V}_2=70 \); thus, \( V=70 \). Equations (7.16) to (7.18) give \( E\left[V\right]=372 \), \( \mathrm{V}\mathrm{a}\mathrm{r}\left[V\right]=3472 \) and \( T=-5.12 \). At the significance level \( \alpha =0.05 \), \( {z}_{1-\alpha /2}=1.96 \). Since \( \left|T\right|>{z}_{1-\alpha /2} \), the null hypothesis is rejected and it is concluded that the two subsamples are not homogeneous.

  2. (b)

    Pettitt change-point test. The fifth column of Table 12.1 shows the U t,N series (Eq. 12.5). From Eq. (12.7), the test statistic is \( {K}_N=658 \), which occurs at \( t=27 \), that is, in 1964. This is a two-tailed test, so the p-value is determined by Eq. (12.11) as \( p=4.38\times {10}^{-7} \). Since \( p<\alpha \), the null hypothesis is rejected, which suggests that the series exhibits a significant upward shift with a change point in 1964. Figure 12.1 shows the time series X t and the change point detected through the Pettitt test.

    Fig. 12.1
    figure 1

    July mean monthly flows of the São Francisco River at Pirapora: change point identified by the Pettitt test

Note that the Pettitt change point does not occur when the upstream reservoir began operation but 3 years after that, in 1964. In fact, the Pettitt change point signals a sudden change from the statistical viewpoint, and it is not guaranteed to occur in tandem with the real underlying cause of change. It is also noteworthy that, given the large storage capacity of the reservoir (21 billion m3), the time needed to fill the reservoir before regular operation begins should also be taken into account. However, this period cannot be determined with precision from the available data.

12.2 Kernel Occurrence Rate Estimation

Poisson processes are among the most important stochastic processes, as mentioned in Sect. 4.2. Under the stationarity assumption, the Poisson process is homogeneous and its intensity λ is constant. If the stationarity assumption is not valid, the Poisson process is nonhomogeneous and λ(t) is a non-constant function of time.

The nonparametric kernel estimator was developed by Rosenblatt (1956) and Parzen (1962) as a technique for estimating probability density functions of random variables. Diggle (1985) adapted the technique for smoothing Poisson process data over time.

12.2.1 Formulation

Consider t ranging from t 0 to t n , such that at any instant t either a “success” or a “failure” occurs, and suppose that there were M successes in all. If the times of occurrence of the M successes are denoted by T i , with \( i=1,\dots, M \), the Poisson intensity as a function of time, λ(t), can be estimated by

$$ \widehat{\uplambda}(t)={h}^{-1}{\displaystyle \sum_{i=1}^MK\left(\frac{t-{T}_i}{h}\right)} $$
(12.12)

where \( K\left(\cdot \right) \) is a kernel function and h is the bandwidth. The units of \( \widehat{\lambda}(t) \) are the inverse of the discretization units of t. There are many kernel functions available in the technical literature. One of the most widely used is the Gaussian kernel, expressed as

$$ K(y)=\frac{1}{\sqrt{2\pi }} \exp \left(\frac{-{y}^2}{2}\right) $$
(12.13)

12.2.2 Bandwidth Selection

The selection of the bandwidth h determines the bias and variance properties of the estimator: a bandwidth that is too small means that few data points effectively contribute to the estimate at each instant, which leads to reduced bias but high variance, whereas a bandwidth that is too large over-smooths the estimator, resulting in small variance but increased bias. In practical terms, the selection of the bandwidth can be seen as a compromise between those two cases. The technical literature lists many optimization procedures and empirical formulae for selecting the optimal bandwidth. One of the best-known procedures, implemented in many statistical packages, is Silverman’s rule of thumb (Silverman 1986, p. 48), defined by

$$ h=0.9\; \min \left\{s,\frac{\mathrm{IQR}}{1.34}\right\}{M}^{-\frac{1}{5}} $$
(12.14)

where s, IQR and M are, respectively, the standard deviation, the interquartile range, and the length of the sample T i . Equation (12.14) provides values of h that are considered suitable for a wide range of applications.
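As a brief illustration, Eq. (12.14) can be evaluated in R with a single line, assuming the occurrence times are stored in a numeric vector Ti (in days, as in Example 12.3):

h <- 0.9 * min(sd(Ti), IQR(Ti) / 1.34) * length(Ti)^(-1/5)   # Eq. (12.14)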

12.2.3 Pseudodata Generation

A direct application of Eq. (12.12) to the series T i may lead to a boundary bias near the observation limits t 0 and t n , consisting of an underestimation of λ(t) due to the nonexistence of data outside the interval [t 0 , t n ]. This boundary effect may be reduced by generating pseudodata, denoted as pT, outside the range of observed data. A straightforward method to generate pseudodata consists of “reflecting” the observed data near the boundaries: for the left boundary, \( p{T}_i={t}_0-\left({T}_i-{t}_0\right) \), covering a range of 3 times h before t 0 , and likewise for the right boundary, beyond t n . Pseudodata generation is equivalent to the extrapolation of the empirical distribution of events near the boundaries (Cowling and Hall 1996). For this reason, the estimation of λ(t) near the boundaries should be interpreted with caution. Considering \( {T}_i^{\dagger } \) as the original point data augmented by the pseudodata and \( {M}^{\dagger } \) as the length of \( {T}_i^{\dagger } \), Eq. (12.12) can be rewritten as

$$ \widehat{\uplambda}(t)={h}^{-1}{\displaystyle \sum_{i=1}^{M^{\dagger }}K\left(\frac{t-{T}_i^{\dagger }}{h}\right)} $$
(12.15)
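A minimal sketch of Eqs. (12.12), (12.13), and (12.15) in R follows, assuming the occurrence times Ti, the observation limits t0 and tn, and the bandwidth h are already defined; kernel_rate is a hypothetical helper name.

kernel_rate <- function(t_grid, Ti, t0, tn, h) {
  # Reflection pseudodata (Sect. 12.2.3): occurrences within 3h of each boundary
  left  <- t0 - (Ti[Ti <= t0 + 3 * h] - t0)
  right <- tn + (tn - Ti[Ti >= tn - 3 * h])
  Td <- c(left, Ti, right)                 # augmented sample of Eq. (12.15)
  # Gaussian kernel sum, Eq. (12.15); dnorm() is the kernel of Eq. (12.13)
  sapply(t_grid, function(t) sum(dnorm((t - Td) / h)) / h)
}

Dropping the two reflection lines (i.e., setting Td <- Ti) gives the estimator of Eq. (12.12) without pseudodata.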

Example 12.3

Consider the Poisson process characterized by the occurrence of over-threshold daily rainfalls at São Julião do Tojal, in Portugal. The available data series has a length of 39 water years; the water year in Portugal starts on October 1st. The selected threshold for the peaks-over-threshold sampling is 36 mm. Table 12.2 shows the dates of the occurrences as well as the peak rainfalls. The times of occurrence T i are relative to \( {t}_0=1 \), that is, the beginning of the time series under analysis. Using the kernel occurrence rate estimator, as formulated in Eq. (12.12), estimate the temporal variation of the Poisson intensity of over-threshold rainfall occurrences (a) with, and (b) without pseudodata generation, using the method described in Sect. 12.2.3.

Table 12.2 Peaks-over-threshold series and times of occurrences of daily rainfalls exceeding the threshold u = 36 mm at São Julião do Tojal, in Portugal

Solution

The estimator’s bandwidth, according to Silverman’s rule of thumb (Eq. 12.14), is \( h=1,566\;\mathrm{d} \). Pseudodata generation uses the method of reflection, which consists of reflecting about each boundary, t 0 and t n , the occurrences lying within 3h of that boundary. Table 12.3 shows the generated pseudodata points. The kernel occurrence rate estimate without pseudodata uses Eq. (12.12) directly, whereas the estimation with pseudodata uses Eq. (12.15), where \( {T}_i^{\dagger } \) is obtained by concatenating the T i values of Table 12.2 with those of Table 12.3.

Table 12.3 Pseudodata obtained with the reflection method

Figure 12.2 shows the kernel estimates obtained with and without pseudodata. To facilitate the interpretation of the results, \( \widehat{\uplambda}\kern-0.15em (t) \) was multiplied by 365.25 such that, for a given instant t, it indicates the estimated number of occurrences above the threshold per year. In Fig. 12.2, t takes 512 equidistant values between October 1st, 1955 (\( {t}_0=1 \)) and September 30th, 1994 \( \left({t}_n=14,245\right) \). The chart exemplifies the correction of the boundary bias via pseudodata generation. The results show that the occurrence rate of over-threshold rainfall events peaked in the mid-1960s, followed by a few decades with lower occurrence rates.

Fig. 12.2
figure 2

Kernel intensity estimator applied to the occurrence of daily rainfalls exceeding the threshold of 36 mm at São Julião do Tojal

12.2.4 Estimation Uncertainty: Bootstrap Confidence Band

Point estimates given by Eq. (12.15) may be difficult to interpret without some measure of uncertainty associated with those estimates. For the purpose of quantifying that uncertainty, a pointwise confidence band around \( \widehat{\uplambda}(t) \) can be constructed using a bootstrap simulation technique. The bootstrap was originally proposed by Efron (1979) as a nonparametric approach for estimating parameter confidence intervals. This technique was later generalized for a wide range of statistical applications, particularly for problems whose solutions by analytical methods were cumbersome. Readers interested in bootstrap techniques are referred to Davison and Hinkley (1997).

For the specific problem of estimating the sampling uncertainty associated with λ(t), the following procedure can be applied.

  1. 1.

    Generate a simulated sample \( {T}^{*} \) of length \( {M}^{\dagger } \), by random sampling with replacement from the augmented sample \( {T}_i^{\dagger } \).

  2. 2.

    Calculate \( {\widehat{\lambda}}^{*}(t) \), with Eq. (12.15), using the resampled data and the same bandwidth h.

  3. 3.

    Repeat the resampling-and-estimation procedure until a sufficiently large number of replicates of \( {\widehat{\lambda}}^{*}(t) \) is reached (for example, 2000 replicates).

  4. 4.

    For each time step t, the \( 100\left(1-\alpha \right)\% \) confidence interval is defined by the empirical quantiles with non-exceedance probabilities \( F=\alpha /2 \) and \( F=1-\alpha /2 \). For example, supposing 2000 replicates are used, for a confidence level of 90 %, the confidence interval is given by the 100th and 1900th order statistics of \( {\widehat{\lambda}}^{*}(t) \).

  5. 5.

    The confidence band is given by such confidence intervals for \( t\in \left[{t}_0,{t}_n\right] \).

The described methodology leads to the estimation of an empirical percentile type confidence band. Other types of bootstrap confidence bands are proposed by Cowling et al. (1996). Figure 12.3 shows the limits of the 90 % bootstrap confidence band applied to the case study of Example 12.3, based on 2000 replicates of \( {\widehat{\lambda}}^{*}(t) \). The confidence bands allow the sampling uncertainty to be accounted for in interpreting the results. Figure 12.3 suggests that the increase in extreme rainfall occurrence in the mid-1960s may not be very significant since the confidence band is wide enough to accommodate a constant Poisson intensity during the observation period.
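A minimal sketch of the bootstrap procedure in R is given below, reusing the hypothetical kernel_rate() helper and the objects Ti, t0, tn and h from the previous sketches; here the original occurrence times are resampled and the pseudodata are regenerated at each replicate, which is one of several possible variants.

set.seed(1)                                        # reproducibility of the resampling
t_grid <- seq(t0, tn, length.out = 512)
B <- 2000                                          # number of replicates (step 3)
boot_rates <- replicate(B, {
  Ti_star <- sample(Ti, replace = TRUE)            # step 1: resample with replacement
  kernel_rate(t_grid, Ti_star, t0, tn, h)          # step 2: re-estimate lambda*(t)
})
# Step 4: pointwise empirical quantiles (90 % band uses probabilities 0.05 and 0.95)
band <- apply(boot_rates, 1, quantile, probs = c(0.05, 0.95))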

Fig. 12.3
figure 3

Kernel intensity estimator and respective 90 % confidence band applied to the occurrence of daily rainfalls exceeding the threshold of 36 mm at São Julião do Tojal

12.3 Introduction to Generalized Linear Models (GLM)

Regression models are widely applied in statistics to analyze dependence structures between a response, or dependent, variable and one or more independent variables. Simple and multiple linear regression models are presented in Chap. 9. Those models, as well as other regression models, can find applications in the analysis of nonstationary hydrologic series. In fact, the main question tackled by simple linear regression models is “how does variable Y behave as variable X changes?”. This is a central question in nonstationary analysis and modeling of hydrological time series as well.

Nonstationary hydrologic series analysis using regression models considers that there is a relationship between the variable under analysis (dependent variable) and independent variables, such as time and/or covariates (e.g., series of climate indices that evolve over time). In this kind of analysis, the regression residuals should be seen not as an error or imprecision, but as realizations of independent (though not necessarily identically distributed) random variables. This understanding is illustrated in Fig. 9.5, where the mean of variable Y varies linearly with the independent variable X and Y conditioned on X follows a Normal distribution.

The application of regression models to the analysis of nonstationary hydrologic series requires the same care and attention recalled in Chap. 9. For instance, the identification of a linear relationship with time requires special consideration if extrapolation into the far future is needed (see Sect. 9.2.6).

The normal linear regression model is one of the most widely used statistical analysis tools and it is applied in many fields of knowledge and science. Nevertheless, it has its limitations. In many hydrological applications, the variable under analysis is discrete and cannot be modeled by a normal linear model. There are also cases in which the relationship between dependent and independent variables is nonlinear (this subject was addressed in Sect. 9.2.5). Since the 1970s there has been a profusion of generalizations of the normal linear model, driven by two important developments that are addressed here: the systematic use of exponential family distributions and the numerical capability of optimizing likelihood functions, through iterative weighted least squares algorithms.

This section summarizes the mathematical formalism of the generalized linear model in view of the advantages of its use in the context of analysis of nonstationary hydrological variables. For a more formal and comprehensive treatment of generalized linear models, the reader is referred to McCullagh and Nelder (1989), Dobson (2001) and Davison (2003).

12.3.1 Density and Link Functions

In Chap. 9, the three basic aspects of the normal linear model are covered. These are:

  1. 1.

    The mean of the response variable \( {\mu}_{y\Big|x}=E\left[Y\right] \) is related to the independent variable x through a linear predictor \( \eta ={\beta}_0+{\beta}_1X \);

  2. 2.

    The density function of the response variable Y, for each value of X, is Normal with mean \( {\mu}_{y\Big|x} \) and a constant variance σ 2; and

  3. 3.

    The mean of Y, for each X, is equal to the linear predictor.

The generalized linear model (GLM) retains the first of these aspects and relaxes the second and third. Regarding the density of the response variable, in a GLM the response Y may be continuous or discrete, as long as its distribution belongs to the exponential family of distributions (not to be confused with the exponential distribution). A distribution belongs to the exponential family if its PDF, or PMF, can be written in the form

$$ {f}_Y(y)= \exp \left\{\frac{y\theta -b\left(\theta \right)}{a\left(\phi \right)}+c\left(y,\phi \right)\right\} $$
(12.16)

where θ is the canonical form of the location parameter, ϕ is the parameter of dispersion, and \( a\left(\cdot \right) \), \( b\left(\cdot \right) \) and \( c\left(\cdot, \cdot \right) \) are known real functions. This family comprises many well-known distributions such as the normal, gamma, binomial, and Poisson distributions.

In a GLM, the linear predictor and the mean of Y are related through a monotonic link function g

$$ \eta =g\left({\mu}_{y\Big|x}\right) $$
(12.17)

Therefore, the normal linear model is a special case of the generalized linear model in which Y has a normal density and g is the identity function. Other examples of well-known regression models which are special cases of the GLM are the logistic regression and the Poisson regression. Furthermore, the GLM setup extends the applicability of linear models to nonlinear cases, without the need for the transformations presented in Sect. 9.2.5.

From Eq. (12.16) it follows (McCullagh and Nelder 1989; Davison 2003) that the response variable has mean

$$ E\left[Y\right]={b}^{\hbox{'}}\left(\theta \right)=\mu $$
(12.18)

and variance

$$ \mathrm{V}\mathrm{a}\mathrm{r}\left[Y\right]=a\left(\varphi \right){b}^{\hbox{'}\hbox{'}}\left(\theta \right)=a\left(\phi \right)V\left(\mu \right) $$
(12.19)

where the prime denotes differentiation with respect to θ. Hence, the variance of Y is the product of two functions: b″(θ), called the variance function and usually represented by V(μ), which depends only on the location parameter θ (and, thus, on the mean μ) and describes how the variance changes with the mean; and a(φ), which depends only on the dispersion parameter φ. It should be noted that, in most practical cases, \( a\left(\varphi \right)=\varphi /\omega \), where ω is a known constant.

Example 12.4

Consider the discrete random variable \( Y\sim P\left(\lambda \right) \) which follows the Poisson distribution. Characterize the probability mass function of the Poisson distribution, in the form of Eq. (12.16) and find the parameters of location and dispersion, θ and φ, the mean and variance of Y and the variance function V(μ).

Solution

The Poisson PMF is given by \( {p}_Y(y)=\frac{\lambda^{y}}{y!} \exp \left(-\lambda \right) \). After algebraic manipulation it is possible to rewrite the Poisson PMF as \( {p}_Y(y)= \exp\;\left\{y\; \ln \left(\lambda \right)-\lambda - \ln \left(y!\right)\right\} \), which is the form of Eq. (12.16) with \( \theta = \ln \left(\lambda \right) \), \( b\left(\theta \right)={e}^{\theta } \), \( a\left(\varphi \right)=1 \) (with ϕ and ω both equal to 1), and \( c\left(y,\phi \right)=- \ln \left(y!\right) \). In conclusion, \( E\left[Y\right]={b}^{\hbox{'}}\left(\theta \right)={e}^{\theta }=\lambda \) and \( \mathrm{V}\mathrm{a}\mathrm{r}\left[Y\right]=a\left(\phi \right){b}^{\hbox{'}\hbox{'}}\left(\theta \right)={e}^{\theta }=\lambda \). The variance function is \( V\left(\mu \right)={b}^{\hbox{'}\hbox{'}}\left(\theta \right)={e}^{\theta }=\lambda \). Table 12.4 summarizes the previous results and extends them to the normal and binomial distributions.

Table 12.4 Characterization of some distributions of the exponential family

The link function of a GLM, as in Eq. (12.17), defines the relationship between the linear predictor η and the mean of Y, given by μ. The choice of this function should consider the specificities of the case under study. One of the possible choices is the canonical link function, which consists of setting the linear predictor equal to the location parameter, i.e., \( \eta =\theta ={b}^{\hbox{'}-1}\left(\mu \right) \). For instance, if Y is Poisson-distributed, it is known that the mean of Y is always positive. Hence, a logical choice for a link function is the log function, since \( g\left(\mu \right)= \ln \left(\mu \right) \) ensures that μ can only take positive values. Furthermore, because \( \theta = \ln \left(\mu \right) \), the log link is the canonical link. Choosing the canonical link function has some theoretical advantages, as laid out by McCullagh and Nelder (1989, p. 32). However, it is more important for the choice of the link function to be guided by substantive considerations specific to the problem at hand. Table 12.5 shows some of the more common link functions.

Table 12.5 Some link functions for generalized linear models

The PDF or PMF of Y is the random component of a GLM, while the link function is the structural or systematic component of a GLM. One can interpret the simple linear regression model as a particular GLM: Y has a Normal density and an identity link function. In Chap. 9, the Normal distribution is presented as the distribution of the “residuals.” In the GLM framework, the residuals are the realizations of the random variable around its mean. Notwithstanding the different terminologies applied to the simple linear model and the GLM with Normal density and identity link function, they are mathematically equivalent.

12.3.2 Estimation and Inference

GLM parameters are estimated using the maximum likelihood method. The likelihood function is given by

$$ \begin{array}{l}L\left({\beta}_j|{y}_i,{x}_i\right)={\displaystyle \prod_{i=1}^N{f}_Y\left({y}_i\Big|{\theta}_i,{\varphi}_i,{\omega}_i\right)}\\ {}={\displaystyle \prod_{i=1}^N \exp \left\{\frac{y_i{\theta}_i-b\left({\theta}_i\right)}{a\left({\varphi}_i\right)}+c\left({y}_i,{\varphi}_i\right)\right\}}\end{array} $$
(12.20)

One can further define the log-likelihood function \( \ell \left(\beta \right)= \ln \left(L\left(\beta \right)\right) \) as

$$ \ell \left({\beta}_j|{y}_i,{x}_i\right)={\displaystyle \sum_{i=1}^N\left\{\frac{y_i{\theta}_i-b\left({\theta}_i\right)}{a\left({\varphi}_i\right)}+c\left({y}_i,{\varphi}_i\right)\right\}} $$
(12.21)

where \( {\theta}_i={\eta}_i \), when the link function is canonical, and \( {\eta}_i={\beta}_0+{\beta}_1{x}_i \). The maximum likelihood estimators of the regression parameters \( {\widehat{\beta}}_i \) are obtained by solving the following system of equations.

$$ \frac{\partial }{\partial {\beta}_k}\ell \left({\beta}_j|{y}_i,{x}_i\right)=0 $$
(12.22)

The dispersion parameter φ may also be estimated by maximum likelihood; however, such an estimation procedure has some practical difficulties (Davison 2003, p. 483). Alternatively, one can expand Eq. (12.19) to

$$ \begin{array}{l}\mathrm{V}\mathrm{a}\mathrm{r}\left[{Y}_i\right]=a\left(\phi \right)b\hbox{'}\hbox{'}\left({\theta}_i\right)\\ {}=\frac{\phi }{\omega_i}V\left({\mu}_i\right)\end{array} $$
(12.23)

Then, one has

$$ \phi =E\left[\frac{\omega_i{\left({Y}_i-{\mu}_i\right)}^2}{V\left({\mu}_i\right)}\right],i=1,\dots, N $$
(12.24)

From the previous equation and according to some asymptotic results from mathematical statistics outlined by Turkman and Silva (2000), it is possible to define the following estimator for ϕ

$$ \widehat{\phi}=\frac{1}{N-q}{\displaystyle \sum_{i=1}^N\frac{\omega_i{\left({Y}_i-{\widehat{\mu}}_i\right)}^2}{V\left({\widehat{\mu}}_i\right)}} $$
(12.25)

where q is the number of regression parameters. If there is only one independent variable, \( q=2 \).

In general, the system of equations given by Eq. (12.22) cannot be solved analytically. To tackle that optimization, the iterative weighted least squares (IWLS) method, which is a variant of the Newton–Raphson algorithm, is recommended in the scientific literature on GLMs. To grasp the IWLS formalism, the works by Dobson (2001) and Davison (2003) are recommended. Furthermore, it should be noted that the IWLS method is implemented in the glm() function of the free statistical software package R (R Core Team 2013).

The regression parameter estimates \( {\widehat{\beta}}_k \) benefit from an asymptotic property of maximum likelihood estimators: they are asymptotically Normal with means β k and variances \( {\sigma}_{\beta_k}^2 \), given by the diagonal of the variance-covariance matrix, which was introduced in Sect. 6.7.2. If there is only one independent variable, one has

$$ \mathbf{I}=\left[\begin{array}{cc}\hfill \mathrm{V}\mathrm{a}\mathrm{r}\left({\widehat{\upbeta}}_0\right)\hfill & \hfill \mathrm{C}\mathrm{o}\mathrm{v}\left({\widehat{\upbeta}}_0,{\widehat{\upbeta}}_1\right)\hfill \\ {}\hfill \mathrm{C}\mathrm{o}\mathrm{v}\left({\widehat{\upbeta}}_1,{\widehat{\upbeta}}_0\right)\hfill & \hfill \mathrm{V}\mathrm{a}\mathrm{r}\left({\widehat{\upbeta}}_1\right)\hfill \end{array}\right] $$
(12.26)

which is given by the inverse of the negated Hessian matrix, H, i.e., the matrix of second-order partial derivatives of the log-likelihood function at the point of maximum likelihood

$$ \mathbf{I}={\left(-\mathbf{H}\right)}^{-1}={\left(-{\left.\left[\begin{array}{cc}\hfill \frac{\partial^2\ell \left({\beta}_0,{\beta}_1\right)}{\partial {\beta}_0^2}\hfill & \hfill \frac{\partial^2\ell \left({\beta}_0,{\beta}_1\right)}{\partial {\beta}_0\partial {\beta}_1}\hfill \\ {}\hfill \frac{\partial^2\ell \left({\beta}_0,{\beta}_1\right)}{\partial {\beta}_1\partial {\beta}_0}\hfill & \hfill \frac{\partial^2\ell \left({\beta}_0,{\beta}_1\right)}{\partial {\beta}_1^2}\hfill \end{array}\right]\right|}_{\beta_k={\widehat{\beta}}_k}\right)}^{-1} $$
(12.27)

The analytical calculation of the covariance matrix is very complex and is seldom carried out. However, such a calculation is made numerically within the IWLS method, and the matrix can be obtained in R from a fitted glm() object (e.g., through the vcov() function).

After the regression parameter estimates and corresponding covariance matrix are known, it is possible to construct confidence intervals for those parameters. Furthermore, with such intervals as a basis, one can construct a hypothesis test for the existence of a linear relationship between the link function and the linear predictor. This subject matter is covered in Sect. 12.3.3.1.
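As a minimal illustration of the estimation procedure, a Poisson GLM with the canonical log link can be fitted in R as sketched below; the object names y (counts) and x (covariate) are assumptions, not part of any particular dataset of this chapter.

fit <- glm(y ~ x, family = poisson(link = "log"))   # IWLS fit with the canonical link
coef(fit)      # maximum likelihood estimates of beta_0 and beta_1
vcov(fit)      # covariance matrix I of Eq. (12.26)
summary(fit)   # standard errors and Wald statistics for the coefficients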

Example 12.5

The North Atlantic Oscillation (NAO) is a prominent pattern in climate variability over the northern hemisphere and refers to the redistribution of atmospheric masses between the Arctic and subtropical Atlantic. There are many studies in the technical literature that establish links between the NAO phase and rainfall in Western Europe, particularly during winter in the northern hemisphere. There are also a number of studies on the influence of the NAO on rainfall and river flows in the western Iberian Peninsula during winter months: when the winter NAO is in the negative phase, rainfall and river flows tend to be above normal, and vice versa (e.g. Lorenzo-Lacruz et al. 2011). The NAO is usually characterized by a standardized climatic index which is computed and made available by several organizations, such as the University of East Anglia’s Climatic Research Unit (Jones et al. 1997, http://www.cru.uea.ac.uk/cru/data/nao/). Consider the case of Example 12.3 and the variable Y: “annual number of daily rainfalls exceeding the threshold \( u=36\;\mathrm{mm} \).” Y is a discrete random variable and is assumed to follow a Poisson distribution. Using a GLM, analyze the relationship between the dependent variable Y and the independent variable X : “winter NAO index” (the annual November-to-March mean). Both variables are presented in Table 12.6. Estimate GLM regression parameters and the respective covariance matrix, considering the canonical link function of the appropriate model.

Table 12.6 Annual number of threshold exceedances Y and corresponding winter NAO index

Solution

The R code for solving this problem is presented in Appendix 11. Y is a Poisson-distributed variable and, as such, the GLM will have a Poisson probability mass function. In this case, one can write (from Table 12.4) \( b\left(\theta \right)={e}^{\theta } \). Since the canonical link function is given by \( {b}^{\hbox{'}-1}\left(\mu \right) \), \( g\left(\mu \right)= \ln \left(\mu \right) \). This link function implies that the model is nonlinear in the original scale of the variables. The GLM can then be expressed as \( \ln \left({\mu}_{Y\Big|\mathrm{N}\mathrm{A}\mathrm{O}}\right)={\beta}_0+{\beta}_1\mathrm{N}\mathrm{A}\mathrm{O} \), or, identically, as \( {\mu}_{Y\Big|\mathrm{N}\mathrm{A}\mathrm{O}}= \exp \left({\beta}_0+{\beta}_1\mathrm{N}\mathrm{A}\mathrm{O}\right) \). By applying the function glm() of the statistical software R, one obtains the estimates \( {\widehat{\upbeta}}_0=0.770334 \) and \( {\widehat{\upbeta}}_1=-0.286583 \), thus suggesting that the mean of Y decreases as the winter NAO index increases, that is, there tend to be more over-threshold events during the negative phase of the winter NAO. This result is consistent with the previously mentioned studies. Figure 12.4 shows the resulting curve as well as the scatterplot of the observed data (Table 12.6).

Fig. 12.4
figure 4

Relation of winter NAO index and annual exceedance counts for Example 12.5

Finally one can obtain the covariance matrix with the command vcov(),

$$ \mathbf{I}=\left[\begin{array}{cc}\hfill 1.234574\times {10}^{-2}\hfill & \hfill 2.876015\times {10}^{-5}\hfill \\ {}\hfill 2.876015\times {10}^{-5}\hfill & \hfill 1.294348\times {10}^{-2}\hfill \end{array}\right] $$

12.3.3 Model Selection and Evaluation

Generalized linear models constitute a powerful and versatile modeling framework since: (1) they allow for several independent variables; (2) the dependent variable may be described by any probability density function that can be written as in Eq. (12.16); and (3) several link functions can be postulated. The versatility of this modeling methodology requires careful consideration in its application and in selecting the appropriate model for each analysis. Following are three basic and useful tools for statistical modeling using generalized linear models. For a deeper understanding of the selection and evaluation of generalized linear models, the textbook by Davison (2003) is recommended.

12.3.3.1 Hypothesis Tests for Regression Coefficients

As previously mentioned in Sect. 12.3.2, GLM parameters are estimated by maximum likelihood and, as such, their estimates are asymptotically Normal. This result can be used to design a hypothesis test for the regression coefficient β 1, which defines the slope of the linear predictor.

Given the asymptotic normality of the estimator of the regression coefficient β 1, one can define a test with the following hypotheses:

  • H 0: {there is no significant linear relationship (\( {\beta}_1=0 \))}.

  • H 1: {there is a significant linear relationship \( \left({\beta}_1\ne 0\right) \)}.

The null hypothesis H 0 can be rejected at a 100α % significance level if

$$ \left|{\widehat{\upbeta}}_1\right|>{z}_{1-\alpha /2}{\sigma}_{{\widehat{\upbeta}}_1} $$
(12.28)

where \( {\sigma}_{{\widehat{\upbeta}}_1} \) is the standard error of the estimate and \( {z}_{1-\alpha /2} \) is the \( 1-\alpha /2 \) quantile of the standard Normal distribution.
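As a brief sketch, the test of Eq. (12.28) can be carried out directly on a fitted glm object in R (the object name fit is assumed from the earlier sketch):

b1_hat <- coef(fit)[2]                        # estimate of the slope beta_1
se_b1  <- sqrt(diag(vcov(fit))[2])            # its standard error
abs(b1_hat) > qnorm(1 - 0.05 / 2) * se_b1     # TRUE: reject H0 (beta_1 = 0) at the 5 % level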

Example 12.6

Analyze the results of Example 12.5. Apply the hypothesis test for regression coefficients to verify whether or not the linear relationship between variables Y and X is significant.

Solution

From the results of Example 12.5, \( {\widehat{\upbeta}}_1=-0.28658 \) and \( {\sigma}_{{\widehat{\upbeta}}_1}^2=0.01294 \) (second value from the diagonal of the covariance matrix). The standard error of the estimate is \( {\sigma}_{{\widehat{\beta}}_1}=\sqrt{0.01294}=0.1138 \). The limit of the region of rejection is calculated as \( {z}_{1-\alpha /2}{\sigma}_{{\widehat{\upbeta}}_1}=1.96\times 0.1138=0.2230 \). Since \( \left|{\widehat{\upbeta}}_1\right|>{z}_{1-\alpha /2}{\sigma}_{{\widehat{\upbeta}}_1} \), the decision should be to reject the null hypothesis (\( {\beta}_1=0 \)). Therefore, it can be concluded that the linear relationship modeled in Example 12.5 is significant.

12.3.3.2 Likelihood Ratio Tests

Likelihood ratio tests (LRT) are based on asymptotic results from mathematical statistics (see Casella and Berger 2002; Davison 2003). An LRT is a method for comparing the performance of two competing models fitted to the same sample of the dependent variable: a null model M 0 and a more complex (with more parameters) alternative model M 1, with M 0 nested in M 1. By nested it is meant that M 1 reduces to M 0 if one or more of its parameters are fixed.

A k-parameter alternative model M 1, with parameter vector \( {\theta}_{M_1} \), can be compared with the null model, with \( k-q \) parameters, by calculating the test statistic D, also called the deviance statistic, which is given by

$$ D=2\left\{\ell \left({\widehat{\theta}}_{M_1}\right)-\ell \left({\widehat{\theta}}_{M_0}\right)\right\} $$
(12.29)

where \( \ell \left({\widehat{\theta}}_{M_1}\right) \) and \( \ell \left({\widehat{\theta}}_{M_0}\right) \) are the maxima of the log-likelihood functions of the alternative and null models, respectively. Under the null hypothesis, the deviance statistic asymptotically follows a chi-square distribution with q degrees of freedom:

$$ D\sim {\chi}_q^2 $$
(12.30)

where q is the difference in number of parameters between the two models.

The null hypothesis of an LRT \( {H}_0:\left\{M={M}_0\right\} \) can be rejected in favor of the alternative hypothesis \( {H}_1:\left\{M={M}_1\right\} \), at the 100α % significance level, if

$$ D>{\chi}_{1-\alpha, q}^2 $$
(12.31)

where \( {\chi}_{1-\alpha, q}^2 \) is the \( 1-\alpha \) quantile of the chi-square distribution with q degrees of freedom.

The LRT result should be interpreted from the viewpoint of parsimony: if the deviance statistic D takes a value outside the region of rejection (i.e., \( D<{\chi}_{1-\alpha, q}^2 \)), the underlying implication is that, notwithstanding the null model’s simplicity relative to the alternative model, it is sufficiently adequate to describe the variable under analysis. In other words, the “gain” in likelihood achieved by the greater complexity of M 1 relative to M 0 does not bring about a significant improvement in the capability of modeling the variable.
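A minimal sketch of an LRT in R, under the same assumptions as the earlier GLM sketch (counts in y, covariate in x), is the following.

M0 <- glm(y ~ 1, family = poisson)                            # stationary (null) model
M1 <- glm(y ~ x, family = poisson)                            # alternative model
D  <- 2 * (as.numeric(logLik(M1)) - as.numeric(logLik(M0)))   # deviance statistic, Eq. (12.29)
D > qchisq(0.95, df = 1)                                      # reject H0 at the 5 % level if TRUE
# equivalently: anova(M0, M1, test = "Chisq")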

Example 12.7

Consider Example 12.5. Use an LRT at the 5 % significance level to test whether the fitted model is significantly better than that in which variable Y is considered to be stationary (base model).

Solution

(The R code for solving this problem is presented in Appendix 11.) The base model M 0 is stationary: Y follows a Poisson distribution with a single parameter, estimated by the sample mean. The model fitted in Example 12.5 is the alternative model M 1. M 0 is nested in M 1 because, if the parameter β 1 is fixed and equal to zero, the two models are mathematically identical.

The log-likelihood function of the Poisson distribution is \( \ell \left(\lambda \right)=-\lambda N+ \ln \left(\lambda \right)\sum_{i=1}^N{y}_i- \ln \left(\prod_{i=1}^N{y}_i!\right) \). The point of maximum likelihood is defined by \( \lambda =\widehat{\lambda}=\overline{y}=2.076923 \). Then, the maximum of the log-likelihood of model M 0 is \( \ell \left({\widehat{\theta}}_{M_0}\right)=-\overline{y}N+ \ln \left(\overline{y}\right)\sum_{i=1}^N{y}_i- \ln \left(\prod_{i=1}^N{y}_i!\right)=-66.23 \).

Using the logLik() function in R, one gets \( \ell \left({\widehat{\theta}}_{M_1}\right)=-62.98 \). From Eq. (12.29), the result \( D=6.49 \) is attained. M 1 has one parameter more than M 0, so \( D\sim {\chi}_1^2 \). From Appendix 3, \( {\chi}_{0.95,1}^2=3.84 \). Since \( D>{\chi}_{0.95,1}^2 \), the decision is to reject the null hypothesis, that is, the performance of the more complex alternative model M 1 improves significantly on that of model M 0. Hence, it can be concluded that the winter NAO index has a significant influence on variable Y.

12.3.3.3 Akaike Information Criterion (AIC)

The Akaike information criterion (AIC) was introduced by Akaike (1974). It is a widely used method for evaluating statistical models. Founded on information theory, this method does not involve hypothesis tests, since there are no null and alternative hypotheses. Rather, it is simply a measure of the relative performance of a set of candidate models, based on the parsimony viewpoint.

In more general terms than those applied in Sect. 8.3.2, for a given statistical model, the AIC score is given by

$$ \mathrm{A}\mathrm{I}\mathrm{C}=2k-2\ell \left(\widehat{\uptheta}\right) $$
(12.32)

where k is the number of parameters and \( \ell \left(\widehat{\uptheta}\right) \) is the maximum of the log-likelihood of the model. The AIC score may be computed for a series of candidate models for the same random sample. According to this criterion, the best model is the one that minimizes the AIC score. AIC rewards the goodness-of-fit of the models but penalizes the increase in complexity (number of parameters).
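As a brief sketch, the AIC scores of the two competing models of the previous LRT sketch can be compared in R with

AIC(M0, M1)   # Eq. (12.32); the model with the smaller score is preferred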

Example 12.8

Apply AIC to corroborate the results of Example 12.7.

Solution

From Example 12.7: model M 0 has one parameter and \( \ell \left({\widehat{\theta}}_{M_0}\right)=-66.23 \); and model M 1 has 2 parameters and \( \ell \left({\widehat{\theta}}_{M_1}\right)=-62.98 \). Equation (12.32) yields \( {\mathrm{AIC}}_{M_0}=134.45 \) and \( {\mathrm{AIC}}_{M_1}=129.96 \). Since \( {\mathrm{AIC}}_{M_1}<{\mathrm{AIC}}_{M_0} \), M 1 is the better model, which corroborates the results of Example 12.7.

12.4 Nonstationary Extreme Value Distribution Models

12.4.1 Theoretical Justification

Extreme-value theory (EVT), which was introduced in Sect. 5.7, provides three limiting probability distributions for maximum (or minimum) extreme values, namely the Gumbel, Fréchet, and Weibull distributions. These distributions can be integrated into a single distribution—the generalized extreme value (GEV) distribution (Eq. 5.70). As mentioned in Sect. 5.7.2.3, the GEV distribution has many applications in Statistical Hydrology, such as the statistical modeling of floods or extreme rainfalls, even though the theoretical assumptions that support EVT do not always hold for hydrological variables. In practice, although the theoretical basis of the extremal asymptotic distributions was specifically developed to analyze extreme value data, frequency analysis using such distributions is carried out in the same way as with non-extremal distributions.

As observed by Katz (2013), when confronted with trends in extreme value data, hydrologists tend to abandon analysis frameworks based on EVT in favor of nonparametric techniques such as the Mann–Kendall test, which is a powerful tool with no distributional restrictions. Nevertheless, it was not developed specifically for extreme values. Another common approach is to conduct inference about trends in extreme value data using a simple linear regression model, which has as one of its central premises that the data are normally distributed. Clarke (2002) points out the inherent methodological inconsistency underlying the application of these approaches to extreme value data: it seems that practitioners of hydrology accept one theory (even if approximate) under stationarity but move away from it under nonstationarity.

In his textbook on EVT, Coles (2001) introduced (based on previous developments by Davison and Smith 1990) nonstationary GEV distribution models, including the particular case of the Gumbel distribution. Coles (2001) argues that it is not possible to deduce a general asymptotic theory of extreme values under nonstationarity, except in a few very specialized forms which are too restrictive to describe nonstationary behavior in real-world applications. However, it is possible to take a pragmatic approach by using existing limiting models for extremes with enhanced estimation procedures, namely regression techniques, which allow modeling the parameters of the GEV or GPA distributions as functions of time. The idea is similar to that of generalized linear models: the nonstationary GEV model uses link functions applied to the location, scale, and shape parameters. These link functions are related to linear (or, e.g., polynomial) predictors. Under this approach, at any given time step, the variable is still described by an extreme value distribution, but its parameters are allowed to change over time. Therefore, the statistical model can still be interpreted, although in a somewhat contrived manner, within the scope of extreme value theory.

In this subsection, nonstationary models based on limiting distributions for extreme values are presented, namely the GEV and Gumbel (as a special case) distributions, and the generalized Pareto distribution, whose applicability falls in the domain of peaks-over-threshold analysis. It should be noted that the GEV and Gumbel distributions considered here are for maxima. The methodologies described are equally applicable to minima, provided that the negated samples are used.

The presentation of the models is complemented with application examples using the ismev package (Heffernan et al. 2013) in R. That package adopts the parametrization of the GEV and GPA distributions used by Coles (2001), which differs from the one adopted in this textbook: the shape parameter of the distributions is, in Coles’ parametrization, the negative of the shape parameter used in this textbook.

12.4.2 Nonstationary Model Based on the GEV Distribution

The nonstationary GEV model for frequency analysis consists of fitting that distribution to an observed sample and estimating one or more of its parameters as a function of time or of a covariate. A covariate is a variable that is assumed to influence, over time, the hydrological variable under study (e.g., annual maximum flows). Examples of covariates are climate indices (see the North Atlantic Oscillation in Example 12.5) and indicators of anthropogenic influence on the catchments.

Consider a series of annual maximum flows X t that show some signs of changing behavior with time. In order to describe the changing behavior of this extreme variable, it is possible to contemplate the following nonstationary flood frequency model based on the GEV distribution with time-varying parameters β, α, and κ:

$$ {X}_t\sim \mathrm{G}\mathrm{E}\mathrm{V}\left(\beta (t),\alpha (t),\kappa (t)\right) $$
(12.33)

where functions β(t), α(t), and κ(t) define the dependence structure between the model parameters and time.

As shown by Eq. (5.72), the mean of a GEV-distributed variable has a linear dependence on the location parameter β. Then, in order to model a linear temporal trend of a GEV-distributed variable, one can use a nonstationary GEV model with location parameter

$$ \beta (t)={\beta}_0+{\beta}_1t $$
(12.34)

where β 1 determines the rate of change (slope) of the variable. Other, more elaborate parametric dependence structures can be used for the GEV location parameter, such as a second-degree polynomial, that is,

$$ \beta (t)={\beta}_0+{\beta}_1t+{\beta}_2{t}^2 $$
(12.35)

or a change point at time t 0

$$ \beta (t)=\left\{\begin{array}{c}\hfill {\beta}_1,\mathrm{f}\mathrm{o}\mathrm{r}\kern0.5em t<{t}_0\hfill \\ {}\hfill {\beta}_2,\mathrm{f}\mathrm{o}\mathrm{r}\kern0.5em t>{t}_0\hfill \end{array}\right. $$
(12.36)

Nonstationarities may also be introduced in the scale parameter α. This is particularly useful when analyzing changes in variance. A convenient parametrization for a time-changing scale parameter uses the exponential function so as to guarantee that α(t) can only take positive values, that is,

$$ \alpha (t)= \exp \left({\alpha}_0+{\alpha}_1t\right) $$
(12.37)

This link is log-linear, since it is equivalent to assuming a linear relationship for the logarithm of α(t).

For samples of only a few dozen values, as is generally the case in Statistical Hydrology, it is difficult to properly estimate the GEV shape parameter κ even under the stationarity assumption. For that reason, the shape parameter is usually kept fixed in nonstationary models. Furthermore, considering a trend in the shape parameter frequently leads to numerical convergence issues when estimating the model parameters. Hence, in practice, one should consider \( \kappa (t)=\kappa \).

The parameters of the nonstationary GEV model are estimated by maximum likelihood, thus allowing more flexibility for changes in the model structure. The nonstationary GEV model with parameters varying according to link functions of the type of Eqs. (12.34) to (12.37) has the likelihood function

$$ L\left(\theta \right)={\displaystyle \prod_{t=1}^N{f}_X\left[{x}_t\Big|\beta (t),\alpha (t),\kappa (t)\right]} $$
(12.38)

where f X (x t ) is the probability density function of the GEV, given by Eq. (5.71). Then, the log-likelihood function of the model is obtained as

$$ \ell \left(\theta \right)=-{\displaystyle \sum_{t=1}^N\left\{ \ln \left(\alpha (t)\right)+\left(1-\frac{1}{\kappa (t)}\right) \ln \left[1-\kappa (t)\left(\frac{x_t-\beta (t)}{\alpha (t)}\right)\right]+{\left[1-\kappa (t)\left(\frac{x_t-\beta (t)}{\alpha (t)}\right)\right]}^{\frac{1}{\kappa (t)}}\right\}} $$
(12.39)

subject to

$$ 1-\kappa (t)\;\left(\frac{x_t-\beta (t)}{\alpha (t)}\right)>0 $$
(12.40)

for \( t=1,\dots, N \).

Parameter estimates are obtained by maximizing Eq. (12.39), a numerical optimization problem that is usually solved with iterative algorithms. The approach is therefore somewhat similar to that of the GLM, but specifically developed for extreme value data. The function gev.fit() of the package ismev, in R, is one of the freely available tools for building and fitting this kind of model.

As in the case of the GLM, parameter estimates of the nonstationary GEV model benefit from the asymptotic properties of maximum likelihood estimators, which enable the sampling distributions of the parameter estimates to be approximated by the Normal distribution, with mean given by the maximum likelihood estimate and variance given by the diagonal of the covariance matrix I. Taking as an example the nonstationary GEV model with a linear trend in the location parameter (Eq. 12.34), for which the parameter vector is \( \theta ={\left({\beta}_0,{\beta}_1,\alpha, \kappa \right)}^{\mathrm{T}} \), the covariance matrix I is

$$ \mathbf{I}=\left[\begin{array}{cccc}\hfill Var\left({\widehat{\beta}}_0\right)\hfill & \hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 0},{\widehat{\beta}}_{\kern-0.15em 1}\right)\hfill & \hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 0},\widehat{\alpha}\right)\hfill & \hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 0},\widehat{\kappa}\right)\hfill \\ {}\hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 1},{\widehat{\beta}}_{\kern-0.15em 0}\right)\hfill & \hfill Var\left({\widehat{\beta}}_{\kern-0.15em 1}\right)\hfill & \hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 1},\widehat{\alpha}\right)\hfill & \hfill Cov\left({\widehat{\beta}}_{\kern-0.15em 1},\widehat{\kappa}\right)\hfill \\ {}\hfill Cov\left(\widehat{\alpha},{\widehat{\beta}}_{\kern-0.15em 0}\right)\hfill & \hfill Cov\left(\widehat{\alpha},{\widehat{\beta}}_{\kern-0.15em 1}\right)\hfill & \hfill Var\left(\widehat{\alpha}\right)\hfill & \hfill Cov\left(\widehat{\alpha},\widehat{\kappa}\right)\hfill \\ {}\hfill Cov\left(\widehat{\kappa},{\widehat{\beta}}_{\kern-0.15em 0}\right)\hfill & \hfill Cov\left(\widehat{\kappa},{\widehat{\beta}}_{\kern-0.15em 1}\right)\hfill & \hfill Cov\left(\widehat{\kappa},\widehat{\alpha}\right)\hfill & \hfill Var\left(\widehat{\kappa}\right)\hfill \end{array}\right] $$
(12.41)

and is given by the inverse of the negated Hessian matrix of the log-likelihood function at the point of maximum likelihood. The Hessian matrix, or matrix of second-order derivatives, is usually obtained by numerical differentiation of the log-likelihood function.

Example 12.9

Consider the series of annual maximum daily rainfalls at the Pavia rain gauging station, in Portugal (organized by hydrologic year, which, in Portugal, starts on October 1st), presented in Table 12.7 and in Fig. 12.5. The records range from 1912/13 to 2009/10 with no gaps, adding up to 98 hydrologic years. Fit the following GEV models to the data: (a) a stationary model, GEV0; (b) a model with a linear trend in the location parameter, GEV1. (c) Determine the quantile with a non-exceedance probability \( F=0.9 \) of both models relative to the year 2010/11.

Fig. 12.5
figure 5

Series of annual maximum daily rainfalls at Pavia

Table 12.7 Annual maximum daily rainfalls at Pavia, Portugal (HY hydrologic year)

Solution

It is important to recall that the shape parameter returned by the functions of the ismev package is the negative of the shape parameter in the GEV parametrization adopted in this book. Data can be imported from a file using the read.table function, or typed directly into the R console with the following command.

> pavia<-c(24.2, 31.3, 32.5, 33.5, 20.2, 38.2, 36.7, 35.2, 35.2, 25.3, 92.3, 30, 25.2, 50.4, 35.7, 40.5, 10.3, 40.2, 8.1, 10.2, 14.2, 15.3, 40.2, 20.4, 20.2, 32.8, 43.2, 29.8, 42.8, 45, 34.2, 32.8, 46.3, 31.9, 34.2, 24.3, 24.3, 24.3, 71.4, 37.4, 31.4, 24.3, 43.8, 58.2, 34.6, 40.2, 20.8, 69, 44, 27.2, 37.2, 36.7, 49, 38.9, 59.6, 63.3, 41.2, 46.6, 84.2, 29.5, 70.2, 43.7, 36.2, 29.8, 60.2, 28, 31.4, 38.4, 29.4, 34, 47, 57, 36.5, 84.2, 45, 95.5, 48.5, 38, 38.6, 26, 27, 58, 27.8, 37.5, 35.2, 27.5, 28.5, 52, 56.8, 80, 29, 55.2, 48.4, 33.2, 27.4, 27.4, 18.2, 34.2)

  1. (a)

    The function used to fit GEV models is gev.fit. Fit the stationary GEV model, GEV0, with the command

    > GEV0<-gev.fit(pavia)

    Object GEV0 contains all the information regarding the fitted model including parameter estimates, covariance matrix, maximum log-likelihood, and more. For details on R functions, the help file documentation can be looked up using the command

    > ?gev.fit

    Maximum likelihood estimates are stored in component $mle. The console returns a vector with the estimates of the location, scale, and shape parameters, in the following order \( \left(\beta, \alpha, -\kappa \right) \):

    > GEV0$mle
    [1] 31.521385926 13.156409604 -0.007569892

  2. (b)

    In order to fit the nonstationary model GEV1, with a linear temporal trend in the location parameter \( \left(\beta (t)={\beta}_0+{\beta}_1t\right) \), it is necessary to create a single-column matrix with the variable time, t, taking unit incremental values from 1 to 98 (the length of the sample), with \( t=1 \) corresponding to the hydrologic year 1912/13.

    > t<-matrix(1:98,ncol=1)

    Subsequently, the function gev.fit is applied with the following arguments

    > GEV1<-gev.fit(pavia,ydat=t,mul=1)

    The component $mle contains the parameter estimates in order \( \left({\beta}_0,{\beta}_1,\alpha, -\kappa \right) \).

    > GEV1$mle
    [1] 25.57537923  0.12052674 12.37868560  0.02447634

  3. (c)

    In order to estimate the quantile with non-exceedance probability \( F=0.9 \) for the hydrologic year 2010/11 (\( t=99 \)), the GEV quantile function of Eq. (5.77) may be applied. Since model GEV0 is stationary, its parameters apply to any year, and the quantile function yields \( {x}_{0.9,{\mathrm{GEV}}_0}=60.88\;\mathrm{mm} \). Regarding model GEV1, the location parameter for year 2010/11 is determined as \( \beta \left(t=99\right)={\beta}_0+{\beta}_1\times 99=37.5075 \); applying Eq. (5.77) with this value results in \( {x}_{0.9,{\mathrm{GEV}}_1}=64.61\ \mathrm{mm} \). A brief R sketch of these computations follows.
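
    The sketch below assumes that Eq. (5.77) takes the form \( {x}_F=\beta +\left(\alpha /\kappa \right)\left[1-{\left(- \ln F\right)}^{\kappa}\right] \) and recalls that gev.fit reports the negative of κ; the function qgev_book and the remaining object names are introduced here only for illustration.

    qgev_book <- function(Fp, beta, alpha, kappa) {
      beta + (alpha/kappa)*(1 - (-log(Fp))^kappa)              # assumed form of Eq. (5.77)
    }
    Fp <- 0.9
    q0 <- qgev_book(Fp, GEV0$mle[1], GEV0$mle[2], -GEV0$mle[3]) # stationary model GEV0
    beta99 <- GEV1$mle[1] + GEV1$mle[2]*99                      # location parameter for t = 99
    q1 <- qgev_book(Fp, beta99, GEV1$mle[3], -GEV1$mle[4])      # nonstationary model GEV1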

12.4.3 Nonstationary Model Based on the Gumbelmax Distribution

The Gumbelmax (or simply Gumbel) distribution is a limiting case of the GEV distribution when \( \kappa \to 0 \). As with the GEV, it is possible to specify a nonstationary model for a hydrological variable based on the Gumbel distribution with time-varying parameters β and α, that is,

$$ {X}_t\sim \mathrm{Gum}\left(\beta (t),\alpha (t)\right) $$
(12.42)

where β(t) and α(t) define the dependence structure between the location and scale parameters and time.

The log-likelihood function of the nonstationary Gumbel model is given by

$$ \ell \left(\theta \right)=-{\displaystyle \sum_{t=1}^N\left\{ \ln \left(\alpha (t)\right)+\left(\frac{x_t-\beta (t)}{\alpha (t)}\right)+ \exp \left[-\left(\frac{x_t-\beta (t)}{\alpha (t)}\right)\right]\right\}} $$
(12.43)

The function gum.fit, of the R package ismev, can be used to fit this model by the method of maximum likelihood. This is covered in Example 12.10.
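
Before turning to that example, note that Eq. (12.43) can also be coded directly. The minimal sketch below, with the hypothetical function negloglik_gum1, writes the negated log-likelihood for the particular case of a linear trend in the location parameter, \( \beta (t)={\beta}_0+{\beta}_1t \), and a constant scale; gum.fit performs an equivalent numerical maximization internally.

negloglik_gum1 <- function(par, x) {
  b0 <- par[1]; b1 <- par[2]; a <- par[3]
  if (a <= 0) return(Inf)                       # the scale parameter must be positive
  z <- (x - (b0 + b1*seq_along(x)))/a           # standardized residuals with beta(t) = b0 + b1*t
  sum(log(a) + z + exp(-z))                     # negated version of Eq. (12.43)
}
# e.g.: optim(c(30, 0, 10), negloglik_gum1, x = pavia)$par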

Example 12.10

Consider the series of annual maximum daily rainfall at Pavia, in Portugal, shown in Example 12.9. Using the R package “ismev,” estimate the parameters of the following Gumbel models: (a) stationary model GUM0; (b) linear trend in the location parameter GUM1; (c) linear trend in the location parameter and log-linear trend in the scale parameter GUM2.

Solution

After loading the ismev package and importing the data into R, create a single-column matrix with time t taking values from 1 to 98 (see Example 12.9b).

  1. (a)

    The stationary model GUM0 is fitted using the function gum.fit, as in

    > GUM0<-gum.fit(pavia)

    The parameter estimates in the order (β, α) are given by

    > GUM0$mle
    [1] 31.46559 13.13585

  2. (b)

    The same commands are applied to estimate parameters of the nonstationary model GUM1, with a linear trend on the location parameter \( \left[\beta (t)={\beta}_0+{\beta}_1t\right] \) and parameters in order (β 0, β 1, α), as in

    > GUM1<-gum.fit(pavia,ydat=t,mul=1)
    > GUM1$mle
    [1] 25.7563638  0.1201053 12.4740537

  3. (c)

    The same procedure is applied to model GUM2, which adds a log-linear trend in the scale parameter \( \alpha (t)= \exp \left({\alpha}_0+{\alpha}_1t\right) \), with parameter vector (β 0, β 1, α 0, α 1), as in

    > GUM2<-gum.fit(pavia,ydat=t,mul=1,sigl=1,siglink=exp)
    > GUM2$mle
    [1] 2.544135e+01 1.262875e-01 2.485020e+00 7.661304e-04

12.4.4 Nonstationary Model Based on the Generalized Pareto Distribution

The generalized Pareto (GPA) distribution has its origins in results from EVT, namely in the research by Balkema and de Haan (1974) and Pickands (1975). The GPA distribution is not usually used in frequency analysis of annual maxima, but it is widely applied to peaks-over-threshold data, frequently in combination with the Poisson distribution (see Example 8.8).

Consider a peaks-over-threshold series X t that shows some signs of changing behavior with time. It is possible to define a nonstationary flood frequency model based on the GPA distribution with scale parameter α and shape parameter κ, both changing with time, as denoted by

$$ {X}_t\sim \mathrm{G}\mathrm{P}\mathrm{A}\left(\alpha (t),\kappa (t)\right) $$
(12.44)

As in the case of the GEV model, the shape parameter κ of the GPA defines the shape of the upper tail of the distribution, and its precise estimation is a complex task. Hence, as with the nonstationary GEV model, it is not usual to allow the shape parameter to vary as a function of time. Therefore, as a rule, the only GPA parameter to be expressed as a function of time is the scale parameter α. Since that parameter can only take positive values, a convenient parametrization for α(t) is

$$ \alpha (t)= \exp \left({\alpha}_0+{\alpha}_1t\right) $$
(12.45)

The log-likelihood function of the nonstationary GPA is given by

$$ \ell \left(\theta \right)=-{\displaystyle \sum_{t=1}^N\left\{ \ln \left(\alpha (t)\right)-\left(\frac{1}{\kappa (t)}-1\right) \ln \left[1-\kappa (t)\frac{x_t}{\alpha (t)}\right]\right\}} $$
(12.46)

In the R package ismev, the appropriate function for fitting nonstationary GPA models is gpd.fit. The procedures are very similar to the ones described in Examples 12.9 and 12.10.
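
As a brief illustration with synthetic data (no peaks-over-threshold series is fitted in this subsection), the sketch below indicates how a nonstationary GPA model with the log-linear scale of Eq. (12.45) might be specified; the threshold u, the series pot, and the covariate matrix tt are hypothetical, and the call to gpd.fit mirrors the argument structure used in Examples 12.9 and 12.10.

library(ismev)
set.seed(1)
u    <- 10                                   # hypothetical threshold
pot  <- u + rexp(60, rate = 1/5)             # synthetic peaks-over-threshold series
tt   <- matrix(seq_along(pot), ncol = 1)     # covariate matrix: time index of each peak
GPA1 <- gpd.fit(pot, threshold = u, ydat = tt, sigl = 1, siglink = exp)
GPA1$mle                                     # scale coefficients (alpha0, alpha1) followed by the shape,
                                             # the latter being the negative of kappa, as in Example 12.9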

12.4.5 Model Selection and Diagnostics

The models presented in the preceding subsections are highly versatile since they allow for: (1) one or two nonstationary parameters; (2) parameters that depend on time directly or through a covariate (e.g., a climate index); and (3) many possible dependence structures between the parameters and time/covariate, such as linear, log-linear, polynomial, and change-point dependencies. There are, therefore, several candidate models for each problem in which a nonstationary extreme hydrological variable is present. Model selection under nonstationarity is an important issue, as the consideration of several covariates and possibly convoluted dependence structures can result in very complex models which fit the data well but may not be parsimonious. The basic aim is to select a simple model capable of explaining much of the variation in the data.

The logic of model selection for nonstationary extremes is analogous to that of the GLM, whose main tools were presented in Sect. 12.3.3. The relative performances of nested candidate models may be assessed using asymptotic likelihood ratio tests, and the AIC can be used to select the best model from a list of candidates. Models which postulate a linear or log-linear dependence may also be evaluated using hypothesis tests of the slope parameter.

The recommended practice for this kind of analysis is to start with a stationary baseline model, with the lowest possible number of parameters, and gradually postulate increasingly complex models, that is, progressively add parameters and check whether each alternative model performs significantly better than the previous one. It is important to be mindful that an LRT is only valid when the two models are nested.

Consider the hypothetical scenario in which a trend in the location parameter of a GEV-distributed variable is under analysis, but the parametric dependence structure of the trend is not obvious. One possible approach is to postulate the following models:

  • GEV0—“no trend”, baseline model, \( \beta (t)=\beta \);

  • GEV1—“linear trend,” \( \beta (t)={\beta}_0+{\beta}_1t \);

  • GEV2—“log-linear trend”, \( \beta (t)= \exp \left({\beta}_0+{\beta}_1t\right) \);

  • GEV3—“second degree polynomial trend”, \( \beta (t)={\beta}_0+{\beta}_1t+{\beta}_2{t}^2 \).

In this scenario, the GEV0 model is nested in each of the other three models since, in every case, it is obtained by setting all trend parameters other than β 0 equal to zero. Likewise, GEV1 is nested in GEV3, so those two models can be compared by means of an LRT. GEV1 is not nested in GEV2, nor is GEV2 nested in GEV3, hence an LRT may not be used to compare those models. In this kind of situation, it is preferable to compute the AIC scores for all models and determine the "best" one according to that criterion.

Regarding models GEV1 and GEV2, it is also possible to set up a hypothesis test for the regression parameters using the same rationale as described in Sect. 12.2.3. For these models, an estimate of the standard error of the slope parameter \( {\upsigma}_{\upbeta_1} \) may be obtained by numerical differentiation of the log-likelihood function (see Sect. 12.3.2). One defines the null hypothesis H 0: {there is no trend in the location parameter} (or \( {\beta}_1=0 \)) and the alternative hypothesis H 1: {there is a trend in the location parameter} (or \( {\beta}_1\ne 0 \)). At the significance level of 100α %, H 0 may be rejected if

$$ \left|{\widehat{\upbeta}}_1\right|>{z}_{1-\upalpha /2}{\upsigma}_{\upbeta_1} $$
(12.47)

Example 12.11

Consider the models GUM1 and GUM2 from Example 12.10. Determine the significance of the log-linear temporal trend of the scale parameter α(t) of model GUM2. (a) Use an LRT in which GUM1 is the null model; (b) use a hypothesis test for regression parameter α 1 of model GUM2. Consider the significance level of 5 % in both tests.

Solution

The R code for solving this problem is presented in Appendix 11.

(a) An LRT is employed in which the null model M 0 is GUM1, with 3 parameters, and the alternative model M 1 is GUM2, with 4 parameters. The negated maximum log-likelihoods can be obtained on the R console (see Examples 12.9 and 12.10) by calling the component $nllh of the objects generated by the functions gev.fit and gum.fit. As such, \( \ell \left({\widehat{\uptheta}}_{M_0}\right)=-402.5473 \) and \( \ell \left({\widehat{\uptheta}}_{M_1}\right)=-402.5087 \), and Eq. (12.29) yields the test statistic \( D=0.0772 \). The difference in the number of parameters of the two models is 1, such that \( D\sim {\chi}_1^2 \). Since \( D<{\chi}_{0.95,1}^2=3.84 \) (Appendix 3), the null model is not rejected in favor of the alternative one at the 5 % significance level.

(b) Standard errors of the parameter estimates of models fitted using the R functions gev.fit and gum.fit may be consulted by calling the component $se of the fitted model objects, which, for GUM2, returns \( \left({\sigma}_{\beta_0}=2.683484,{\sigma}_{\beta_1}=0.048396,{\sigma}_{\alpha_0}=0.156172,{\sigma}_{\alpha_1}=0.002834\right) \). The rejection threshold of the test is \( {z}_{1-\alpha /2}{\sigma}_{\alpha_1}=0.005554 \), and it is worth recalling that \( {\widehat{\upalpha}}_1=0.0007661 \) (Example 12.10). Since \( \left|{\widehat{\alpha}}_1\right|<{z}_{1-\alpha /2}{\sigma}_{\alpha_1} \), the null hypothesis is not rejected, thereby corroborating the result of (a).
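
The same computations can be carried out directly on the R console; the short sketch below assumes that the objects GUM1 and GUM2 of Example 12.10 are available in the workspace.

D <- 2*(GUM1$nllh - GUM2$nllh)                 # LRT statistic of Eq. (12.29): D = 0.0772
D > qchisq(0.95, df = 1)                       # FALSE: the null model GUM1 is not rejected
abs(GUM2$mle[4]) > qnorm(0.975)*GUM2$se[4]     # FALSE: the trend in alpha(t) is not significant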

The graphical analysis tools presented in Chap. 8 are no longer valid under nonstationarity. Those tools require that the data be identically distributed, but, in the nonstationary case, the observations are not homogeneous, since their distribution changes with time. In order to deal with this issue, Coles (2001) suggests the use of modified Q−Q (quantile–quantile) plots to visualize the fits of nonstationary extreme value models. When the fit is adequate, the scatter of points on the plot should be close to the 1:1 line. In order to apply this technique to a nonstationary model, it is first necessary to transform the data into a standardized and stationary variable.

In case the variable is GEV-distributed, \( {X}_t\sim \mathrm{G}\mathrm{E}\mathrm{V}\left(\beta (t),\alpha (t),\kappa (t)\right) \), the standardized variable \( {\tilde{X}}_t \) is defined by

$$ {\tilde{X}}_t=\frac{-1}{\widehat{\upkappa}(t)} \ln \left\{1-\widehat{\upkappa}(t)\left[\frac{X_t-\widehat{\upbeta}(t)}{\widehat{\upalpha}(t)}\right]\right\} $$
(12.48)

and when it’s Gumbel-distributed, \( {X}_t\sim \mathrm{GUM}\left[\beta (t),\alpha (t)\right] \),

$$ {\tilde{X}}_t=\frac{X_t-\widehat{\upbeta}(t)}{\widehat{\upalpha}(t)} $$
(12.49)

The variable resulting from those transformations follows a standardized Gumbel distribution (Gumbel with \( \beta =0 \) and \( \alpha =1 \)), with CDF

$$ {F}_{{\tilde{X}}_t}(x)= \exp \left(-{e}^{-x}\right) $$
(12.50)

The previous result enables the construction of a standardized Gumbel Q−Q plot. Denoting the order statistics of \( {\tilde{x}}_t \) as \( {\tilde{x}}_{(1)},\dots, {\tilde{x}}_{(N)} \), the Q−Q plot consists of the pairs of points

$$ \left\{- \ln \left[- \ln \left({q}_i\right)\right],{\tilde{x}}_{(i)},i=1,\dots, N\right\} $$
(12.51)

where q i is the adopted plotting position (the Gringorten plotting position is recommended in the case of Gumbel and GEV models; see Sect. 8.1.2).

A similar technique may be applied when using a nonstationary GPA model. Considering the variable \( {X}_t\sim \mathrm{G}\mathrm{P}\mathrm{A}\left(\alpha (t),\kappa (t)\right) \), the distribution used to standardize the variable is the exponential distribution (see Sect. 5.11.4):

$$ {\tilde{X}}_t=-\frac{1}{\widehat{\upkappa}(t)} \ln \left\{1-\widehat{\upkappa}(t)\left[\frac{X_t}{\widehat{\upalpha}(t)}\right]\right\} $$
(12.52)

The resulting variable \( {\tilde{X}}_t \) follows a standardized exponential distribution (exponential with \( \theta =1 \)), with CDF

$$ {F}_{{\tilde{X}}_t}(x)=1- \exp \left(-x\right) $$
(12.53)

Then, the corresponding Q−Q plot consists of the following pairs of points:

$$ \left\{- \ln \left(1-{q}_i\right),{\tilde{x}}_{(i)},i=1,\dots, N\right\} $$
(12.54)

where q i is the adopted plotting position.

Example 12.12

Consider the 5 stationary and nonstationary models fitted through Examples 12.9 and 12.10. (a) Using AIC, select the best of those 5 models. (b) Check the fit of that model using a Q−Q plot.

Solution

  1. (a)

    In R, the component $nllh of the objects fitted in Examples 12.9 and 12.10 returns the negated maximum log likelihood of those models, which are shown in Table 12.8, together with the AIC results.

    Table 12.8 Computation of AIC

    The best of the 5 models, according to the AIC scores, is GUM1 (nonstationary Gumbel with a linear trend on the location parameter).

  2. (b)

    Since \( {X}_t\sim \mathrm{Gum}\left(\beta (t),\alpha (t)\right) \), the transformation of the data to the standardized Gumbel distribution uses Eq. (12.49), in which the location parameter has a linear temporal trend \( \beta (t)={\beta}_0+{\beta}_1t \) and the scale parameter is constant, \( \alpha (t)=\alpha \). The Q−Q plot consists of the pairs of points indicated in Eq. (12.51). Table 12.9 shows the calculations needed to graph the Q−Q plot, which, in turn, is shown in Fig. 12.6. A short R sketch of these computations is given at the end of this example.

    Table 12.9 Construction of a Q−Q plot of a nonstationary Gumbel model based on the transformation to the standard Gumbel distribution
    Fig. 12.6
    figure 6

    Q−Q plot of nonstationary model GUM1 based on the transformation to the standard Gumbel distribution
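
    The calculations of this example can be sketched on the R console as follows, assuming the objects fitted in Examples 12.9 and 12.10 are available (any other object names are introduced here only for illustration).

    # (a) AIC = 2k + 2 x (negated maximum log-likelihood), with k parameters per model
    nllh <- c(GEV0$nllh, GEV1$nllh, GUM0$nllh, GUM1$nllh, GUM2$nllh)
    k    <- c(3, 4, 2, 3, 4)
    aic  <- 2*k + 2*nllh                            # the smallest value identifies GUM1
    # (b) Q-Q plot of GUM1: standardize the data with Eq. (12.49) and plot against
    #     standard Gumbel quantiles at Gringorten plotting positions (Eq. 12.51)
    N      <- length(pavia)
    beta_t <- GUM1$mle[1] + GUM1$mle[2]*(1:N)       # time-varying location parameter
    x_std  <- sort((pavia - beta_t)/GUM1$mle[3])    # ordered standardized observations
    q      <- ((1:N) - 0.44)/(N + 0.12)             # Gringorten plotting positions
    plot(-log(-log(q)), x_std, xlab = "Standard Gumbel quantile",
         ylab = "Ordered standardized data")
    abline(0, 1)                                    # 1:1 reference line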

12.5 Return Period and Hydrologic Risk in a Nonstationary Context

The concept of return period T of a quantile of a hydrological variable, defined by the inverse of the annual probability of exceedance of that variable, is an important and standard tool for hydrologists, with formal roots in the geometric distribution (see Sect. 4.1.2). Cooley (2013) contends that return periods are created to facilitate interpretation of the rarity of events: the expression “T-year flood” may be more easily interpreted by the general public than “a flood with an annual exceedance probability of 1/T”. The former definition leads to two interpretations of the T-year event:

  • The expected waiting times between two events is T years; or

  • The expected number of events in a T-year interval is one.

Under the stationarity assumption, both interpretations are correct.

Another notion closely related to return period is the hydrologic risk, defined by the probability that a reference quantile q T will be exceeded in N years, or, in other words, the probability of occurrence of at least one event larger than q T in N years. Under the independence and stationarity assumptions, hydrologic risk is given by Eq. (4.15).

The concepts of return period and hydrologic risk are commonly applied in engineering practice and are present in several textbooks on hydrology. However, these concepts do not hold under nonstationarity, since the exceedance probabilities of hydrological extremes change from year to year. In any given year, there still exists a one-to-one correspondence between an exceedance probability and a particular quantile of the variable, but a return period attached to the exceedance probability of a single year becomes ambiguous and defeats its purpose as a concept for communicating hydrological hazard under nonstationarity.

As a result of a growing interest in nonstationary flood frequency analysis, some important developments for extending the concept of return period to nonstationarity have appeared in the technical literature. Some of these developments are presented in this section.

12.5.1 Return Period Under Nonstationarity

The first advances in extending the concept of return period to a nonstationary context are due to the work of Wigley (1988, 2009), who showed, in a simplified manner, how to consider nonstationarity when dealing with risk and uncertainty. Olsen et al. (1998) consolidated these original ideas with a rigorous mathematical treatment and defined the return period as the expected waiting time. Formally,

$$ T\left({q}_T\right)=1+{\displaystyle \sum_{t=1}^{\infty }{\displaystyle \prod_{i=1}^t{F}_i\left({q}_T\right)}} $$
(12.55)

where \( {F}_i\left(\cdot \right) \) is the CDF of the variable in year i. Equation (12.55) cannot be written as a geometric series and solving it for q T is not straightforward. Cooley (2013) shows that, provided \( {F}_i\left({q}_T\right) \) is monotonically decreasing as \( i\to \infty \), it is possible to obtain a bounded estimate of T(q T ) as

$$ 1+{\displaystyle \sum_{t=1}^L{\displaystyle \prod_{i=1}^t{F}_i\left({q}_T\right)}}<T\left({q}_T\right)\le 1+{\displaystyle \sum_{t=1}^L{\displaystyle \prod_{i=1}^t{F}_i\left({q}_T\right)}}+{\displaystyle \prod_{i=1}^L{F}_i\left({q}_T\right)}\frac{F_{L+1}\left({q}_T\right)}{1-{F}_{L+1}\left({q}_T\right)} $$
(12.56)

with the bounds made as tight as desired by choosing a sufficiently large natural number L. Nevertheless, numerical methods must be employed in order to solve the bounds in Eq. (12.56) for q T . Salas and Obeysekera (2014) built upon the developments of Wigley (1988, 2009) and Olsen et al. (1998), and presented a unified framework for estimating return period and hydrologic risk under nonstationarity.

Parey et al. (2007) and Parey et al. (2010) focused on the interpretation of the return period as the expected number of events in T years being 1 and extended that concept to nonstationarity. Under this interpretation, a T-year flood q T can be estimated by solving the equation

$$ {\displaystyle \sum_{i=1}^T\left(1-{F}_i\left({q}_T\right)\right)=1} $$
(12.57)

where \( {F}_i\left(\cdot \right) \) has the same meaning as in Eq. (12.55). Solving Eq. (12.57) also requires numerical methods.
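
The two extended definitions can be illustrated numerically. The sketch below uses a hypothetical nonstationary Gumbel model whose location parameter increases linearly with time (the coefficients b0, b1 and the scale a are illustrative values, not fitted estimates); the function F_i plays the role of \( {F}_i\left(\cdot \right) \) in Eqs. (12.55) and (12.57).

b0 <- 25.8; b1 <- 0.12; a <- 12.5                        # illustrative parameter values
F_i <- function(q, i) exp(-exp(-(q - (b0 + b1*i))/a))    # Gumbel CDF of the variable in year i
# Expected waiting time (Eq. 12.55), truncating the infinite sum at a large L, as in Eq. (12.56):
waiting_time <- function(q, L = 10000) 1 + sum(cumprod(F_i(q, 1:L)))
waiting_time(100)                                        # expected waiting time for an event of 100 mm
# Expected number of events equal to one (Eq. 12.57): solve numerically for q_T
q_T <- function(T) uniroot(function(q) sum(1 - F_i(q, 1:T)) - 1, interval = c(0, 1000))$root
q_T(100)                                                 # T = 100-year event under this interpretation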

12.5.2 Design Life Level (DLL)

Rootzén and Katz (2013) argue that, for quantifying risk in engineering design, the basic required information consists of (1) the design life period of the hydraulic structure and (2) the probability of occurrence of a hazardous event during that period. These authors propose a new measure of hydrological hazard under nonstationarity: the Design Life Level, denoted as DLL, which is the quantile with a probability p of being exceeded during the design life period.

To compute the DLL it is necessary to derive the CDF \( {F}_{T_1:{T}_2} \) of the maximum over the design life period, in which T 1 and T 2 represent the first and the last year of the period, respectively. Formally,

$$ {F}_{T_1:{T}_2}(x)=P\left( \max \left\{{X}_t,t\in \left[{T}_1,{T}_2\right]\right\}\le x\right) $$
(12.58)

Equivalently, \( {F}_{T_1:{T}_2}(x) \) is the probability that every value of X t within the design life period is simultaneously lower than or equal to x, or

$$ {F}_{T_1:{T}_2}(x)=P\left[{\displaystyle \underset{t={T}_1}{\overset{T_2}{\cap }}\left({X}_t\le x\right)}\right] $$
(12.59)

Under the assumption of independence between years (see Sect. 3.3) one would have

$$ {F}_{T_1:{T}_2}(x)={\displaystyle \prod_{t={T}_1}^{T_2}{F}_t(x)} $$
(12.60)

The DLL is obtained by numerically inverting Eq. (12.60) for the desired non-exceedance probability \( 1-p \). The design life level has a straightforward interpretation and does not imply extrapolations beyond the design life. The design life level can also be estimated under stationarity; in that case, Eq. (12.60) is the complement of the hydrologic risk (Eq. 4.15), with \( N={T}_2-{T}_1+1 \). A numerical sketch is given below.
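
The sketch below illustrates a design life level calculation with the same hypothetical nonstationary Gumbel model used in the previous sketch (illustrative coefficients, not fitted estimates); the chosen exceedance probability p and design life [T1, T2] are likewise illustrative.

b0 <- 25.8; b1 <- 0.12; a <- 12.5                          # illustrative parameter values
F_t    <- function(x, t) exp(-exp(-(x - (b0 + b1*t))/a))   # yearly Gumbel CDF
F_life <- function(x, T1, T2) prod(F_t(x, T1:T2))          # CDF of the maximum over [T1, T2], Eq. (12.60)
p  <- 0.10                                                 # target exceedance probability over the design life
T1 <- 99; T2 <- 148                                        # e.g., a 50-year design life starting at t = 99
DLL <- uniroot(function(x) F_life(x, T1, T2) - (1 - p), interval = c(0, 1000))$root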

12.6 Further Reading

Kundzewicz and Robson (2000, 2004) and Yue et al. (2012) review further methods for the detection of changes in hydrologic series and discuss at length the underlying assumptions and the adequate interpretation of the results of such methods. The presence of serial correlation in hydrologic time series, which is not uncommon in practice, may hinder the detection of trends or change points using the Mann–Kendall and Pettitt tests, respectively. Serinaldi and Kilsby (2015b), and references therein, explore the limitations of these tests and suggest pre-whitening procedures designed to remove serial correlation from the data.

Mudelsee (2010) provides an in-depth characterization of the kernel occurrence rate estimation technique, including boundary bias reduction, bandwidth selection and uncertainty analysis via bootstrap techniques. Some examples of application of this technique are Mudelsee et al. (2003, 2004) and Silva et al. (2012).

Finally, it is important to stress that nonstationarity is a property of models and not of the hydrological/hydrometeorological phenomena underlying the time series used in statistical hydrologic analyses. In fact, there is an ongoing debate in the hydrological community on whether the use of nonstationary models is an adequate or even justifiable approach when tackling perceived changes in the statistical properties of hydrologic time series. A review of that debate is beyond the scope of this chapter since it is lengthy and comprises a number of different positions and proposed methodological approaches. Readers interested in such a debate are referred to the following sequence of papers: Milly et al. (2008, 2015), Koutsoyiannis (2011), Lins and Cohn (2011), Stedinger and Griffis (2011), Matalas (2012), Montanari and Koutsoyiannis (2014), Koutsoyiannis and Montanari (2015), Serinaldi and Kilsby (2015a). These works also tend to be very rich in references to up-to-date nonstationary hydrological analyses.

Exercises

  1. 1.

    Solve Example 12.3 (a) with bandwidth values \( h=500 \), \( h=1000 \) and \( h=2000 \). Comment on the results in light of the compromise between variance and bias in estimation.

  2. 2.

    Construct 90 % bootstrap confidence bands for each of the curves obtained in Exercise 1.

  3. 3.

    Show that the probability density function of the Gamma distribution can be written in the form of Eq. (12.16).

  4. 4.

    Solve Example 12.5 considering that \( Y\sim \mathrm{Binomial}\left(N,p\right) \) and use the AIC to compare the performances of the Poisson and Binomial models.

  5. 5.

    Consider the peaks-over-threshold data of Table 12.2. Fit a nonstationary GPA model with a log-linear dependence between the scale parameter and the winter (November-to-March) NAO index of the corresponding hydrologic year (NAO data shown in Table 12.6).

  6. 6.

    Using a likelihood ratio test, compare the performance of the model estimated in Exercise 5 with that of the corresponding stationary baseline model.

  7. 7.

    Using a hypothesis test for regression coefficients, check the significance of the log-linear relationship of the GPA scale parameter and the winter NAO index of the model estimated in Exercise 5.

  8. 8.

    Consider the series of annual maximum rainfalls at Pavia as listed in Table 12.7. Fit a nonstationary GEV model with a linear trend in the location parameter and a log-linear trend in the scale parameter.

  9. 9.

    Consider the model GUM1 from Example 12.10. Compute the expected waiting time for the exceedance of \( {q}_T=100\;\mathrm{mm} \), taking the year 2010/11 as reference.

  10. 10.

    Consider the model GUM1 from Example 12.10. Compute the design life level with a non-exceedance probability of \( F=0.9 \) and a design life of 50 years, taking the year 2010/11 as reference.