1 Introduction

Frequency analysis of extreme hydrologic events is vital to the planning and design of hydraulic engineering works and is applied in a broad range of studies and projects, such as flood protection structures, power plants, river management, lake impoundments, and urban drainage. Traditional flood frequency studies rest on the basic assumption that hydrologic data series are independent and identically distributed (iid). However, this assumption no longer holds under global climate change and intensifying anthropogenic activities, such as land use change and the growing construction of flood-prevention projects (Villarini et al. 2011; Deitch et al. 2013; Zuo et al. 2014). Land use change alters flood-producing mechanisms, which in turn leads to non-stationarity in flood series. Applying traditional flood frequency analysis under such changing environmental conditions can therefore overestimate or underestimate design values in engineering hydrology and flood control scheduling (Cancelliere and Rossi 2013), and methodologies that account for the non-stationarity of flood series should be proposed and discussed in current flood control and water resources management practice.

Non-stationary frequency analysis is a relatively new modeling approach, yet a considerable body of work already exists. Khaliq et al. (2006) summarized most of the methods available up to 2005 for frequency analysis of dependent and/or non-stationary hydro-meteorological observations (He et al. 2006). Several of the methods reviewed incorporate changes in distribution parameters into the traditional technique (Katz et al. 2002; Strupczewski et al. 2001; Cunderlik and Burn 2003; Sankarasubramanian and Lall 2003). From another perspective, a growing number of researchers have argued that hydrological extreme value series such as flood series are not generated by a single population (Singh 1968; Waylen and Woo 1982; Singh et al. 2005). Potter (1958) found that frequency curves of annual maximum flood series had a dog-leg shape, which suggested that the flood series might come from two different populations. Singh and Sinclair (1972) developed a mixture of two component distributions to fit annual flood series from 33 streams in Illinois and found that the mixed distribution fitted better than other methods. Rossi et al. (1984) applied a mixture of two exponential distributions to 39 annual flood series of Italian basins. Alila and Mtiraoui (2002) stated that the difference between the upper and lower tails of flood frequency curves within the Gila River basin reveals that they are dominated by different flood-generating processes. They also explicitly indicated that, for floods in long-term hydro-climatic data, a mixed distribution provides a more appropriate fit than a conventional homogeneous distribution. Smith et al. (2011) showed that floods in the eastern United States are generated by a mixture of tropical cyclones and extratropical systems. Most of the studies above, however, concentrate on flood frequency analysis itself.
In reservoir flood control, many researchers focus on improving flood control measures based on stationary flood series (Chen et al. 2013), whereas studies of how non-stationarity in the series affects reservoir flood control are limited. In this work, the catchment of Xidayang Reservoir, which lies on the Tang River, the south branch of the Daqing River Basin, was selected as the study area, with a particular focus on the non-stationarity of its inflow annual maximum flood series caused by land use change and the construction of large-scale water resources development projects across the catchment. First, we propose a three-step procedure to detect change points that integrates three statistical methods with hydrological survey analysis (explained in detail in the next section). Next, a mixed distribution is fitted to the non-stationary inflow flood series of Xidayang Reservoir on the basis of the change point diagnosis, with parameters estimated by the simulated annealing algorithm. The resulting design floods are then used in flood routing to explore the impact of land use change on flood control operation. The findings may lay a foundation for design flood revision and should be of interest to engineers who operate reservoirs.

The paper is organized as follows. Section 2 introduces the methodology of change point diagnosis, the mixed distribution model and the corresponding parameter estimation. Section 3 presents the study area and data. Section 4 presents the results and discussion, and Section 5 summarizes the main findings and concludes the paper.

2 Methods

2.1 Change Point Diagnosis

In essence, a flood series is the combined product of climate, physical geography and anthropogenic activity, and the series itself reflects the influence and variation of those factors. Before flood frequency analysis is carried out, the flood series should be tested for non-stationarity. A large variety of statistical tests have been used to find change points, and different methods may give different results (Reeves et al. 2007).

In this study, we propose the following approach (see the flowchart in Fig. 1) to identify change points in annual maximum flood (AMF) series. In the primary diagnosis, the Hurst (H) exponent method (Hurst et al. 1965; Bărbulescu et al. 2010) is used to determine whether the series exhibits variation; the corresponding classification of variation degree (Xie et al. 2009) is shown in Table 1. In the detailed diagnosis, the nonparametric Mann–Whitney–Pettitt (MWP) test (Pettitt 1979; Li et al. 2014) is employed to find the variation range and is then combined with the Brown–Forsythe method (B-F; Brown and Forsythe 1974) and the moving t-test (M-t; Fraedrich et al. 1997) to locate the change points, which is convenient and efficient. Finally, in the comprehensive diagnosis, the detailed diagnosis results are combined with hydrological survey analysis (here, analysis of the actual land use change in the catchment from a GIS dataset and of the status of soil and water conservation projects) to reach a final conclusion and confirm the most possible change point (MPCP).

Fig. 1 The general framework of change point diagnosis

Table 1 Classification of variation degree based on Hurst exponent
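The detailed diagnosis step can be illustrated with a minimal sketch of the Pettitt test statistic. It assumes the standard formulation, U_t = Σ_{i≤t} Σ_{j>t} sgn(x_i − x_j), K = max|U_t|, and the usual approximate significance p ≈ 2 exp(−6K²/(n³ + n²)); the function name `pettitt_test` and the sample values are hypothetical and are not taken from the Xidayang records.

```python
import math


def _sign(d):
    """Return -1, 0 or 1 according to the sign of d."""
    return (d > 0) - (d < 0)


def pettitt_test(series):
    """Return (t, K, p) for the Pettitt change-point test.

    t is the 1-based position of the last observation before the most
    likely shift, K = max|U_t|, and p is the approximate significance.
    The double loop is cubic in n overall, which is acceptable for
    annual series with a few dozen values.
    """
    n = len(series)
    best_t, best_k = 0, -1
    for t in range(1, n):
        # U_t compares every value up to position t with every later value.
        u_t = sum(_sign(series[i] - series[j])
                  for i in range(t) for j in range(t, n))
        if abs(u_t) > best_k:
            best_t, best_k = t, abs(u_t)
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p


# Hypothetical series with an upward shift after the fifth value.
demo = [10, 11, 9, 10, 12, 20, 21, 19, 22, 20]
t_hat, k_stat, p_value = pettitt_test(demo)
```

For this toy series the statistic peaks at position 5 with K = 25, because every value before the shift is smaller than every value after it. A change point would be declared significant when `p_value` falls below the chosen significance level.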

2.2 Mixed Distribution Model

According to Alila and Mtiraoui (2002), a non-stationary extreme value distribution is composed of two or more stationary component distributions given by

$$ F(x)={\alpha}_1{F}_1(x)+{\alpha}_2{F}_2(x)+\cdots +{\alpha}_k{F}_k(x) $$
(1)

where \( F_1(x), F_2(x), \cdots, F_k(x) \) are the cumulative distribution functions of the k component distributions, and \( \alpha_1, \alpha_2, \cdots, \alpha_k \) are their relative weights, which satisfy \( \alpha_1+\alpha_2+\cdots+\alpha_k=1 \).

Alila and Mtiraoui (2002) also emphasized that the number of component distributions should be kept to a minimum, and they ultimately used a heterogeneous distribution composed of two homogeneous distributions. Similarly, given the limited sample size and the need to limit the complexity of parameter estimation, this paper uses a mixture of two stationary component distributions. Consider a non-stationary flood series X with sample size n and change point τ. The subseries before the change point, \( X_1 \), has sample size \( n_1=\tau \) and probability density function (PDF) \( f_1(x) \); the subseries after it, \( X_2 \), has sample size \( n_2=n-\tau \) and PDF \( f_2(x) \). The whole series X follows a mixed distribution with PDF f(x), given by

$$ f(x)=\alpha {f}_1(x)+\left(1-\alpha \right){f}_2(x) $$
(2)

where α is the weight coefficient. In China, the Pearson type III (P3) probability distribution is recommended for fitting annual flood discharge series in the regulation for design flood calculation (Zhan and Ye 2000). Hence, P3 PDFs are used for \( f_1(x) \) and \( f_2(x) \), given by

$$ {f}_1(x)=\frac{\beta_1^{\alpha_1}}{\varGamma \left({\alpha}_1\right)}{\left( x-{a}_{01}\right)}^{\alpha_1-1}{e}^{-{\beta}_1\left( x-{a}_{01}\right)} $$
(3)
$$ {f}_2(x)=\frac{\beta_2^{\alpha_2}}{\varGamma \left({\alpha}_2\right)}{\left( x-{a}_{02}\right)}^{\alpha_2-1}{e}^{-{\beta}_2\left( x-{a}_{02}\right)} $$
(4)

The exceedance probability of the mixed distribution is calculated as

$$ F(x)=\alpha \left[\frac{\beta_1^{\alpha_1}}{\varGamma \left({\alpha}_1\right)}{\displaystyle {\int}_x^{\infty }{\left( t-{a}_{01}\right)}^{\alpha_1-1}{e}^{-{\beta}_1\left( t-{a}_{01}\right)} dt}\right]+\left(1-\alpha \right)\left[\frac{\beta_2^{\alpha_2}}{\varGamma \left({\alpha}_2\right)}{\displaystyle {\int}_x^{\infty }{\left( t-{a}_{02}\right)}^{\alpha_2-1}{e}^{-{\beta}_2\left( t-{a}_{02}\right)} dt}\right] $$
(5)

where \( {\alpha}_i \), \( {\beta}_i \) and \( {a}_{0i} \) (i = 1, 2) are the shape, scale and location parameters of the PDF \( f_i(x) \), which can be expressed in terms of the mean \( EX_i \), the coefficient of variation \( C_{vi} \) and the coefficient of skewness \( C_{si} \). The relationships are \( EX_i={a}_{0i}+{\alpha}_i/{\beta}_i \), \( {C}_{vi}=\sqrt{\alpha_i}/\left({\beta}_i{a}_{0 i}+{\alpha}_i\right) \) and \( {C}_{si}=2/\sqrt{\alpha_i} \). The mixed distribution model therefore has seven parameters to be estimated: α, \( EX_1 \), \( C_{v1} \), \( C_{s1} \), \( EX_2 \), \( C_{v2} \) and \( C_{s2} \).
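As an illustration of how Eqs. (2) to (5) fit together, the relationships above can be inverted to \( \alpha_i = 4/C_{si}^2 \), \( \beta_i = 2/(EX_i C_{vi} C_{si}) \) and \( a_{0i} = EX_i(1 - 2C_{vi}/C_{si}) \), and the integrals in Eq. (5) are regularized upper incomplete gamma functions. The sketch below assumes positive skewness (\( C_{si} > 0 \)); the function names and the numerical values are hypothetical and are not those of Table 3. The incomplete gamma function is evaluated with a plain power series so that only the standard library is needed.

```python
import math


def p3_params(mean, cv, cs):
    """Convert (EX, Cv, Cs) to the P3 shape, scale and location (alpha, beta, a0)."""
    alpha = 4.0 / cs ** 2
    beta = 2.0 / (mean * cv * cs)
    a0 = mean * (1.0 - 2.0 * cv / cs)
    return alpha, beta, a0


def upper_gamma_q(a, z, max_terms=10000, tol=1e-14):
    """Regularized upper incomplete gamma Q(a, z) = 1 - P(a, z).

    P(a, z) is summed from its power series, whose terms are all positive;
    this is adequate for moderate z but is not a production-grade routine.
    """
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    p = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - p))


def p3_exceedance(x, mean, cv, cs):
    """Exceedance probability P(X > x) of one P3 component."""
    alpha, beta, a0 = p3_params(mean, cv, cs)
    return upper_gamma_q(alpha, beta * (x - a0))


def mixed_exceedance(x, weight, comp1, comp2):
    """Eq. (5): weighted sum of the two component exceedance probabilities.

    comp1 and comp2 are (EX, Cv, Cs) tuples; weight is the coefficient alpha.
    """
    return (weight * p3_exceedance(x, *comp1)
            + (1.0 - weight) * p3_exceedance(x, *comp2))


# Hypothetical components; with Cs = 2 each P3 reduces to a shifted exponential.
before = (100.0, 0.5, 2.0)
after = (200.0, 0.5, 2.0)
p_mix = mixed_exceedance(150.0, 0.3, before, after)
```

With \( C_s = 2 \) the shape parameter equals 1, so each component is a shifted exponential distribution and the result can be checked by hand. In practice a library routine for the regularized incomplete gamma function, such as `scipy.special.gammaincc`, would be a more robust choice than the series used here.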

2.3 Parameter Estimation

Parameter estimation for mixed distributions remains a challenge. Numerous methods have been used in flood frequency analysis, such as the maximum likelihood algorithm (Rossi et al. 1984), the maximum likelihood EM algorithm (Leytham 1984), the principle of maximum entropy (POME) (Fiorentino et al. 1987) and meta-heuristic algorithms (e.g. the genetic algorithm and ant colony optimization; Hassanzadeh et al. 2011). Lee and Jeong (2014) applied the harmony search (HS) meta-heuristic algorithm, an optimization technique, to a mixed distribution. In a similar way to the HS method, this paper treats parameter estimation as a combinatorial optimization problem and employs the simulated annealing algorithm (SAA; Aarts and Korst 1989; He et al. 2006) to estimate the parameters of the mixed distribution model. Specifically, the SAA solves a single-objective problem that minimizes the sum of the absolute differences between the empirical flood frequency and the theoretical exceedance probability, formulated as follows:

$$ \begin{array}{cc}\hfill \mathrm{Minimize}\hfill & \hfill {\displaystyle \sum_{i=1}^n\left|{P}_{T,i}-{P}_{E,i}\right|}\hfill \end{array} $$

where n is the sample size, and \( P_{T,i} \) and \( P_{E,i} \) are the theoretical exceedance probability and the empirical frequency of the i-th observation (i = 1, …, n).
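A minimal sketch of this optimization is given below. It is not the authors' implementation: the empirical frequency is assumed to follow the Weibull plotting position m/(n + 1), the cooling schedule and step sizes are arbitrary choices, and all function names are hypothetical. To keep the example short, the toy model is a one-parameter exponential exceedance curve rather than the seven-parameter mixed distribution.

```python
import math
import random


def weibull_plotting_positions(n):
    """Empirical exceedance frequencies m/(n+1) for ranks m = 1..n (largest first)."""
    return [m / (n + 1) for m in range(1, n + 1)]


def abs_deviation(params, sample, exceedance):
    """Objective: sum of |P_T - P_E| over the sample sorted in descending order."""
    xs = sorted(sample, reverse=True)
    p_emp = weibull_plotting_positions(len(xs))
    return sum(abs(exceedance(x, params) - pe) for x, pe in zip(xs, p_emp))


def simulated_annealing(objective, x0, step, t0=1.0, cooling=0.995,
                        iters=2000, seed=0):
    """Minimize objective from x0 with uniform random moves and geometric cooling."""
    rng = random.Random(seed)
    current, f_cur = list(x0), objective(x0)
    best, f_best = list(current), f_cur
    temp = t0
    for _ in range(iters):
        cand = [c + rng.uniform(-s, s) for c, s in zip(current, step)]
        f_cand = objective(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if f_cand <= f_cur or rng.random() < math.exp((f_cur - f_cand) / temp):
            current, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = list(current), f_cur
        temp = max(temp * cooling, 1e-12)
    return best, f_best


def exp_exceedance(x, params):
    """Toy model exp(-x/theta); infeasible theta yields an infinite penalty."""
    theta = params[0]
    return math.exp(-x / theta) if theta > 0 else float("inf")


# Hypothetical sample, not the Xidayang data.
sample = [12.0, 30.0, 7.0, 55.0, 21.0, 3.0, 40.0, 16.0]
objective = lambda p: abs_deviation(p, sample, exp_exceedance)
best, f_best = simulated_annealing(objective, [10.0], [2.0])
```

For the mixed distribution, `params` would hold the seven parameters listed in Section 2.2 and `exceedance` would evaluate Eq. (5); bounds on the weight coefficient (between 0 and 1) and on the P3 parameters can be enforced with the same infinite-penalty device.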

3 Study Area

3.1 Study Area and Data

Xidayang Reservoir, located at 114°47′ E and 38°44′ N (see Fig. 2), has a drainage area of 4,420 km² and a total storage capacity of 12.58 × 10⁸ m³. It is one of the four large reservoirs in Hebei Province, North China; its main function is flood control, along with urban water supply, irrigation and electricity generation. Construction of Xidayang Reservoir began in January 1958 and was completed 2 years later. After continued construction and reinforcement of the dam, its design flood control standard has reached a 500-year return period and its check standard a 10,000-year return period.

Fig. 2 The Xidayang Reservoir catchment and its geographical location in China

In this paper, flood frequency analysis centers on the annual maximum peak discharge series (AMPDS) and the annual maximum flood volume (1-day, 3-day and 6-day) series (AMFVS) of Xidayang Reservoir for the period 1952–2008. The flood of 1963 is the largest that Xidayang Reservoir has encountered since 1960, with a peak discharge of 7,940 m³/s. Historical peak discharges are also available for 2 years: 10,700 m³/s in 1917 and 13,200 m³/s in 1939. We use these hydrologic records to derive the design flood and apply it to flood routing under the given flood routing rules, in order to analyze the impact of land use change on both the design flood and the flood control operation of the reservoir.

3.2 Water Resources Projects and Land Use Conditions

Since the 1980s, a growing number of soil and water conservation projects, such as pools, water cellars and check dams, have been built in the Xidayang Reservoir catchment. According to historical records (water conservation annals in Tang County 1998), in Tang County, which accounts for 32 % of the drainage area of the Xidayang Reservoir catchment, soil and water conservation projects such as check dams, terraces, tree planting (especially in mountain areas) and riverbank protection works have been built on a large scale since 1980, when the Ministry of Electricity and Water issued a document on soil and water conservation management in small basins. Moreover, the treated area reached almost one third of Tang County within the first decade.

Meanwhile, land use in the catchment changed significantly from 1970 to 1980. Figure 3a–c present the land use and land cover conditions of the Xidayang Reservoir catchment in 1970, 1980 and 2000 respectively, and Fig. 3d shows the area proportions of the different land use types. As Fig. 3a and b show, between 1970 and 1980 the coverage of grassland, cultivated land, construction land and waters decreased to different extents, whereas the share of forest land increased markedly from 14.82 to 19.40 % (Fig. 3d). From 1980 to 2000 (Fig. 3c and d), grassland and forest land declined slightly and none of the land use types changed much. The relatively large land use change from 1970 to 1980 can alter the flood-generating mechanism and is the likely reason that change points occur around this period.

Fig. 3 a, b and c are land use and land cover conditions of the Xidayang Reservoir catchment in 1970, 1980 and 2000 respectively; d land use changes of the Xidayang Reservoir catchment among 1970, 1980 and 2000

4 Results and Discussions

4.1 Change Points Diagnosis Analysis

4.1.1 Primary Diagnosis

Following Fig. 1, the H exponent method is used to identify whether the AMPDS and the 1-day, 3-day and 6-day AMFVS of Xidayang Reservoir exhibit variation. The H exponent values of the AMPDS and the 1-day, 3-day and 6-day AMFVS are 0.650, 0.692, 0.707 and 0.713 respectively. According to the classification in Table 1, the 1-day and 3-day AMFVS exhibit weak variation and the 6-day AMFVS exhibits medium variation, while the AMPDS shows no variation.
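For readers unfamiliar with the H exponent, the sketch below shows one common estimator, classical rescaled range (R/S) analysis, in which H is the slope of log(R/S) against log(block length). The exact variant used in this study is not specified here, so the block sizes, the function names and the regression over non-overlapping blocks are all assumptions; estimates from series as short as 57 values are also known to be noisy.

```python
import math
import statistics


def rescaled_range(values):
    """R/S of one block: range of cumulative mean deviations over the population std."""
    mean = statistics.fmean(values)
    cumulative, running = [], 0.0
    for v in values:
        running += v - mean
        cumulative.append(running)
    return (max(cumulative) - min(cumulative)) / statistics.pstdev(values)


def hurst_exponent(series, min_block=8):
    """Least-squares slope of log(mean R/S) against log(block length).

    Block lengths double from min_block up to len(series); constant blocks
    are skipped because their standard deviation is zero.
    """
    n = len(series)
    log_len, log_rs = [], []
    size = min_block
    while size <= n:
        blocks = [series[i:i + size] for i in range(0, n - size + 1, size)]
        ratios = [rescaled_range(b) for b in blocks if statistics.pstdev(b) > 0]
        if ratios:
            log_len.append(math.log(size))
            log_rs.append(math.log(statistics.fmean(ratios)))
        size *= 2
    if len(log_len) < 2:
        raise ValueError("series too short for at least two block lengths")
    mx, my = statistics.fmean(log_len), statistics.fmean(log_rs)
    sxx = sum((x - mx) ** 2 for x in log_len)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_len, log_rs))
    return sxy / sxx
```

A value near 0.5 is usually read as no long-term persistence, and larger values as increasing persistence, which is the quantity that Table 1 maps to a variation degree.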

4.1.2 Detailed Diagnosis

On the basis of the primary diagnosis, the 1-day, 3-day and 6-day AMFVS need to be examined in detail. First, the MWP test is used to find the variation range at the 5 % significance level; the B-F and M-t tests are then applied to locate the change points of the three flood series. The results are given in Table 2. To obtain reliable diagnosis results, we select the four most significant change points from each statistical test according to its test criterion.

Table 2 Detailed diagnosis results of flood series at 5 % significance level
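The moving t-test used in the detailed diagnosis can be sketched as follows. The sketch assumes equal-length windows on both sides of each candidate split and the ordinary pooled-variance two-sample t statistic; the window length and the function name `moving_t_test` are hypothetical choices, not values taken from this study.

```python
import math
import statistics


def moving_t_test(series, window):
    """Return (split_index, t) pairs comparing `window` values before and after each split.

    split_index k means the subseries series[k-window:k] is compared with
    series[k:k+window]. Windows with zero pooled variance are skipped.
    """
    if window < 2:
        raise ValueError("window must contain at least two values")
    results = []
    for k in range(window, len(series) - window + 1):
        before = series[k - window:k]
        after = series[k:k + window]
        # Equal window sizes: the pooled variance is the mean of the two sample variances.
        pooled = (statistics.variance(before) + statistics.variance(after)) / 2.0
        if pooled == 0.0:
            continue
        diff = statistics.fmean(before) - statistics.fmean(after)
        results.append((k, diff / math.sqrt(pooled * 2.0 / window)))
    return results


# Hypothetical series with a shift after the third value.
demo = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
scores = moving_t_test(demo, window=3)
strongest = max(scores, key=lambda item: abs(item[1]))
```

Each |t| is compared with the critical value of Student's t distribution with 2·window − 2 degrees of freedom; splits that exceed it are candidate change points, and the largest |t| marks the most significant one.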

4.1.3 Comprehensive Diagnosis

As Table 2 shows, the change points of the 1-day, 3-day and 6-day AMFVS appear in 1964, 1965, 1979, 1990, 1996 and 2000. The hydrological survey analysis shows that Xidayang Reservoir experienced two catastrophic flood events in 1963 and 1996 (known as the “63.8” flood and the “96.8” flood respectively), both associated with extremely heavy rainfall, so the change points of 1964, 1965 and 1996 cannot be attributed to land use change (here we consider non-stationarity in the flood series only as a consequence of land use change and do not address attribution to global climate change, because the corresponding meteorological dataset is lacking). From the statistical point of view, the year 2000 is too close to the end of the flood series to be a reasonable change point. More importantly, as described in Section 3.2, there was a relatively large change in land use between 1970 and 1980, with an increase in forest land and a decrease in grassland, cultivated land and waters. In contrast, over 1980–2000 forest land decreased slightly, cultivated land increased slightly and the other land use types remained basically unchanged. In addition, soil and water conservation projects in the Daqing River Basin have increased since the 1980s. This supports the idea that engineering structures contribute to the non-stationary character of streamflows and might induce a shift in annual maximum flood series (Salvadori 2013). Taken together, the MPCP is confirmed to be 1979 for all three flood series, which is reasonable from both the statistical and the physical point of view.

4.2 Flood Frequency Analysis

Following Sections 2.2 and 2.3, the non-stationary flood series are fitted by the mixed distribution (MD); the MD parameters estimated by SAA are given in Table 3. For comparison, the AMPDS and the non-stationary 1-day, 3-day and 6-day AMFVS are also fitted by the P3 distribution (Zhan and Ye 2000), with parameters estimated by the L-moment method (Hosking 1990). Figure 4 compares the fit of the MD and the P3 distribution to the non-stationary flood series. The P3 distribution misses the corner section of the empirical flood-frequency curve and shows larger deviations from the empirical data than the MD. These deviations may look negligible at first sight, but they can be substantial, because a small deviation in the fitted curve may lead to a large difference in the design values and hence to different flood control and management measures. The MD follows the empirical data in the corner section more closely than the P3 distribution. It is also worth pointing out that the MD does not fit better than the P3 distribution merely because it has more parameters; more details can be found in Alila and Mtiraoui (2002). We therefore suggest that the mixed distribution, which gives the better fit to the non-stationary flood series, be used for design flood revision at Xidayang Reservoir.

Table 3 Parameters estimation of MD in Xidayang Reservoir
Fig. 4 Fitting results of non-stationary flood series in Xidayang Reservoir: a, b and c are annual maximum 1-day, 3-day and 6-day flood volume respectively

The Kolmogorov–Smirnov (K-S) test (Lilliefors 1967) is used to test whether the MD is the underlying probability distribution of the non-stationary flood series samples. Given a sample of n observations, the Kolmogorov–Smirnov statistic is

$$ D=\underset{x}{ \max}\left|{F}_0(x)-{F}_n(x)\right| $$
(6)

where \( F_n(x) \) is the sample cumulative distribution function and \( F_0(x) \) is the distribution to be tested. If D exceeds the critical value \( D_n(\alpha) \) (n is the sample size and α is the significance level), the hypothesis that the observations come from the hypothesized distribution is rejected.

The values of D for the 1-day, 3-day and 6-day AMFVS are 0.164, 0.157 and 0.170 respectively, all less than \( D_n(\alpha)=0.1801 \) at the 5 % significance level. The MD is therefore not rejected for any of the non-stationary flood series at the 5 % significance level, which indicates that the sample series are consistent with the MD.
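The test can be reproduced with a few lines of code. The sketch below computes the statistic of Eq. (6) for an arbitrary hypothesized CDF and uses the common large-sample approximation 1.36/√n for the 5 % critical value. How the critical value was obtained in this study is not stated; the approximation is an assumption, although for n = 57 (the number of years in 1952–2008) it gives 0.1801, the value quoted above. Function names are hypothetical.

```python
import math


def ks_statistic(sample, cdf):
    """Eq. (6): largest gap between the empirical CDF and the hypothesized CDF.

    The empirical CDF jumps at each observation, so the gap is checked
    just before and just after every jump.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = cdf(x)
        d = max(d, abs(i / n - f0), abs((i - 1) / n - f0))
    return d


def ks_critical_5pct(n):
    """Large-sample approximation 1.36 / sqrt(n) to the 5 % critical value."""
    return 1.36 / math.sqrt(n)


critical = ks_critical_5pct(57)
reported_d = (0.164, 0.157, 0.170)  # D values quoted in the text
accepted = [d < critical for d in reported_d]
```

All three reported statistics fall below the approximate critical value, in agreement with the conclusion in the text.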

4.3 Design Flood Comparison

The design flood values for different return periods under the MD and the P3 distribution can be read from Fig. 4 and are summarized for each non-stationary flood series in Table 4. For the same return period, the MD gives lower flood estimates than the P3 distribution. Specifically, depending on the return period, the reduction is about 0.03 %–14.03 % for the 3-day and 6-day annual maximum flood volumes and about 0.07 %–20.24 % for the 1-day annual maximum flood volume. Even though these differences are not very large, they still have important implications for engineering design and flood control. Non-stationarity should therefore not be neglected in flood analysis.

Table 4 Comparison of design flood values of Xidayang Reservoir

4.4 Flood Routing Results Comparison

For illustration and simplicity, two cases are set up for flood routing. In case 1, design flood values are calculated with the MD, which accounts for the non-stationarity of the flood series caused by land use change. In case 2, design flood values are computed with the traditional P3 distribution, which takes no account of the non-stationarity.

Given the design flood values estimated by flood frequency analysis under the two cases, the extraordinary flood of the 1963 flood season (see Section 3.1) is selected as the typical flood hydrograph, with a duration of 6 days and a time interval of 1 h, and the conventional homogeneous frequency enlargement method is used to derive the design flood hydrographs. The design flood hydrographs for different return periods are then taken as reservoir inflow hydrographs and routed through Xidayang Reservoir to investigate the influence of land use change on reservoir flood routing. Xidayang Reservoir is one of the four large reservoirs in Hebei Province and plays a significant role in the flood control system of the Daqing River Basin; its characteristic parameters are listed in Table 5. Subject to the given flood routing rules, the inflow hydrographs are routed through the reservoir to obtain the routing results for each return period, and the results (maximum reservoir water level and maximum discharge) under the two cases (considering and ignoring the non-stationarity of the series) are compared. Figure 5 shows the water level and discharge hydrographs for the 10,000-year and 100-year return periods under the two cases. The flood routing results for the various return periods under the two cases are given in Table 6 and plotted in Fig. 6.

Table 5 Characteristic parameters of the Xidayang reservoir
Fig. 5 Water level and discharge hydrographs with return periods of 10,000 years (a1, a2) and 100 years (b1, b2): (a1) and (b1) are under case 1, the remaining are under case 2

Table 6 Flood routing results comparison between two cases of Xidayang Reservoir
Fig. 6 Maximum water level (a) and maximum discharge (b) comparison with different return periods under two cases of Xidayang Reservoir

To make the flood routing results easier to follow, the given flood routing rules are listed below.

(1) The initial water level is 134.5 m;

(2) If 134.5 m ≤ Z < 140.58 m, Q = 300 m³/s, protecting the downstream watercourse;

(3) If 140.58 m ≤ Z < 142.74 m, Q = 1,000 m³/s, protecting downstream counties and crop land;

(4) If 142.74 m ≤ Z < 144.92 m, Q = 5,460 m³/s, protecting the downstream railway bridge;

(5) If 144.92 m ≤ Z < 150.49 m, using the service spillway;

(6) If Z ≥ 150.49 m, using the emergency spillway.

where Z and Q represent water level and maximum safety discharge respectively.
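The rules above amount to a piecewise lookup from water level to permitted release, which can be sketched as follows. The function name and the returned labels are hypothetical; spillway discharges are not given in the rules, so they are returned as `None`, and levels below 134.5 m are treated as outside the stated rules.

```python
def release_rule(z):
    """Map water level z (m) to (operating mode, maximum safety discharge in m3/s).

    Returns None below the initial level of 134.5 m, where no rule is stated.
    The discharge is None for the spillway stages because the rules do not
    give a fixed value for them.
    """
    if z < 134.5:
        return None
    if z < 140.58:
        return ("controlled release", 300.0)
    if z < 142.74:
        return ("controlled release", 1000.0)
    if z < 144.92:
        return ("controlled release", 5460.0)
    if z < 150.49:
        return ("service spillway", None)
    return ("emergency spillway", None)


# Two levels that lie on either side of the 140.58 m threshold.
low = release_rule(140.20)
high = release_rule(140.61)
```

Because the permitted discharge changes in steps, two maximum water levels that differ by only a few decimeters can fall on different sides of a threshold and lead to very different releases; here 140.20 m maps to 300 m³/s and 140.61 m to 1,000 m³/s.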

Table 6 and Fig. 6 show that, under the flood routing rules, the maximum water levels in case 1 are lower than those in case 2 to different degrees, with a largest decrease of about half a meter. Along with the lower maximum water levels, the maximum discharges in case 1 are also clearly lower than those in case 2, with reductions ranging from 0.00 to 81.68 % depending on the return period. For the 50-year design flood hydrograph, the maximum discharge under case 1 is 81.68 % lower than under case 2. This large difference arises because, under the flood routing rules, the water level of 142.74 m is the threshold at which the permitted discharge switches from 1,000 to 5,460 m³/s; the figure is consistent with such a switch, since (5,460 − 1,000)/5,460 ≈ 81.68 %. In addition, for the 20-year design flood hydrograph, the maximum water level under case 2 is 140.61 m, which exceeds the normal pool level of 140.50 m, whereas that under case 1 is 140.20 m, which does not. Thus, although the gap in maximum water level between the two cases is at most 0.56 m, the impact of non-stationary flood series on reservoir flood routing deserves close attention.

5 Conclusions

Non-stationary flood frequency analysis is a key element of reservoir flood routing under environmental change. Because of human disturbance, the flood series no longer satisfies the basic assumption of independent and identical distribution, so it is necessary to explore the effect of non-stationarity in the flood series on the reservoir. The main conclusions of this paper are summarized as follows:

(1) A three-step procedure based on statistical change point detection methods is applied. The annual maximum flood volume (1-day, 3-day and 6-day) series of Xidayang Reservoir are found to exhibit variation, and their change points all occur in 1979.

(2) A mixed distribution is used to fit the non-stationary flood series, with parameters estimated by the simulated annealing algorithm. The results show that the mixed distribution fits the non-stationary flood series of Xidayang Reservoir reliably and reasonably and can replace the conventional P3 distribution, particularly in the corner section of the empirical distribution. In addition, the design flood values estimated by the mixed distribution are lower than those computed by the P3 distribution, which indicates that land use change in the area controlled by Xidayang Reservoir has reduced the inflow flood discharge and that traditional flood frequency analysis is not appropriate here.

(3) Design flood hydrographs for various return periods are taken as inflow hydrographs and routed through Xidayang Reservoir. Comparing the two cases, maximum water level and maximum discharge are lower when the non-stationarity of the series is considered than when it is ignored. In short, our findings indicate that ignoring even weakly significant non-stationarity in the flood series may lead to overestimation of the design flood and of the reservoir flood routing results. The analysis also highlights the influence of non-stationarity on reservoir flood routing in response to land use change, which underlines the importance of flood control scheduling and management in the Daqing River Basin.

(4) We believe that Xidayang Reservoir can serve as a case study for other large reservoirs in the Daqing River Basin, and the methodology of this paper may serve as a reference for the joint operation of multiple reservoirs in the basin, where land use has changed significantly.