1 Introduction

Reliable data on catchment sediment yield (SY; tons per square kilometre per year; i.e. the quantity of sediment transported through a river section per unit of upstream area and per unit of time) are crucial for various purposes, such as decision making in environmental management, the design of hydraulic structures, the calibration and validation of models predicting SY and a better understanding of the relationship between geomorphic activity and various biogeochemical cycles (e.g. Owens et al. 2005; Viers et al. 2009; Vanmaercke et al. 2011a). The reliability of SY estimates is known to depend on the measuring method (e.g. Walling and Webb 1981; Phillips et al. 1999; Verstraeten and Poesen 2002; Moatar et al. 2006). However, SY is also known to vary significantly from year to year (e.g. Olive and Rieger 1992; Morehead et al. 2003; Bogen 2004; Achite and Ouillon 2007). As a result, the reliability of average SY values also depends on the length of the annual SY record, i.e. its measuring period (MP; years). Although the importance of inter-annual variability in SY is generally recognized, limited quantitative information is available on the magnitude of this inter-annual variability and the associated uncertainties for average SY values.

Various factors may be expected to influence the inter-annual variability of SY. Nordin and Meade (1981) suggest that variability decreases with increasing catchment area (A; square kilometres). Such a decrease could be expected as the temporal variation of SY for sub-catchments is likely to average out at larger scales. Furthermore, larger catchments can be expected to have larger floodplains which may buffer SY changes at longer time scales (e.g. Phillips 2003). Also, at the intra-annual scale, A is known to strongly control the temporal variability of sediment fluxes (e.g. Morehead et al. 2003; Gonzalez-Hidalgo et al. 2009). As a result, the required MP to determine SY with a given accuracy should decrease with increasing catchment area. However, not all studies confirm this trend: Olive and Rieger (1992) compiled long-term SY data for several river systems in different environments and found no clear evidence of decreasing inter-annual variability with increasing A.

Climatic variation has been observed to have a dominant impact on the annual sediment load of various catchments (e.g. Restrepo and Kjerfve 2000; Ionita 2006; Achite and Ouillon 2007; Tote et al. 2011), but it is not clear how climatic variability and sediment yield variability are linked. Walling and Kleo (1979) found no clear relationship between the annual variability in SY and mean annual precipitation, for a dataset of 256 catchments with at least 7 years of SY observations.

Also, land use may affect the inter-annual variability of SY. It is well-known that catchments under arable land are generally more sensitive to erosive rainfall events (e.g. Ward et al. 2009; Notebaert et al. 2011). This may not only result in larger average SY values but also in a larger temporal variability of SY for the same rainfall conditions. Johnson (1994) observed an increase in the variability of suspended sediment concentrations (SSC) after forestry works in two Scottish catchments, indicating a greater sensitivity of the SY for these catchments to climatic variability. Nevertheless, no study exists that quantifies the relative importance of land use in explaining the variability of annual SY compared to other factors.

Apart from our limited understanding of the magnitude of inter-annual SY variability and the factors controlling it, the effect of this variability on the reliability of average SY values also remains difficult to assess. Based on the central limit theorem, previous studies (e.g. Walling 1984; Olive and Rieger 1992) have proposed assessing the uncertainty of average SY values due to inter-annual variability as:

$$ {\text{S}}{{\text{E}}_{\text{SY}}} = \frac{{{\text{C}}{{\text{V}}_{\text{SY}}}}}{{\sqrt {\text{MP}} }} $$
(1)

where SESY is the standard error of the mean SY and CVSY is its coefficient of variation. Although Eq. 1 suggests an easy method to assess the reliability of average SY data, the resulting standard errors and confidence intervals may be subject to large errors. Annual SY values for a given catchment are not necessarily normally distributed. Several studies show that average SY values can be strongly influenced by one or more extreme events (e.g. Bogen 2004; Achite and Ouillon 2007; Tomkins et al. 2007). This skewed nature of SY data limits the calculation of reliable SYmean confidence intervals, based on the central limit theorem (e.g. Hastings 1965; Bonett and Seier 2006).
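As a minimal illustration of Eq. 1, the sketch below evaluates the standard error for a few measuring periods. The function name and the example CV of 75% are chosen here purely for illustration, and the calculation inherits the limitations of Eq. 1 for skewed data noted above.

```python
import math


def standard_error_percent(cv_percent: float, mp_years: int) -> float:
    """Eq. 1: standard error of the mean SY, in percent of the mean.

    cv_percent: coefficient of variation of the annual SY values (%).
    mp_years:   measuring period, i.e. number of annual observations.
    """
    if mp_years < 1:
        raise ValueError("the measuring period must be at least one year")
    return cv_percent / math.sqrt(mp_years)


# Illustrative values only (hypothetical catchment with CV = 75%).
for mp in (7, 10, 25, 50):
    print(f"MP = {mp:2d} years -> SE = {standard_error_percent(75.0, mp):.1f}%")
```

Because the standard error only decreases with the square root of MP, halving it requires a fourfold longer record.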

As a result, very few guidelines exist on how long SY should be measured to obtain a representative average value under various conditions. Based on the analyses of long-term measuring campaigns at six gauging stations in Canada and using the approach of Eq. 1, Day (1988) suggests that a MP of one decade is sufficient to obtain sediment loads, as estimated standard errors ceased to decrease after 10 years. However, a closer inspection of the results of Day (1988) shows that although uncertainties did not decrease further, they were still considerable. After 10 years of monitoring, the estimated SESY was still approximately 40%. Summer et al. (1992) arrived at a similar conclusion, based on data from six gauging stations in Austria. Feyznia et al. (2002) suggest optimal measuring periods between 9 and 25 years after statistical analyses of SY data for 13 catchments in Iran. However, given the limited sample size used in these studies and the limitations of the standard error approach, it is clear that the conclusions drawn from these studies may not be generally valid.

A major reason explaining this limited understanding of the factors controlling inter-annual variability and its effect on the reliability of average SY values is the lack of studies that quantify the inter-annual variability of SY for a sufficiently large number of catchments representing a wide range of environmental and catchment characteristics. With the exception of the earlier mentioned study by Walling and Kleo (1979), most studies on the inter-annual variability of SY focus on a limited number of catchments, often located in a specific geographic region. As indicated by Olive and Rieger (1992), these studies were often restricted by the relatively limited number of long-term SY observations. However, over the last decade or so, more and longer time series of annual SY data have become available, making a more general analysis possible.

Therefore, the objectives of this study are (a) to better explore the magnitude of the inter-annual variability of SY and the factors controlling it, based on a large dataset of measured SY that represents a wide range of environmental characteristics; (b) to provide and apply a method that allows the assessment of the reliability of average SY based on its MP and its inter-annual variability and (c) to discuss the implications of our findings for the reliability of average sediment yields.

2 Materials and methods

2.1 The sediment yield dataset

A database of catchments with individual annual suspended SY observations was compiled based on data to which the authors had access. A catchment was included in the analyses if (a) the catchment area (A) was known, (b) the latitude and longitude of the catchment outlet (i.e. the gauging station) were known with a precision of at least 5′ and (c) annual suspended SY data were available for at least seven individual years. The last selection criterion agrees with that proposed by Walling and Kleo (1979) and was considered a reasonable trade-off, allowing us to retain a sufficiently large number of catchments (representing a wide range of environmental characteristics) while having a sufficiently long MP to reflect the inter-annual variations. If reported, the annual runoff depths, corresponding to the annual SY observations, were also included in the dataset.
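The three selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the record layout, field names and constants are assumptions made here and do not describe the structure of the actual database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MIN_YEARS = 7                   # criterion (c): at least seven annual SY values
MAX_COORD_ERROR_ARCMIN = 5.0    # criterion (b): outlet located to within 5 arc minutes


@dataclass
class CatchmentRecord:
    """Hypothetical record layout, used for illustration only."""
    area_km2: Optional[float]                 # None if the catchment area is unknown
    coord_precision_arcmin: Optional[float]   # None if the outlet location is unknown
    annual_sy: Dict[int, Optional[float]] = field(default_factory=dict)  # year -> SY


def measuring_period(record: CatchmentRecord) -> int:
    """Number of years with an SY value; years without data are not counted."""
    return sum(1 for sy in record.annual_sy.values() if sy is not None)


def is_retained(record: CatchmentRecord) -> bool:
    """Apply selection criteria (a), (b) and (c)."""
    return (
        record.area_km2 is not None
        and record.coord_precision_arcmin is not None
        and record.coord_precision_arcmin <= MAX_COORD_ERROR_ARCMIN
        and measuring_period(record) >= MIN_YEARS
    )
```

Counting only years with an SY value means that a record with a hiatus is judged on its number of observations rather than on the span between its first and last year.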

In total, 15,025 individual annual SY observations from 726 catchments were retained. For 558 catchments (representing 11,173 catchment-years of observations), the runoff depths corresponding to the annual SY observations were also known. Figure 1 displays the location of all outlets of the catchments considered in this study. The large majority of the catchments are clustered in the Middle East, Europe and the USA, while many regions are covered by few or no data. Nonetheless, the available data represent a wide range of physical and climatic conditions including the Arctic and boreal regions, temperate lowlands, various mountain ranges and (semi-) arid regions. Table 1 shows an overview of the collected data and their original source per country.

Fig. 1
Location of all river gauging stations having at least 7 years of annual sediment yield (SY) data, considered in this study. Black dots indicate gauging stations for which the annual runoff depths corresponding to the annual SY are also available. Crosses mark gauging stations for which no runoff data are available. Sub-catchments of the Siret Basin (Romania) for which land-use data are available are shown in the inset

Table 1 Overview of the selected annual sediment yield (SY) time series per country

Although the details of the measuring procedure are not known for all catchments, the large majority of the annual SY data were obtained by applying runoff discharge (Q)–SSC rating curves to a series of continuously measured runoff discharges (with at least a daily temporal resolution) and integrating the resulting continuous sediment export values over a year (e.g. Palsson et al. 2000; Arabkhedri et al. 2004; EIE 2005, unpublished data). For some catchments, SSC values were only measured for a relatively short period, while the resulting Q–SSC rating curve was applied to a longer time series of runoff discharges (e.g. Close-Lecocq et al. 1982; Pont et al. 2002). However, for >95% of the catchments, SSC was sampled regularly throughout the MP. It is well-known that deriving total SY from rating curves is not without problems and may lead to an underestimation of true sediment loads if the appropriate corrections are not applied (e.g. Asselman 2000). However, it was not possible to assess to what extent such errors may have affected the SY values in our database, and we did not attempt to make any further correction to the considered data.

Figure 2 displays the frequency distribution of catchments according to their A and MP. Catchment areas range from 0.07 to 1.84 × 10⁶ km², with most catchments having an A between 100 and 10,000 km². Measuring periods range between 7 and 58 years (average 20.7 years; median 17 years). The majority of catchments considered have a MP of 7 to 15 years. However, a significant fraction of catchments have a MP of 30 years. Most of these catchments are located in Iran (see Table 1). Regression analyses indicated no relationship between A and MP of the considered catchments. For most catchments, the MP consisted of a series of consecutive years. For 146 catchments, the sediment record showed a hiatus of one or more years. Evidently, years for which no SY value was available were not considered for the MP.

Fig. 2
Frequency distribution of all catchments considered in this study, according to their catchment area (A; left) and measuring period (MP; right). The number of catchments (number sign) in each class for which both annual runoff and sediment yield (SY) data are available is indicated in black. The number of catchments for which only annual SY data are available is indicated in grey

2.2 Characterization of the inter-annual variability

A common measure to express the variability of a data series is the coefficient of variation (CV; percent; i.e. the standard deviation divided by the mean). However, analysis of the 726 time series of annual SY values indicated that more than half of the time series were not normally distributed (according to Lilliefors tests at a significance level of 5%; Lilliefors 1967). Most time series are positively skewed. Furthermore, it was observed that the fraction of normally distributed time series decreases as MP increases (Fig. 3). We therefore calculated several other non-parametric measures to evaluate the inter-annual variability in SY, such as the ratio between the maximum and minimum observed value and the coefficient of dispersion (Bonett and Seier 2006). However, none of these measures yielded significantly different results or trends, and they were generally less reliable. The maximum/minimum ratio was significantly correlated with MP and only considers two observations of the distribution, while the very low median SY of some catchments resulted in disproportionally large coefficients of dispersion. It was therefore decided to only show and discuss the results based on the CV.
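For reference, the variability measures mentioned above can be computed as in the following sketch. Several details are assumptions made here rather than choices documented in this study: the sample (n − 1) standard deviation is used for the CV, the coefficient of dispersion is taken as the mean absolute deviation from the median divided by the median, and skewness is the simple moment-based estimate. The input series is hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def variability_measures(annual_sy: Sequence[float]) -> Dict[str, float]:
    """Summarise the inter-annual variability of one annual SY series."""
    values = list(annual_sy)
    mean = statistics.fmean(values)
    median = statistics.median(values)

    # Coefficient of variation (%): sample standard deviation over the mean.
    cv_percent = 100.0 * statistics.stdev(values) / mean

    # Ratio of the largest to the smallest annual value (uses two observations only).
    smallest = min(values)
    max_min_ratio = max(values) / smallest if smallest > 0 else math.inf

    # Coefficient of dispersion: mean absolute deviation from the median / median.
    abs_dev = [abs(v - median) for v in values]
    dispersion = statistics.fmean(abs_dev) / median if median > 0 else math.inf

    # Moment-based skewness; positive values indicate a long right tail.
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m3 = statistics.fmean([(v - mean) ** 3 for v in values])
    skewness = m3 / m2 ** 1.5

    return {
        "cv_percent": cv_percent,
        "max_min_ratio": max_min_ratio,
        "coefficient_of_dispersion": dispersion,
        "skewness": skewness,
    }


# Hypothetical series with one extreme year.
measures = variability_measures([10.0, 20.0, 30.0, 40.0, 200.0])
```

The guards against a zero minimum or median show why the latter two measures become unstable for catchments with very low SY in some years.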

Fig. 3
Skewness of all annual sediment yield time series according to their measuring period (MP). Time series indicated in black were found to be normally distributed, according to a Lilliefors test at a significance level of 0.05

The non-normal nature of the considered time series does not allow us to interpret CVs in the context of Gaussian statistics. For example, they cannot be used to estimate standard errors as discussed in Section 1 or to calculate confidence intervals (e.g. Bonett and Seier 2006). Nonetheless, the CV is also a meaningful measure to describe the variability for non-normally distributed data (Hastings 1965). Furthermore, the CV has been commonly used in other studies discussing the inter-annual variability of SY (e.g. Walling and Kleo 1979; Olive and Rieger 1992).

2.3 Potential controlling factors of inter-annual variability

Several factors were considered that potentially explain observed differences in inter-annual variation in SY between catchments. For each catchment, the catchment area was derived from its original source. Furthermore, an estimation of the average annual rainfall depth (P mean, millimetres per year) and its variability was made, based on the CRU TS 2.0 dataset (Mitchell et al. 2004). This global dataset contains estimates of monthly rainfall on a 0.5° resolution for the period 1901–2000, based on measured records. The rainfall data of the grid cell in which the catchment outlet was located was considered to be representative for that catchment. This assumption was made because (a) for about half of the number of catchments in the database, A is smaller than one grid cell; (b) P mean values of neighbouring grid cells are generally strongly correlated with each other and (c) for a large number of catchments, the location of the gauging station was not sufficiently accurate to determine the upstream area from a digital terrain model. Furthermore, rainfall characteristics for each catchment were based on the entire period covered by the CRU TS 2.0 dataset and not for the MP of the annual SY observations. This was done because for about 207 catchments, no matching P mean data were available as the MP of the SY data included years after 2000. Moreover, for 304 catchments, annual SY observations were reported for hydrological years. The exact start and end dates of these hydrological years were often unknown or could not be matched with the reported rainfall data since only monthly values were available (i.e. when a hydrological year started in the middle of a month). Furthermore, P mean values and the coefficients of variation in annual rainfall (CVP) were generally found to differ little when they were calculated for the period 1901–2000 or for the MP of the SY data. 
Since the calculated P mean and CVP values are based on very coarse resolution datasets and do not correspond with the actual rainfall conditions that occurred during the MP of each SY time series, they are subject to important uncertainties. Nonetheless, it is expected that they provide a reasonable indication of the rainfall characteristics in the considered catchments.

For the catchments with measured annual runoff depths available (n = 558, see Section 2.1), the mean annual runoff depths (Runoffmean, millimetres) and the coefficients of variation of the annual runoff depth (CVRunoff) were also considered as potential controlling factors. The evaluation of land-use and topography effects on SY variability can be expected to be difficult if a global dataset is used, as many other controlling factors will vary between different catchments. Therefore, a subset of catchments within a relatively small geographical region was selected. This subset consisted of 65 catchments in the Siret basin, Romania (INHGA 2010, see Fig. 1). Measuring periods for these catchments ranged between 13 and 58 years (median 51 years). Although SY data from 67 catchments in the Siret basin were available, two sub-catchments are mainly located in Ukraine. These catchments were excluded from the subset, as their drainage area was not covered by the data layers used (see Fig. 1). For each catchment in this subset, the average catchment slope (%) was calculated, based on SRTM data with a 90-m resolution (CGIAR 2008), and the fraction of arable land in each catchment was determined based on the CORINE land cover map of 1990 (EEA 2010). Furthermore, an average estimate of sheet and rill erosion rate (SEM, tons per square kilometre per year) was calculated based on a recently published map of sheet and rill erosion rates in Europe (Cerdan et al. 2010). This map is based on empirical relationships between sheet and rill erosion rates and land-use, topography and soil characteristics and is expected to give an unbiased estimate of sheet and rill erosion rates.

2.4 Assessing the reliability of average sediment yields in terms of their inter-annual variability

As explained in Section 1, the generally non-normal distribution of SY time series (see Fig. 3) impedes us from using the central limit theorem to assess the uncertainty on average SY values with a known MP in terms of its inter-annual variability (Eq. 1). Therefore, a different strategy was developed to assess the reliability of average SY values as a function of their MP.

As a first exploration of the effects of inter-annual variability on the uncertainty, the relative error of the average SY for a given MP (SYmean,i ) was compared to the average SY after the first 30 years of monitoring (SYmean,30) for all time series with at least 30 years of SY observations (n = 226). The relative error was calculated as:

$$ \left( {{\text{S}}{{\text{Y}}_{{{\text{mean}},i}}}-{\text{S}}{{\text{Y}}_{{{\text{mean}},{3}0}}}} \right)/{\text{S}}{{\text{Y}}_{{{\text{mean}},{3}0}}} $$
(2)

where i is the MP and SYmean,30 is the mean SY after 30 years of measurement. For catchments with more than 30 years of SY observations, only the first 30 years were considered for the calculation of SYmean,30.
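A minimal sketch of Eq. 2 is given below. It assumes that the annual SY values are supplied in chronological order as a plain list; the function name is illustrative.

```python
from typing import List, Sequence


def relative_errors(annual_sy: Sequence[float], reference_years: int = 30) -> List[float]:
    """Eq. 2: relative error of the running mean SY against the 30-year mean.

    Element i-1 of the result is (SYmean,i - SYmean,30) / SYmean,30, where
    SYmean,i is the mean of the first i annual values. Only the first
    `reference_years` values are used, as for records longer than 30 years.
    """
    if len(annual_sy) < reference_years:
        raise ValueError("the record is shorter than the reference period")
    first = list(annual_sy[:reference_years])
    sy_mean_ref = sum(first) / reference_years

    errors = []
    running_total = 0.0
    for i, value in enumerate(first, start=1):
        running_total += value
        errors.append((running_total / i - sy_mean_ref) / sy_mean_ref)
    return errors
```

By construction the last element is zero, which is why this measure cannot express the uncertainty that remains after 30 years of monitoring.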

Clearly, the relative errors resulting from this approach cannot be used to assess the uncertainty on SYmean values for other catchments, as they are strongly controlled by the period in the MP during which exceptional high SY values occur. Furthermore, the relative errors converge to zero as the MP approaches 30 years, but even after 30 years of monitoring, uncertainty on SYmean values may still be considerable due to inter-annual variability. Therefore, a Monte Carlo simulation procedure was used to assess the uncertainty ranges on average SY values with a known MP due to inter-annual variability. This procedure is explained below.

As a first step, cumulative distribution functions (cdf) were fitted through each time series with at least 30 years of observations (n = 226). Contrary to Eq. 2, the full MP was considered for time series with more than 30 years of observations. While several potential distribution functions were tested, it was found that most annual SY records were best described with a Weibull (1951) distribution. Weibull distributions are commonly used in various fields, including hydrology, such as for flood frequency analyses (e.g. Heo et al. 2001). The cdf of a Weibull distribution is defined as:

$$ F(x) = 1 - e^{{ - {{\left( {x/\lambda } \right)}^k}}} $$
(3)

where F(x) is the probability of a value ≤x, λ > 0 is a scaling parameter and k > 0 is a shape parameter. Using a maximum likelihood estimation procedure, the λ and k parameters of Eq. 3 were fitted for each time series. For a large majority of the catchments, the cumulative distribution of the individual annual SY observations could be described well with Eq. 3, with a median r² of 0.91 between observed and fitted annual SY values. However, for some time series, Eq. 3 yielded a much lower r² (minimum 0.40), which could be attributed to a poor correspondence between observed and fitted values for the largest observed SY. Therefore, all distribution fits with an r² < 0.70 were excluded from the Monte Carlo simulations (n = 24). The r² values for the 202 remaining catchments ranged between 0.70 and 0.99 (median r² = 0.92, average r² = 0.90). A very strong negative correlation between the k parameter of Eq. 3 and the CVSY of the considered time series was observed (k = 1.06 × CVSY^(−0.94); r² = 0.95; n = 202).
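One way to obtain maximum likelihood estimates of λ and k for Eq. 3 is sketched below. The numerical scheme (bisection on the profile-likelihood equation for k) is an assumption made here for illustration and is not necessarily the routine used in this study. The synthetic sample at the end only serves to check that known parameters are recovered.

```python
import math
import random
from typing import Sequence, Tuple


def fit_weibull_mle(values: Sequence[float]) -> Tuple[float, float]:
    """Return (lam, k), the maximum likelihood estimates for Eq. 3."""
    if len(values) < 2 or min(values) <= 0:
        raise ValueError("need at least two strictly positive values")
    x_max = max(values)
    if min(values) == x_max:
        raise ValueError("all values are identical; k cannot be estimated")

    # Rescale to (0, 1] so that y**k cannot overflow; k is scale invariant.
    y = [v / x_max for v in values]
    log_y = [math.log(v) for v in y]
    mean_log = sum(log_y) / len(y)

    def score(k: float) -> float:
        # Profile-likelihood equation for k; it increases with k and its root is the MLE.
        w = [v ** k for v in y]
        weighted = sum(wi * li for wi, li in zip(w, log_y)) / sum(w)
        return weighted - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:      # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(100):        # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = x_max * (sum(v ** k for v in y) / len(y)) ** (1.0 / k)
    return lam, k


# Synthetic check with hypothetical parameters (scale 200, shape 1.5).
rng = random.Random(42)
sample = [rng.weibullvariate(200.0, 1.5) for _ in range(5000)]
lam_hat, k_hat = fit_weibull_mle(sample)
```

With only 30 to 58 annual values per catchment, the estimates are far less precise than in this synthetic check, particularly for the upper tail.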

In a next step, the fitted k and λ parameters were used to randomly generate a generic series of 10,000 annual SY observations for each of the 202 catchments. This was done by inverting Eq. 3 to:

$$ S{Y_p} = \lambda {\left( { - \ln (p)} \right)^{{ \frac{1}{k} }}} $$
(4)

where SY p is the simulated annual SY corresponding with a randomly picked number p between zero and one. Note that this approach assumes that all annual SY values are independent of each other, which is in reality not always the case. However, this simplification is justified since we mainly aim to assess the variability of actual SY values, regardless of the trends or mechanisms behind this variability.
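Eq. 4 amounts to inverse transform sampling, as in the following sketch. Here p plays the role of 1 − F(x) in Eq. 3, which makes no difference for a uniformly distributed random number. The function name and the parameter values in the example are hypothetical.

```python
import math
import random
from typing import List


def simulate_annual_sy(lam: float, k: float, n: int, rng: random.Random) -> List[float]:
    """Draw n independent annual SY values from a Weibull distribution (Eq. 4)."""
    series = []
    for _ in range(n):
        p = 1.0 - rng.random()  # uniform on (0, 1]; avoids log(0)
        series.append(lam * (-math.log(p)) ** (1.0 / k))
    return series


# Hypothetical parameters; with k = 1 the long-term mean equals lam.
simulated = simulate_annual_sy(lam=100.0, k=1.0, n=10_000, rng=random.Random(1))
```

Each value is drawn independently, so any serial correlation or trend present in a real record is not reproduced.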

Based on this generated SY p series, potential relative errors of the SYmean were simulated by randomly picking 50 observations out of the simulated series and (similarly to Eq. 2) calculating the relative error as a function of the simulated measuring period:

$$ {\text{Relative}}\,{\text{error}}\,{\text{S}}{{\text{Y}}_{{{\text{mean,}}i}}} = \left( {{\text{S}}{{\text{Y}}_{{{\text{mean}},i}}}-{\text{S}}{{\text{Y}}_{{{\text{mean}},{\text{LT}}}}}} \right)/{\text{S}}{{\text{Y}}_{{{\text{mean}},{\text{LT}}}}} $$
(5)

where SYmean,i is the average of the i first values of the 50 randomly picked SY p values (1 ≤ i ≤ 50) and SYmean,LT is the average of all 10,000 SY p values. For each catchment, this procedure was repeated 1,000 times, which resulted in 1,000 simulated relative errors for each MP. The median of these 1,000 simulated relative errors corresponds with the median relative error that can be expected for a given MP, while the highest and lowest 2.5% of the simulated relative errors indicate the boundaries of the 95% confidence interval of the actual relative error.
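The Monte Carlo procedure for a single catchment can be sketched as follows. The function names, the fixed seed and the choice to draw the 50 values without replacement are assumptions made here; the percentile method of `statistics.quantiles` may also differ slightly from the one used in this study.

```python
import math
import random
import statistics
from typing import List, Tuple


def simulate_relative_errors(lam: float, k: float, max_mp: int = 50,
                             n_long: int = 10_000, n_repeats: int = 1_000,
                             seed: int = 0) -> List[List[float]]:
    """Return, for each MP from 1 to max_mp, the simulated relative errors (Eq. 5)."""
    rng = random.Random(seed)
    # Long synthetic record generated with Eq. 4.
    long_series = [lam * (-math.log(1.0 - rng.random())) ** (1.0 / k)
                   for _ in range(n_long)]
    sy_mean_lt = sum(long_series) / n_long

    errors_by_mp: List[List[float]] = [[] for _ in range(max_mp)]
    for _ in range(n_repeats):
        picks = rng.sample(long_series, max_mp)  # assumption: without replacement
        running_total = 0.0
        for i, value in enumerate(picks, start=1):
            running_total += value
            errors_by_mp[i - 1].append((running_total / i - sy_mean_lt) / sy_mean_lt)
    return errors_by_mp


def summarise(errors: List[float]) -> Tuple[float, float, float]:
    """Return the 2.5% quantile, the median and the 97.5% quantile."""
    cuts = statistics.quantiles(errors, n=40)  # 39 cut points in steps of 2.5%
    return cuts[0], statistics.median(errors), cuts[-1]


# Hypothetical catchment with lam = 100 and k = 1.
errors = simulate_relative_errors(lam=100.0, k=1.0)
low_1, median_1, high_1 = summarise(errors[0])      # MP = 1 year
low_50, median_50, high_50 = summarise(errors[49])  # MP = 50 years
```

For a positively skewed distribution such as this one, the median error at short MPs is negative and the interval is asymmetrical, because most short records miss the rare high-yield years.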

3 Results

3.1 Magnitude and controlling factors of the inter-annual variability of sediment yield

Figure 4 depicts the observed inter-annual variability of SY, annual runoff discharge and their corresponding exceedance probabilities, based on all available SY and runoff data. Coefficients of variation for SY (CVSY) range between 6% and 313% (median 75%), while 95% of the catchments have a CVSY between 29% and 230%. CVs of annual runoff depth (CVRunoff) are generally lower, with values ranging between 4% and 278% (median 37%).

Fig. 4
Exceedance probability of the coefficient of variation (CV) for all available annual sediment yield (SY) and runoff depth (Runoff) time series

Figure 5 plots the observed CVSY against the average SY value for the entire MP (SYmean) for each catchment. A weak but significant positive relationship (r² = 0.13, p < 10⁻⁴) was found, indicating that catchments with a large SYmean tend to have a larger inter-annual variability of SY compared to catchments with a small SYmean. No significant relationship was found between CVSY and catchment area (r² = 0.01, p = 0.35; Fig. 6). Boxplots of the CVSY for different A ranges also indicate only small differences. Although a slight decrease in the median and mean CVSY can be noted with increasing A for catchments larger than 10 km², two-sided Wilcoxon rank sum tests between neighbouring boxplots (at a significance level of 5%) only indicated significant differences between catchments with A < 10 km² and catchments with 10 < A < 100 km² and between catchments with 10³ < A < 10⁴ km² and 10⁴ < A < 10⁵ km².

Fig. 5
Mean annual sediment yield (SY mean ) versus the coefficient of variation of the annual sediment yield values (CV SY )

Fig. 6
Left catchment area (A) versus the coefficient of variation of the annual sediment yield values (CV SY ). The regression is not significant at a significance level of 5%. Right boxplots of the CVSY for different ranges of catchment areas

No meaningful relationship was observed between SYmean and the corresponding average annual runoff depth (r² = 0.02, p = 0.0002). Nonetheless, a significant relationship was found between CVSY and the coefficients of variation of the corresponding annual runoff depths (r² = 0.35, p < 10⁻⁴; Fig. 7). P mean showed only a very weak negative correlation with CVSY (r² = 0.05, p < 10⁻⁴). However, a much stronger relationship was noted between the coefficient of variation of the annual rainfall depths (CVP) and CVSY (r² = 0.21, p < 10⁻⁴; Fig. 8). Note that for one catchment, the estimated CVP was exceptionally low, compared to the CVP of other catchments (see Fig. 8). As this catchment is located on Svalbard (Norway), a relatively data-poor region regarding climatic records, this CVP value was considered to be unreliable and the catchment was excluded from the regression analysis.

Fig. 7
Coefficient of variation of the annual SY values (CV SY ) versus the coefficient of variation of the corresponding annual runoff depth values (CV Runoff )

Fig. 8
Coefficient of variation of the annual sediment yield values (CV SY ) versus the estimated coefficient of variation of annual rainfall of each catchment (CV P ). The value indicated by a cross was excluded from the regression, since its corresponding CVP was expected to be unreliable (see text)

The relationships between the CVSY and various catchment characteristics for the 65 selected sub-catchments of the Siret basin (see Fig. 1) are shown in Fig. 9. Two of the catchments were known to be affected by reservoirs in their upstream area and were therefore not included in the regression analyses. The results indicate no clear trend between CVSY and catchment area, the average catchment slope or the fraction of arable land. A strong negative correlation was found between the average slope and the fraction of arable land in each catchment (r² = 0.81, p < 10⁻⁴). Furthermore, no significant relationship was found between SEM and SYmean, while a weak but significant positive correlation was observed between CVSY and SEM (r² = 0.10, p = 0.01).

Fig. 9
Relationships between the coefficients of variation of annual sediment yield (CVSY) and various catchment characteristics for 65 sub-catchments of the Siret basin in Romania (see Fig. 1). The two catchments indicated with an x are affected by dams and were not included in the regression analyses. All regressions are therefore based on 63 observations (representing 2,477 catchment-years of observations). A catchment area, Slope estimated average slope of the catchment, Arable land estimated areal fraction of arable land, SEM estimated average sheet and rill erosion rate according to the map of Cerdan et al. (2010), SY mean average annual sediment yield for the whole measuring period

3.2 Relative errors on average sediment yields due to inter-annual variability

Figure 10 displays the relative error of the average SY for a given MP (SYmean,i ) as calculated by Eq. 2 for all catchments with at least 30 years of SY observations (n = 226). As this figure shows, relative errors for SYmean values based on short MPs can be very large. The negative median error further indicates that SYmean values based on a short MP underestimate the long-term average value. However, many of the time series show a very similar evolution of the relative errors as a function of the MP. This can be explained by the fact that many of the catchments are spatially clustered (see Fig. 1) and were monitored during the same period.

Fig. 10
Relative error of the average annual sediment yield (SY) for the indicated measuring period (Relative error SYmean,i ) versus the measuring period (MP) for all catchments with at least 30 years of annual SY observations (n = 226). Relative errors were calculated according to Eq. 2. Each grey line represents one catchment. The black lines correspond to the 2.5%, 50% and 97.5% quantiles of the relative errors for the indicated MP, based on all considered catchments

The results of the Monte Carlo simulations are displayed in Fig. 11. Each boxplot in the top figure displays the distribution of the simulated median error for a given MP for all catchments (see Section 2.4), while the boxplots in the lower figure display the corresponding lowest and highest 2.5% of the simulated errors and indicate a 95% probability interval of the relative errors. Similar to Fig. 10, median relative errors are generally negative, while the 95% probability intervals of the errors are generally asymmetrical.

Fig. 11
Simulated relative errors on the average annual sediment yield (SY) for the indicated measuring period (MP; see Eq. 5). Each boxplot in the top figure displays the distribution of the expected median relative errors for all catchments (n = 202; see text), corresponding to the indicated MP. The boxplots in the lower figure display the corresponding distribution of the lowest and highest 2.5% of the simulated errors and indicate a 95% probability interval of the relative errors

4 Discussion

4.1 Reliability of the obtained results and trends behind the observed variability

Some sources of uncertainty should be considered before discussing the observed results. First, the data of this study were collected by a range of measuring methods, which induce different degrees of uncertainty. For example, the timing and frequency of suspended sediment sampling and the extrapolation procedure used may have a large impact on the obtained yearly SY values. Annual SY values based on low-frequency sampling strategies often underestimate the actual sediment export because important events may be missed. On the other hand, if one of the few sampling periods coincides with a large event, the importance given to this event may be too high. On some occasions, this can lead to an overestimation of the annual SY value. Uncertainties due to low sampling frequencies can be high for SY data that were calculated with discharge-weighted average concentration methods (e.g. Moatar et al. 2006). However, SY data derived from Q–SSC relationships may also be subject to similar errors (e.g. Walling and Webb 1981; Phillips et al. 1999). Furthermore, Q–SSC relationships are not unique but may show important variations within and between events, seasons and years. SY estimations based on this procedure therefore strongly depend on the representativeness of the rating curves used (e.g. Walling and Webb 1981; Phillips et al. 1999; Asselman 2000; Morehead et al. 2003). Since these uncertainties affect the annual SY values, they also affect the inter-annual variations considered in this study. Unfortunately, insufficient data were available to assess the importance of these uncertainties on our results.

Another important source of uncertainty for the results obtained is the presence of dams and reservoirs in some catchments. As indicated by various studies, the numerous dams and reservoirs constructed over the last few decades have had a large impact on the sediment flux of many drainage basins, especially in Europe and the USA (e.g. Vörösmarty et al. 2003; Walling 2006). Although at least 187 of the 726 catchments considered in this study are known to be unaffected by dams or reservoirs, this information is not available for the other catchments. Most likely, a considerable fraction of the other 539 catchments have reservoirs in their upstream area.

The presence of reservoirs is generally expected to decrease the SY, as a proportion of the sediments are trapped within the reservoirs (e.g. Vörösmarty et al. 2003; Walling and Fang 2003). However, examples have also been reported of reservoirs having no clear impact on monitored SY (e.g. Phillips 2003). Bogen and Bønsnes (2005) even noted an increase in sediment loads downstream of hydropower works in the Beiarelva catchment (Norway). Reservoirs may not only influence the SY but also the variability in SY. However, the impact of reservoirs on the temporal variability of sediment loads is difficult to assess as this depends on the catchment characteristics, the location of the reservoir in the catchment, its operating characteristics and the point in the MP of the SY record at which the reservoir starts functioning. For example, the construction of a reservoir may result in a significant decrease in both the SY and the CVSY of a catchment. However, if the catchment was already monitored before the construction of the reservoir, the sudden decrease in SY may lead to an increase of the CVSY for the complete MP of the SY record.
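The last point can be made concrete with a small numerical sketch. The annual SY values below are hypothetical and were chosen only for illustration. The helper `cv` is likewise an assumption: it computes the coefficient of variation as the sample standard deviation divided by the mean.

```python
import statistics


def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical annual SY values (t km-2 year-1): 10 years before and
# 10 years after a reservoir starts trapping sediment.
pre_dam = [100, 140, 60, 120, 80, 100, 150, 50, 110, 90]
post_dam = [20, 24, 16, 22, 18, 20, 25, 15, 21, 19]

cv_pre = cv(pre_dam)
cv_post = cv(post_dam)
cv_full = cv(pre_dam + post_dam)  # CV over the complete measuring period

print(f"CV before reservoir: {cv_pre:.2f}")
print(f"CV after reservoir:  {cv_post:.2f}")
print(f"CV of full record:   {cv_full:.2f}")
```

With these illustrative numbers, the CV is roughly 32% before and 16% after the reservoir starts functioning, but about 78% for the complete record, because the step change in the mean adds to the year-to-year scatter.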

Although the results of this study are mainly expressed in terms of coefficients of variation, the inter-annual variability of SY behind these coefficients is not always merely a result of random variations. Fluctuations in sediment loads can be the result of specific trends or cycles related to land-use changes, tectonic events, climatic changes, temporal storage of sediments in channels or floodplains or the occurrence of extreme events (e.g. leading to significant changes in channel morphology) during the MP (e.g. Morehead et al. 2003; Walling et al. 1998; Walling 2006; Ouimet et al. 2007; Tote et al. 2011).

Despite these uncertainties, the large number of catchments included in the database and the wide range of their environmental characteristics can be expected to provide a representative sample of the measured variability in sediment fluxes under current conditions in the USA and significant parts of Europe and the Middle East (see Fig. 1). Whereas the aim of this study is mainly to quantify the magnitude of inter-annual variability of measured SY values and to explore the importance of potentially controlling factors, understanding the mechanisms and processes behind this variability requires a more in-depth and catchment-specific approach. A significant part of this variability may be driven by anthropogenic impacts. However, as humans are currently one of the most important geomorphic agents (e.g. Hooke 2000; Vörösmarty et al. 2003; Cendrero et al. 2006; Kareiva et al. 2007), their impact on the inter-annual variability of SY should also be considered.

4.2 Magnitude and controlling factors of inter-annual variability of sediment yield

The results of this study show that inter-annual variability in SY varies significantly between catchments (see Fig. 4). As discussed in Section 1, catchment area, climatic variability and land use may be expected to control this variability. However, our results show very little evidence of this.

No meaningful decrease in CVSY with increasing catchment area could be detected (see Fig. 6). The variability of SY for catchments <10 km2 appears to be even smaller than that for larger catchments. This may be attributable to the fact that 14 out of the 27 catchments in this group are located in the boreal region of Sweden and Finland, where the inter-annual variability was noted to be low compared to other regions. However, even for catchments within the same river basin, no clear trend between A and CVSY was found (see Fig. 9a). Our results therefore concur with the findings of Olive and Rieger (1992) and indicate that catchment area has no strong control on the inter-annual variability of SY.

Likewise, no significant relationship was found between the fraction of arable land and the CVSY for the sub-catchments in the Siret Basin (see Fig. 9c). This may be explained by the very strong negative correlation between the average catchment slope and the areal fraction of arable land of the catchments (see Fig. 9e). Catchments with a high proportion of arable land are therefore not necessarily more sensitive to erosion. The estimated sheet and rill erosion rates (SEM) consider the combined effect of land use, topography and soil characteristics (Cerdan et al. 2010). Hence, they provide a more integrated measure of this erosion sensitivity. Indeed, a weak (but significant) positive relationship was found between the average SEM and CVSY of the selected catchments (see Fig. 9d). This suggests that the presence of erosion-prone land-use conditions may indeed contribute to the inter-annual variability of SY. However, the relationship is too weak to draw any hard conclusions.

Although our results confirm the results of previous studies indicating no clear relationship between P mean and CVSY (e.g. Walling and Kleo 1979), a clearly significant positive correlation between CVP and CVSY was found (see Fig. 8). This shows that inter-annual variability in SY is indeed to some extent controlled by climatic variability. Nonetheless, the observed relationship is weak. More detailed rainfall data corresponding to the actual period of SY measurements may further improve this relationship. However, the relationship between CVSY and CVRunoff (see Fig. 7) is also relatively weak. As these CVRunoff values are based on measured data corresponding to the SY measuring period, this indicates that climatic variability at an annual time scale also has only a limited influence on the inter-annual variability of SY.

The generally poor correlations between CVSY and various catchment characteristics mainly illustrate that a large part of the observed variation cannot be accounted for with the indicators considered. An important part of the observed scatter may be attributed to the unknown trends and uncertainties discussed in Section 4.1. Furthermore, other factors that were not considered in this study due to a lack of data (e.g. catchment geology, distance to potential sediment sources) may also explain some of the observed variance. In addition, it is well known that in many cases, the largest fraction of the annual SY is transported during one or a few large flood events (e.g. Markus and Demissie 2006; Vanmaercke et al. 2010), while several case studies illustrate the impact of an exceptional climatic or tectonic event on average SY for longer time periods (e.g. Beylich and Gintz 2004; Bogen 2004; Hovius et al. 2011; Tote et al. 2011). As a result, inter-annual variability in SY is probably also largely controlled by the occurrence of high-magnitude events with a long recurrence interval. This would also concur with the observation that SY time series are generally positively skewed (see Fig. 3).

Apart from the fact that predicting the occurrence of such events is generally very difficult, assessing the effects of these events on SY requires a detailed and spatially explicit understanding of the potential sediment sources and sinks in the catchment. Such knowledge is not available for most catchments. As a result, the inter-annual variability in catchment SY remains very difficult to predict.

4.3 Uncertainties on average sediment yields due to inter-annual variability

As discussed in the previous section, inter-annual variability in SY varies greatly between catchments and correlates only poorly with the catchment characteristics considered. As a result, relative errors for average SY values also vary greatly between the catchments and are difficult to predict. Nevertheless, important conclusions can be drawn from the simulated relative errors per MP, based on all catchments with a sufficiently long time record. Both Figs. 10 and 11 clearly illustrate that relative errors for average SY values can be very large for short MPs and are generally asymmetrical. Due to their strong auto-correlation and time-specific nature, the calculated errors of Fig. 10 cannot be used to assess the uncertainty for SYmean values for other catchments. However, the Monte Carlo simulation approach discussed in Section 2.4 avoids these problems and allows the assessment of the order of magnitude of these uncertainties.

The generally negative median relative errors (see Fig. 11 top) imply that average SY values based on short measuring periods are generally more likely to underestimate the actual long-term mean. This underestimation of SY can be considerable, especially for short measuring periods (<5 years). For example, for the median catchment of our Monte Carlo simulations, a SY value obtained after 1 year of monitoring had a 50% probability of underestimating the long-term mean by about 22% or more. However, these underestimations decrease quickly with increasing MP: after 5 years of monitoring, the median underestimation can be expected to be smaller than 5%.
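The link between positive skewness and a negative median error can be illustrated with a minimal simulation sketch. It is not the procedure of Section 2.4: it assumes, purely for illustration, that annual SY values are independent draws from a lognormal distribution with a long-term mean of 1 and a CV of 75% (the median CVSY in our database). The function name `median_relative_error` and its parameters are hypothetical.

```python
import math
import random
import statistics


def median_relative_error(mp_years, cv=0.75, n_runs=20000, seed=1):
    """Median relative error of the mean SY over `mp_years` years.

    Assumption (illustrative only): annual SY values are independent and
    lognormally distributed, with a long-term mean of 1 and the given CV.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    mu = -0.5 * sigma ** 2  # ensures exp(mu + sigma^2 / 2) = 1
    errors = []
    for _ in range(n_runs):
        total = sum(rng.lognormvariate(mu, sigma) for _ in range(mp_years))
        short_term_mean = total / mp_years
        errors.append(short_term_mean - 1.0)  # relative to a long-term mean of 1
    return statistics.median(errors)


for mp in (1, 5, 20):
    error = median_relative_error(mp)
    print(f"MP = {mp:2d} years: median relative error = {100 * error:.1f}%")
```

Under this lognormal assumption, the median of a single annual value equals the mean divided by sqrt(1 + CV^2), i.e. 0.8 times the mean for a CV of 75%. This gives a median underestimation of 20% for a 1-year MP, which is of the same order as the value reported above, and the bias shrinks as more years are averaged.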

Also, the 95% error ranges for average SY values are generally not symmetrical (see Fig. 11 bottom). For a median catchment where SY was monitored for 1 year only, the relative error for the long-term mean has a 95% probability of varying between about −95% and +209%. When we solve Eq. 5 for SYmean,LT, this means that when a SY of 100 t km−2 year−1 is measured during 1 year, the 95% confidence interval on the actual long-term SYmean ranges between 32.4 and 2,000 t km−2 year−1. These numbers not only illustrate the large uncertainties for average SY values derived from short measuring periods but they also show the importance of considering the generally skewed nature of SY time series when assessing these uncertainties. The median catchment in our database has a CVSY of 75%. Calculating the corresponding uncertainties for the same SY value and MP using Gaussian statistics yields a clearly different 95% confidence interval of about −50 (hence zero) to 250 t km−2 year−1 and fails to identify the generally larger probability of underestimating the long-term mean SY.

Similarly to the median errors, the 95% probability range of relative errors quickly decreases as the MP increases (see Fig. 11). For the same median catchment, the relative error has a 95% probability of ranging between −45% and +57% after 10 years of monitoring. This implies that if 10 years of monitoring results in an average SY of 100 t km−2 year−1, the 95% confidence interval for the actual long-term SYmean would range between 64 and 182 t km−2 year−1.

As can be seen in Fig. 11, errors decrease relatively little when measuring periods increase further. For example, the relative error on the mean SY after 50 years of monitoring for the same median catchment may still be expected to range between −22% and +24%, corresponding to a 95% confidence interval of 81 to 128 t km−2 year−1 for a mean value of 100 t km−2 year−1. This finding agrees relatively well with the studies of Day (1988) and Summer et al. (1992), who suggested that continuing measurements after 10 years contributes only little to the reliability of the measured average SY. Nevertheless, the error ranges indicated in Fig. 11 clearly illustrate that the uncertainties on SYmean values due to inter-annual variability may still be considerable, even after a MP of 50 years.
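The confidence intervals quoted in this section can be reproduced with a short sketch. It assumes that the relative error is defined as (SYmean,MP − SYmean,LT) / SYmean,LT, so that SYmean,LT = SYmean,MP / (1 + relative error). This form is inferred from the numbers quoted above rather than copied from Eq. 5, and the function name `long_term_interval` is hypothetical.

```python
def long_term_interval(sy_measured, err_low, err_high):
    """Convert a range of relative errors into an interval for the long-term mean.

    Assumes relative error = (measured mean - long-term mean) / long-term mean,
    so that long-term mean = measured mean / (1 + relative error).
    The largest positive error gives the lower bound and vice versa.
    """
    lower = sy_measured / (1.0 + err_high)
    upper = sy_measured / (1.0 + err_low)
    return lower, upper


# 95% relative error ranges for the median catchment, as quoted in the text
# (expressed as fractions), keyed by measuring period in years.
error_ranges = {1: (-0.95, 2.09), 10: (-0.45, 0.57), 50: (-0.22, 0.24)}

for mp, (err_low, err_high) in error_ranges.items():
    lower, upper = long_term_interval(100.0, err_low, err_high)
    print(f"MP = {mp:2d} years: {lower:.1f} to {upper:.1f} t km-2 year-1")
```

Because the measured value is divided by (1 + relative error), a symmetric range of relative errors would still yield an asymmetric interval for the long-term mean, and the interval widens sharply as the lower error bound approaches −100%.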

4.4 Comparison with other sources of error and implications

The uncertainties for SYmean values discussed in the previous section are large. Especially for shorter measuring periods, the potential error due to inter-annual variability may be at least as important as other sources of errors for measured SY data. Based on a detailed dataset of 21 small flood-retention ponds in Belgium, Verstraeten and Poesen (2002) estimated that measuring errors for SY values derived from pond siltation rates ranged between 40% and 50%. For SY values derived from measurements at gauging stations, the total measuring error is often difficult to assess, as this depends on various factors such as the sampling frequency (e.g. Walling and Webb 1981; Phillips et al. 1999; Moatar et al. 2006), the sampling regime (e.g. at regular time intervals or on a flow-proportional basis; Steegen et al. 2000), the sampling location in the river cross section (e.g. Steegen and Govers 2001) and errors for the runoff discharges (e.g. Lohani et al. 2006). However, uncertainties for SYmean values due to inter-annual variability for short measuring periods (<5 years) may be at least as important as these other individual sources of error. For example, Moatar et al. (2006) estimated for 36 rivers in Europe and the USA the effect of infrequent sampling on the reliability of SY values calculated with the discharge-weighted sediment concentration method. They found median relative errors ranging between 0% and −55%, depending on the catchment. This range of error compares relatively well with our median relative errors when SY is monitored for only 1 year (see Fig. 11).

At an intra-annual scale, infrequent but large events are known to strongly control annual SY values (e.g. Markus and Demissie 2006; Gonzalez-Hidalgo et al. 2009; Vanmaercke et al. 2010). SY measurements which are based on relatively low sampling frequencies therefore have a larger probability of underestimating the actual sediment load, as they are more likely to miss the largest events (e.g. Webb et al. 1997; Moatar et al. 2006). Similarly, SYmean values based on a short MP have a relatively larger probability of underestimating the long-term SYmean. However, whereas underestimations due to low sampling frequencies generally decrease with increasing A (Webb et al. 1997; Moatar et al. 2006), no indications were found that inter-annual variability is also controlled by catchment scale (see Fig. 6).

The results of this study show that average SY values based on short measuring periods should be interpreted with great caution. Not only are they subject to very large ranges of uncertainty, they may also be expected to be biased and underestimate the long-term average SY. Nonetheless, SYmean values based on short measuring periods are common. An extensive review of available SY data in Europe indicated that about half of the 1,287 average SY values measured at gauging stations have a MP <10 years, while 31% of the data have a MP <5 years (Vanmaercke et al. 2011b). SY data derived from reservoir surveys generally cover longer measuring periods. The same literature review indicated that about half of the available 507 average SY values, derived from reservoir surveys, had a MP >30 years, while only 15% had a MP <10 years (Vanmaercke et al. 2011b). SYmean values derived from reservoir surveys with a sufficiently high trapping efficiency also include bedload and are not susceptible to errors due to infrequent sampling. Therefore, they may be expected to give a higher but more accurate estimate of the total SYmean, compared to SYmean values measured at gauging stations.

As indicated in Fig. 11, uncertainties due to inter-annual variability quickly decrease after the first few years of monitoring. However, even after 50 years of monitoring, uncertainties are still considerable. This has important implications for our understanding of sediment dynamics and our attempts to model long-term SYmean values. Many models succeed relatively well in predicting average sediment fluxes based on constant catchment characteristics (e.g. Syvitski et al. 2005; de Vente and Poesen 2005). However, the relatively high degrees of uncertainty for measured SYmean data, even after long measuring periods (see Fig. 11), imply that it is impossible for a steady-state model to account for all observed variation in SY, regardless of the quality of the model, input data or SY measurements. As runoff and rainfall variability control this temporal variability to some extent (see Figs. 7 and 8), models that explicitly consider this variation in runoff and/or climate may be expected to perform better. However, such models face other important challenges, related to accurate process descriptions and input data requirements. Several reviews have indicated that temporally (and spatially) explicit process-based models generally do not perform better than temporally lumped empirical models (e.g. Merritt et al. 2003; de Vente et al. 2008; Govers 2011).

5 Conclusions and recommendations

This study aimed to quantify the magnitude of inter-annual variability of annual SY data, identify factors that control this variability and assess the effect of this variability on the reliability of measured average SY data. Limitations on the available data made it difficult to fully address these research objectives. Firstly, large numbers of SY time series were only available for a few regions, while for many other regions, little or no data were available. Also, the SY data considered were collected by various methods which are subject to various sources of uncertainties. These uncertainties also affect the observed inter-annual variability in SY, but their effect on our results could not be assessed. Furthermore, inter-annual variability may be attributable to specific trends, related to the construction of dams and reservoirs, land-use changes or climatic changes. Identifying such trends was impossible due to a lack of data and was out of the scope of this research. However, the results of this study are based on a vast dataset of 15,025 annual SY observations for 726 catchments, representing a wide range of catchment areas, SY values, environmental conditions and human impacts. Therefore, the results of this study are expected to be representative of the inter-annual variability of measured contemporary suspended sediment fluxes in the USA and major parts of Europe and the Middle East.

Analyses of this large dataset revealed that inter-annual variability in SY is generally considerable, but may differ greatly between catchments (CVSY values ranged between 6% and 313% with a median of 75%). The factors controlling this variability could only partly be identified. Variations in annual runoff, as well as rainfall depth, explain part (<35%) of the observed variability of SY, while catchment area seems to be of very little importance. Our results remain inconclusive about the potential impact of land use on the inter-annual variability of SY, but indicate only a weak correlation.

The difficulties in explaining the inter-annual variability in SY between catchments may be partly attributable to the limitations mentioned above. Furthermore, other factors that were not considered in this study (e.g. catchment geology, distance to potential sediment sources) may contribute to some of the observed variation in inter-annual variability. Nonetheless, the results of this and other studies suggest that the inter-annual variability of SY in a catchment is also strongly controlled by the occurrence of extreme events. Further attempts to quantify the inter-annual variability may therefore benefit strongly from a better understanding of the sensitivity of catchments to such extreme events.

Our lack of understanding about the factors controlling inter-annual variability in SY impedes our assessment of catchment-specific uncertainty ranges on average SY values as a function of their MP due to this inter-annual variability. However, it was possible to calculate statistically robust ranges of these uncertainties, based on the large dataset of time series considered in this study. Figure 11 can therefore be used to estimate the expected range of uncertainty for SY values with a known MP.

In general, the results of this study show that uncertainties for average SY values due to inter-annual variability can be very large, especially when the MP is short (<5 years). Furthermore, these uncertainties are asymmetrical: average SY values tend to underestimate the actual long-term mean SY when measuring periods are short. This asymmetry, which follows from the skewed nature of SY time series, has generally been neglected in previous studies. The reliability of average SY quickly improves after the first few years of monitoring. However, uncertainty ranges can remain considerable, even after 50 years of monitoring. These uncertainties should therefore be considered in models aimed at predicting the average SY, as well as in decision-making strategies for sediment management.