1 Introduction

Understanding suspended sediment fluxes in the critical zone is crucial for soil and water resource management (Brils 2008). As the critical zone, i.e., “the thin layer of the Earth’s terrestrial surface and near-surface environment that ranges from the top of the vegetation canopy to the bottom of the weathering zone” (Guo and Lin 2016), is increasingly perceived as a highly dynamic entity, a better understanding of the prevailing processes can only be achieved with an interdisciplinary approach and by considering spatial and temporal variability (Brantley et al. 2017). Knowledge of sediment provenance in catchments that are prone to erosion is therefore important for two reasons. Firstly, proposing best management practices requires applying the right erosion control measures to the right target areas in order to reduce soil loss within catchments or to ensure the good ecological status of water bodies as demanded by the European Water Framework Directive (Brils 2008; de Deckere et al. 2011; Perks et al. 2017). Secondly, improving our understanding of the processes responsible for sediment transport within the critical zone requires the capacity to analyze suspended sediment yields with descriptors other than the hydrograph and suspended sediment concentrations at a single point. Indeed, agreement between modeled and observed sediment concentrations at a given outlet does not ensure that the sedimentary dynamics within a catchment are well captured by the model, as various combinations or spatiotemporal patterns of processes can lead to the same downstream results. Thus, alternative strategies for model evaluation are needed (Cooper et al. 2012). In that sense, gaining knowledge of the spatial origin of sediment is of crucial interest.
This is all the more true in mesoscale catchments (10–1000 km2), in which complexity increases because the high spatiotemporal variability of the meteorological forcing is superimposed on the heterogeneity of soils and the morphology of watersheds. Some studies suggest high temporal variability of source contributions from the event scale (Poulenard et al. 2012; Legout et al. 2013; Cooper et al. 2015) to the decadal scale (Vercruysse et al. 2017), highlighting the need to determine sediment provenance at the best possible temporal resolution.

Several physicochemical properties of suspended sediment samples and their potential sources have been used as tracers or fingerprints, including radionuclides (e.g., Motha et al. 2003; Evrard et al. 2011, 2013; Ben Slimane et al. 2013; Palazón et al. 2016; Huon et al. 2017; Palazón and Navas 2017; Pulley et al. 2017), organic or inorganic geochemistry (e.g., Collins et al. 1997, 2010; Douglas et al. 2009; Evrard et al. 2011, 2013; Koiter et al. 2013; Cooper et al. 2014; Haddadchi et al. 2014; Laceby and Olley 2015; Du and Walling 2017; Huon et al. 2017), magnetic properties (e.g., Walling et al. 1979; Dearing et al. 1981, 1986, 2001; Maher 1986; Yu and Oldfield 1993), particle color (e.g., Martínez-Carreras et al. 2010a, 2010b; Legout et al. 2013; Brosinsky et al. 2014a, 2014b), or composite fingerprints that comprise several of these tracers. Quantitative values for these properties can be derived in different ways; for example, geochemistry can be measured with inductively coupled plasma mass spectrometry (ICP-MS) or derived from X-ray fluorescence (XRF) measurements. Measurement duration and costs can vary substantially between the different methods and can be a considerable limitation, especially when a high-frequency monitoring approach is needed. Besides being inexpensive and rapid, low-cost alternative fingerprinting methods have the advantage of being non-destructive, which is especially important for suspended sediment samples, for which often only small quantities are available. Mineral magnetic properties were among the first low-cost tracers used for the characterization and discrimination of soils and for tracing sediments on slopes and in watersheds from the sources to the deposits. Recently, they have been applied to track the dispersion of dredge-dumped sediment in the Bay of the Seine (Nizou et al. 2016), to trace wildfire-affected soils (Blake et al. 2006), and to discriminate sediment sources of different geology (Pulley and Rowntree 2016).
Color tracers obtained from diffuse reflectance spectroscopy have been successfully applied in alternative sediment fingerprinting studies and were shown to be able to discriminate between sources from different land use, geology, or depth (Martínez-Carreras et al. 2010a, 2010b; Legout et al. 2013; Brosinsky et al. 2014a, 2014b; Barthod et al. 2015; Pulley and Rowntree 2016). In such studies, XRF tracers have also been applied successfully (Motha et al. 2003; Cooper et al. 2014; Laceby and Olley 2015). Another advantage of low-cost methods is that they can be used as multi-tracer approaches allowing either an increase of the dimensionality of the data (Lees 1997; Small et al. 2004) or the cross-validation of innovative methods (Pulley and Rowntree 2016).

Several mixing models are available to quantify the contributions of different sources to suspended sediment samples. Chemical mass balance mixing models were the first mixing models applied in quantitative sediment fingerprinting studies (e.g., Peart and Walling 1986; Yu and Oldfield 1989; Walling et al. 1993) and are still widely used (e.g., Motha et al. 2003; Martínez-Carreras et al. 2010a, 2010b; Brosinsky et al. 2014a, 2014b). Bayesian mixing models are increasingly used in sediment fingerprinting studies (Koiter et al. 2013; Cooper et al. 2014; Nosrati et al. 2014, 2018; Barthod et al. 2015). A third approach is offered by partial least squares regression (PLSR), where the model is trained with artificial mixtures of known proportions (Poulenard et al. 2012; Legout et al. 2013).

While sediment fingerprinting approaches are becoming widely used, important challenges and uncertainties remain, due to both the selection of the tracer set and the approximations of the mixing models (Small et al. 2004; Smith et al. 2015). Source heterogeneity has been identified as a principal cause of error in mixing models (Pulley et al. 2017). Another important source of error is related to particle size selectivity during erosion and transport processes, which results in a violation of the assumption of conservative behavior of the tracers (Laceby et al. 2017). This is a known issue for spectrocolorimetry, as grain size is a physical chromophore (Ben-Dor et al. 1998), and was highlighted by Legout et al. (2013) and Pulley and Rowntree (2016) for color tracers and by Laceby et al. (2017) for more conventional sets of tracers. The assumption of conservative behavior of the tracers can also be challenged by biogeochemical alterations during temporary storage of the sediments in the riverbed or while they are suspended in the water (Legout et al. 2013; Vale 2016). As outlined by Martínez-Carreras et al. (2010a, b), Evrard et al. (2013), Pulley et al. (2015), Palazón and Navas (2017), and Nosrati et al. (2018), different sediment source proportions can also be obtained when different tracer (sub)sets or different composite fingerprints are used. Similarly contradictory results can be obtained when different mixing models are used (Haddadchi et al. 2013; Cooper et al. 2014; Laceby and Olley 2015; Nosrati et al. 2018). These findings suggest that fingerprinting approaches are highly sensitive to the tracers and mixing models used. Although previous studies have compared mixing models and tracer sets, these comparisons were limited to mixing models of the same type. Indeed, most mixing models are based on a mass balance approach and seek to solve the same overdetermined system of linear equations.
PLSR models, however, take a fundamentally different approach, predicting the source contributions with a regression model trained on artificial mixtures. Such models are often associated with alternative tracer sets such as spectral reflectance in the visible and infrared ranges (Poulenard et al. 2012; Legout et al. 2013). To the best of our knowledge, no study has compared the performance of such contrasting mixing model approaches applied with a set of low-cost tracers, either in terms of the various sources of error or in terms of the similarity of the predicted source proportions.

The overall objective of this study was to quantify the source contributions to suspended sediments in a mesoscale Mediterranean catchment. As the area is prone to intense rain events that can lead to flash floods (Braud et al. 2014; Nord et al. 2017), the hydrosedimentary processes can change significantly between and within events. This motivated the development of alternative low-cost fingerprinting methods, using tracers derived from two portable spectrometers (i.e., an X-ray fluorescence analyzer and a spectrocolorimeter), to resolve the spatial origin of the sediments within the catchment at a high temporal resolution. The specific questions addressed in this study were as follows: (i) whether low-cost tracers could discriminate between the major sources of suspended sediments; (ii) to what extent the predicted proportions of source materials differ between mixing models and tracer sets, including the associated errors; and (iii) how the source contributions varied between and within runoff events that occurred during the 2011–2017 period in a Mediterranean mesoscale watershed.

2 Methods

2.1 Study site

The 42.3-km2 Claduègne catchment and the nested 3.4-km2 Gazel catchment are research catchments within the Cévennes-Vivarais Mediterranean Hydrometeorological Observatory (OHMCV, Boudevillain et al. 2011, http://ohmcv.osug.fr), which is part of the French network of critical zone observatories (Gaillardet et al. 2018). These study sites are dedicated to investigating the meteorological and hydrosedimentary processes during heavy rain events and flash floods (Nord et al. 2017). The northern parts of both catchments (ca. 51% of the Claduègne and 23% of the Gazel catchment) are located on the volcanic Coiron plateau, a vast basaltic table formed during volcanic activity in the early Pliocene (Fig. 1). The pedology of the catchment is dominated by eutric and andic cambisols on basaltic rocks in the north and more or less developed calcareous soils in the south. The latter range from well-structured cultivated soils to rendzic leptosols, regosols, and fluvisols. A prominent feature of the Claduègne catchment is the presence of sedimentary badlands that are characterized by a lack of vegetation cover, steep slopes, and obvious signs of gully erosion. Even though they cover < 1% of the total catchment area (delineation on orthophotos of IGN France (2009)), they are usually adjacent to the hydrographic network and thus very well connected. The main land use types in the Claduègne catchment are forests (44%), agricultural zones (cultivated fields and vineyards, 24%), heaths (13%), and grasslands (10%). The cultivated fields and vineyards are temporarily bare and are assumed to be a distributed source of suspended sediment. Thus, three sources are assumed to contribute to the suspended sediment samples in the rivers: sedimentary badlands, bare soils on basaltic geology, and bare soils on sedimentary geology. The climate is affected by both oceanic and Mediterranean influences.
While the highest average daily precipitation is found at higher elevations, the highest hourly rainfall intensities are recorded in the plains (Molinié et al. 2012). At the outlets of the two catchments, two hydrosedimentary stations have continuously monitored liquid and solid fluxes since 2011 (Nord et al. 2017). Water level is measured with an H-radar at the Claduègne station and a hydrostatic pressure probe at the Gazel station at intervals of 10 and 2 min, respectively. Water level is converted to discharge with a stage–discharge relation based on ca. 20 discharge measurements for each station. At both stations, suspended sediment concentrations are additionally monitored with turbidimeters (Nord et al. 2017).

Fig. 1
Study site. a Location of the 56 sampling sites of the source soil samples in the Claduègne catchment (42.3 km2) and the nested Gazel catchment (3.4 km2). The red polygons show the outline of the badlands that were digitized from satellite and aerial images. The geology can be roughly subdivided into the basaltic Coiron plateau in the north and Mesozoic sedimentary rocks (mainly marly limestones) in the south. The land use data are based on QuickBird satellite images (Andrieu 2015). The class “permanently covered” comprises forests, permanent grasslands, and heaths. b Location of the Cévennes-Vivarais Mediterranean Hydrometeorological Observatory (OHM) in France. The small dot represents the location of the Claduègne catchment

2.2 Sampling

Source samples were collected at 56 locations in the Claduègne catchment (of which 21 are located in the nested Gazel catchment). The sampling locations were chosen for accessibility and to represent the main variability of land use and soil types within each of the three groups assumed to contribute to the suspended sediment samples in the rivers, i.e., sedimentary badlands, bare soils on basaltic geology, and bare soils on sedimentary geology. At each site, one to six subsamples were taken within a radius of ca. 5 m. They were not combined, so that small-scale heterogeneity could be assessed. In total, 178 subsamples were taken as surface scrapes of the top 3–5 cm with non-metallic shovels; 132 of them were taken in areas of the three potential sediment sources (Table 1).

Table 1 Numbers of samples measured and tests conducted for each analytic technique. Tracers in brackets are the ones that were discarded after the respective test

At the outlets of the Gazel and Claduègne catchments, suspended sediment samples are automatically taken every 10 and 40 min, respectively, once a threshold of turbidity and water level is exceeded. For this study, 145 and 179 samples collected during 13 events between 2011 and 2017 in the Claduègne and Gazel catchments, respectively, are considered. For 27 suspended sediment samples taken during five events in the Claduègne catchment, grain size distributions were measured with a laser diffraction sizer (Malvern Mastersizer 2000) after 10 min of sonication and stirring at maximum level in order to destroy aggregates (Grangeon et al. 2012).

Source samples and suspended sediment samples were dried for 24 h at 105 °C, gently crushed and sieved to the particle size fraction < 63 μm.

2.3 Measurements of tracer properties

2.3.1 Spectrocolorimetry

Color measurements were conducted using a portable diffuse reflectance spectrophotometer (Konica Minolta 2600d) that returns the reflectance spectra in the visible range between 360 and 740 nm in increments of 10 nm following Legout et al. (2013). For every sample, the measurement was repeated three times after turning or shaking the tube in order to account for heterogeneity within the sample. The influence of sample quantity on the color tracers was assessed by repeating the measurement after increasing the sample quantity in the box, and it was found that even sample quantities as low as 0.1 g barely influence the measurement.

From the raw spectral data, 15 color coefficients were calculated using the equations given in Commission Internationale de l’Eclairage (CIE) (1978) and used as color tracers. These comprise the three xyz chromaticity coordinates, three parameters each of the L*a*b* color space and the Hunter Lab color space, and two parameters each from the CIE 1976 UCS color diagram, the L*C*h* color space, and the L*u*v* color space.

2.3.2 XRF

The measurements were conducted with a portable Bruker Titan XRF analyzer. Using an internal calibration, the device automatically calculates the concentrations of Al2O3, SiO2, P2O5, K2O, CaO, TiO2, V, MnO, Fe2O3, Co, Ni, Cu, Zn, Rb, Sr, and Zr. To account for sample heterogeneity, the measurement was repeated three times for each sample, turning the sample support by 90° after each measurement.

For practical reasons (availability of the measuring device and the higher sample quantity of about 1 g needed for XRF), only a subset of samples could be measured with XRF (Table 1).

2.3.3 Magnetic susceptibility

Mineral magnetic properties were measured on a subset of 93 source samples and 126 suspended sediment samples at the CEREGE laboratory (Aix Marseille University). For all samples, the specific low-field magnetic susceptibility Χlf was obtained from measurements with an AGICO MFK1-FA Kappabridge susceptibilimeter at a frequency of 976 Hz. The measured susceptibility was normalized by sample weight. Χlf describes the ratio of the induced magnetization of a sample to the intensity of the magnetizing field. It is an indicator of the amount of ferromagnetic minerals (e.g., magnetite or hematite) present in a sample (Maher 1986; Nizou et al. 2016). Mineral magnetic susceptibility was only used in this study for a qualitative cross-validation of the results obtained with the other two tracer sets.

2.4 Tests of assumptions

The ability of the different tracers to discriminate between the three source groups was tested with a Kruskal–Wallis test and by conducting linear discriminant analysis (LDA) using all tracers derived from each measuring technique (spectrocolorimetry or XRF).

In order to test the linear additivity of the tracers, 81 artificial mixtures with known proportions of the three source groups were prepared. First, for each of the three sources, a composite sample was made from roughly equal contributions of many individual source samples from the respective group. Each composite was thoroughly mixed, and the artificial mixtures were then prepared by mixing these three poles in known proportions as proposed in Legout et al. (2013).

Spectrocolorimetric measurements were conducted on all the mixtures prepared as well as on the composite samples that represent the poles of 100% of any of the source classes. XRF measurements were only conducted on the three poles and four mixtures. The linear additivity of the tracer properties (15 color parameters for spectrocolorimetry tracers and 16 element concentrations for the XRF tracers) was quantified using the RMSE normalized with the mean of the measured tracer value.
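The linear additivity check described above can be sketched as follows. This is a minimal Python illustration and not the code used in the study: the pole values, mixture proportions, and measured values are hypothetical, and the function names are introduced here for illustration only.

```python
import math

def predicted_tracer(proportions, pole_values):
    """Tracer value expected under linear additivity: sum of p_k * pole_k."""
    return sum(p * v for p, v in zip(proportions, pole_values))

def normalized_rmse(measured, predicted):
    """RMSE between measured and predicted values, divided by the mean measured value."""
    n = len(measured)
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)
    return rmse / (sum(measured) / n)

# Hypothetical values of a single tracer measured on the three poles
poles = [60.0, 40.0, 50.0]
# Hypothetical mixtures (known proportions of the three poles) ...
mixtures = [(0.5, 0.5, 0.0), (0.2, 0.3, 0.5), (0.0, 0.5, 0.5)]
# ... and the tracer value measured on each mixture
measured = [50.5, 48.5, 45.5]

predicted = [predicted_tracer(p, poles) for p in mixtures]
nrmse_percent = 100.0 * normalized_rmse(measured, predicted)
print(f"nRMSE = {nrmse_percent:.2f}%")
```

A tracer that mixes linearly yields a small nRMSE; the threshold above which a tracer is rejected is a choice of the analyst.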

A range test was conducted for every tracer property to check whether the values of the suspended sediment samples fell within the range of the values measured for the source samples, in order to detect problems concerning incomplete source sampling, non-conservative behavior of the tracers, or a lack of linear additivity (Walden et al. 1997; Collins et al. 2017). A small tolerance of less than 5% of the mean of each tracer was applied, and tracers that did not pass the range test were excluded from the mixing models.
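A univariate range test of this kind might look like the sketch below. It assumes that the tolerance is computed from the mean of the source values of each tracer, which is one possible reading of the description; the tracer names and values are hypothetical.

```python
def passes_range_test(source_values, sediment_values, tolerance=0.05):
    """True if every sediment value lies within the source range,
    widened on both sides by tolerance * |mean of the source values|."""
    margin = tolerance * abs(sum(source_values) / len(source_values))
    lower = min(source_values) - margin
    upper = max(source_values) + margin
    return all(lower <= v <= upper for v in sediment_values)

# Hypothetical tracer values for source and suspended sediment samples
sources = {"L*": [40.0, 45.0, 60.0, 55.0], "a*": [2.0, 3.0, 4.0, 3.0]}
sediments = {"L*": [38.0, 50.0, 61.0], "a*": [3.0, 4.5]}

# Keep only the tracers for which all sediment samples pass the test
retained = [t for t in sources if passes_range_test(sources[t], sediments[t])]
print(retained)
```

In this toy data set, the second tracer has one sediment value outside the widened source range and would therefore be excluded.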

As outlined by Phillips et al. (2014), this univariate range test is a necessary but not a sufficient condition for mixing models to work. In order to ensure that the sediment samples can be represented as a combination of the sources, they also have to fall within the multi-dimensional convex hull spanned by the tracer values of the sources. Thus, a convex hull range test was additionally conducted by combining the tracer properties pairwise, determining the 2D convex hull spanned by the sources, and checking whether the sediment samples lie within the hull.
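For one pair of tracers, the 2D convex hull test can be illustrated with the following pure-Python sketch. It uses Andrew's monotone chain algorithm, which is a standard choice but not necessarily the one used in the study; the coordinates are hypothetical, no tolerance is applied, and the hull is assumed to have at least three vertices.

```python
def cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, point):
    """True if the point lies inside or on the boundary of the hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], point) >= 0 for i in range(n))

# Hypothetical source samples described by two tracers (x, y)
sources_2d = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
hull = convex_hull(sources_2d)

# Hypothetical sediment samples: the first lies inside the hull, the second outside
flags = [inside_hull(hull, s) for s in [(1, 1), (5, 1)]]
print(flags)
```

Repeating this check for every pair of tracers identifies the combinations for which some sediment samples cannot be expressed as a mixture of the sources.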

In order to assess the influence of particle size on the color tracers, we sieved some source samples from the Claduègne catchment (n = 14, 4 badlands, 5 basaltic soils, 5 sedimentary soils) to the size fractions > 500 μm, 200–500 μm, 100–200 μm, 63–100 μm, 40–63 μm, 20–40 μm, and < 20 μm. The spectrocolorimetry tracers were determined and compared for all these samples.

To evaluate the potential effect of biogeochemical alterations during transport and temporary storage, an in situ biogeochemical experiment was conducted as described by Legout et al. (2013). Four composite samples of the < 63 μm fraction of 4 to 11 individual samples from the same geology and land use (badland, cultivated fields on basalt, vineyard on sedimentary geology) were produced. Each one was divided into subsamples which were put into small bags of two layers of porous mesh with a mesh size of 20 μm. Each subsample contained about 1 g of material. All bags were immersed in the river in April 2017 and after immersion times of 1, 3, 7, and 22 days, two replicates of each composite sample were collected. No significant rainfall–runoff events occurred during the experiment. All subsamples were dried at 105 °C for 24 h, gently crushed and weighed to check for weight loss. Spectrocolorimetric measurements were conducted, and the influence of immersion time in the river on the color tracers was assessed by comparing the tracer values for the different immersion times.

2.5 Source quantification with mixing models

2.5.1 NNLS model

For every sediment sample, a system of linear equations based on a chemical mass balance can be set up as A × c = s, where A (n × m) is the source matrix, n is the number of tracers, and m is the number of sources; the matrix element ai,j gives the value of tracer i for source j. c is the unknown contribution vector that gives the contribution of each of the m sources to the respective suspended sediment sample, and s is the sediment sample vector that gives the measured values of the n tracers for that sample. As this system of linear equations is usually overdetermined and generally has no exact solution for c, it is solved approximately with the least squares method. In order to prevent the prediction of negative contributions, the model is constrained to non-negativity. The non-negative least squares (NNLS) algorithm implemented in the R function nnls{nnls} (Mullen and van Stokkum 2015) following Lawson and Hanson (1974) was used. Besides the constraint for non-negativity, the model can also be constrained so that the predicted contributions sum to 100%. In this study, this constraint was not applied, so that checking whether the contributions sum to approximately 100% could be used to detect problems in the fingerprinting approach.
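To make the non-negative least squares problem concrete, the sketch below solves it in pure Python for a small number of sources. It does not reproduce the Lawson and Hanson active-set algorithm used by the R function; instead, it enumerates every subset of sources, solves the unconstrained least squares problem on each subset, and keeps the non-negative solution with the smallest residual, which yields the same optimum when the source columns are linearly independent. The source matrix and the sample are hypothetical.

```python
from itertools import combinations

def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting
    (M is assumed to be square and non-singular)."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def nnls_small(A, s):
    """Minimise ||A c - s||^2 subject to c >= 0 by trying every subset of sources."""
    n_tracers, n_sources = len(A), len(A[0])
    best_c = [0.0] * n_sources             # all-zero solution as a fallback
    best_rss = sum(v * v for v in s)
    for k in range(1, n_sources + 1):
        for subset in combinations(range(n_sources), k):
            # Normal equations restricted to the sources in this subset
            AtA = [[sum(A[t][i] * A[t][j] for t in range(n_tracers)) for j in subset]
                   for i in subset]
            Ats = [sum(A[t][i] * s[t] for t in range(n_tracers)) for i in subset]
            sol = solve_linear(AtA, Ats)
            if min(sol) < 0:
                continue                   # violates the non-negativity constraint
            c = [0.0] * n_sources
            for idx, val in zip(subset, sol):
                c[idx] = val
            rss = sum((sum(A[t][j] * c[j] for j in range(n_sources)) - s[t]) ** 2
                      for t in range(n_tracers))
            if rss < best_rss:
                best_c, best_rss = c, rss
    return best_c

# Hypothetical source matrix: rows are tracers, columns are the three sources
A = [[10.0, 50.0, 90.0],
     [80.0, 20.0, 40.0],
     [30.0, 60.0, 10.0],
     [20.0, 30.0, 70.0]]
# Hypothetical sediment sample built as 60% / 30% / 10% of the three sources
s = [30.0, 58.0, 37.0, 28.0]

contributions = nnls_small(A, s)
print([round(c, 3) for c in contributions], round(sum(contributions), 3))
```

Because the sum of the contributions is not constrained, its deviation from 1 can serve as a diagnostic, in the spirit of the check described above. The enumeration is only practical for a handful of sources.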

2.5.2 Bayesian mixing model (SIMMR)

The Bayesian mixing model implemented in the R package simmr (stable isotope mixing models in R; Parnell 2016) was used. It calculates a large number (default: n = 10,000) of plausible solutions of source contributions to each sediment sample using Bayes’ theorem:

$$ P\left(A|B\right)=\frac{P\left(B|A\right)P(A)}{P(B)} $$
(1)

where the posterior P(A|B) is the contribution of a source to the sediment sample. The prior P(A) is an initial guess of the contribution, which is randomly drawn from the Dirichlet distribution. Thus, the source contributions are independent from each other but sum up to 100%. B is the supporting evidence for A, given by the measurements of the tracer properties for the sediment sample and the sources. The model is fitted with a Markov chain Monte Carlo (MCMC) algorithm that produces plausible solutions for each source’s contribution to each sediment sample (Parnell et al. 2010, 2013; Cooper et al. 2014). From these n realizations, the best estimate (mean or median) and an estimate of the uncertainty (standard deviation) can be derived.
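The Bayesian idea can be illustrated with a deliberately simplified Python sketch. It is not the simmr implementation: instead of MCMC, it draws proportions from a flat Dirichlet prior and weights each draw by a Gaussian likelihood (importance sampling), and it ignores the within-source standard deviations. The source means, the sample, and the residual standard deviation sd are hypothetical.

```python
import math
import random

def dirichlet_draw(n_sources, rng):
    """One draw from a flat Dirichlet prior: non-negative proportions summing to 1."""
    g = [rng.gammavariate(1.0, 1.0) for _ in range(n_sources)]
    total = sum(g)
    return [x / total for x in g]

def log_likelihood(props, source_means, sample, sd):
    """Gaussian log-likelihood of the sample given the proportions (constants dropped)."""
    ll = 0.0
    for j, obs in enumerate(sample):
        expected = sum(p * mu[j] for p, mu in zip(props, source_means))
        ll -= 0.5 * ((obs - expected) / sd) ** 2
    return ll

def posterior_mean(source_means, sample, sd=2.0, n_draws=20000, seed=1):
    """Posterior mean of the proportions, estimated by weighting prior draws."""
    rng = random.Random(seed)
    draws = [dirichlet_draw(len(source_means), rng) for _ in range(n_draws)]
    logw = [log_likelihood(p, source_means, sample, sd) for p in draws]
    top = max(logw)                       # subtract the maximum to avoid underflow
    w = [math.exp(l - top) for l in logw]
    total = sum(w)
    return [sum(wi * p[k] for wi, p in zip(w, draws)) / total
            for k in range(len(source_means))]

# Hypothetical mean tracer values: one row per source, one column per tracer
source_means = [[10.0, 80.0, 30.0],
                [50.0, 20.0, 60.0],
                [90.0, 40.0, 10.0]]
# Hypothetical sediment sample built as 60% / 30% / 10% of the three sources
sample = [30.0, 58.0, 37.0]

estimate = posterior_mean(source_means, sample)
print([round(p, 2) for p in estimate])
```

In this scheme, the weighted spread of the draws would play the role of the uncertainty estimate; a full MCMC sampler explores the posterior far more efficiently when there are many sources or tracers.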

2.5.3 PLSR mixing model

PLSR is a multiple linear regression method that is commonly used in chemometrics for predicting a dependent variable (response) from a set of predictor variables. Unlike other linear regression models, PLSR can deal with numerous, noisy, and highly correlated predictor variables that are therefore not independent of each other and potentially redundant (Wold et al. 2001).

Unlike the other two mixing models, the PLSR model is trained with artificial mixtures of known proportions of the possible sediment sources that were prepared as described in Sect. 2.4. As in Poulenard et al. (2012) and Legout et al. (2013), individual models were set up for each of the three sources. In this way, the source contributions were not forced to sum to 100% for each sediment sample, and checking whether that sum is close to 100% makes it possible to detect problems in the fingerprinting approach. The models were fitted in R with the function plsr{pls} (Mevik et al. 2016) using six components.

When applying the model to the color tracer set, the dataset was split into a training and a testing dataset (two thirds and one third of the data, respectively) in order to check for overfitting and whether the model was able to predict the proportion of mixtures that were not used to set up the model. As the XRF measurements were only conducted on four artificial mixtures and the three poles, this validation step was not undertaken.

2.6 Error assessment

2.6.1 Source heterogeneity

Source soil heterogeneity is treated differently in the three mixing models. Whereas it is smoothed out in the NNLS and PLSR mixing models, it is explicitly taken into account in the Bayesian SIMMR mixing model. The latter uses the mean and the standard deviation (SD) of each tracer property for each source as model input. Thus, for SIMMR, the variability of model output is calculated and the mean of the SD obtained for all the sediment samples is given as an estimate of uncertainty due to source heterogeneity in every source category. In the NNLS mixing model, the source matrix A is initially parameterized with the mean of all samples in the respective source group for each tracer property. The potential error due to within-source group heterogeneity was assessed with a Monte Carlo resampling algorithm (e.g., Franks and Rowan 2000; Krause et al. 2003) using the SD of the predicted source contribution averaged over all suspended sediment samples as a measure of uncertainty due to source heterogeneity. In the PLSR mixing model, source heterogeneity is also eliminated by creating a composite sample of the source soils in the respective source category and using this composite for the creation of the artificial mixtures. Here, the potential error due to source heterogeneity is assessed by running the model on the source samples that belong unequivocally to one of the three source groups. Due to source heterogeneity, the signature of the individual samples will differ from the composite sample of the group. Hence, the predicted contributions will vary from 100% or 0% and the deviation of the predicted contribution from the real contribution (either 100% or 0%) was used to quantify this kind of error for each source:

$$ {\varDelta}_{sh}=\frac{1}{n}\ {\sum}_{i=1}^n\mid {C}_{real,i}-{C}_{pred,i}\mid $$
(2)

where n is the number of source samples in the respective source category, Creal,i is the real source contribution, and Cpred,i is the predicted contribution of the source category to the source sample i.
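Equation (2) is a mean absolute deviation and can be computed as in the following sketch. The function name and the predicted contributions are hypothetical; the same helper would apply to the analogous deviations defined for particle size and biogeochemical alterations.

```python
def mean_abs_deviation(reference, predicted):
    """Mean of |reference_i - predicted_i| over all samples."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)

# Hypothetical example: four badland source samples, true badland contribution 100%
c_real = [100.0, 100.0, 100.0, 100.0]
c_pred = [92.0, 105.0, 88.0, 99.0]   # contributions predicted by a mixing model (%)

delta_sh = mean_abs_deviation(c_real, c_pred)
print(delta_sh)
```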

In order to obtain a measure that is comparable between the three mixing models, this specific procedure is also applied with the NNLS and the SIMMR mixing models.

2.6.2 Tracer non-conservativeness: particle size

None of the three mixing models takes this source of error into account. It was assessed for the three mixing models run with color tracers for the Claduègne catchment on the 14 source samples sieved to different particle size classes. The true contribution of the source categories was again either 100% or 0%. The deviation of the prediction for the fraction < 63 μm from 100% or 0% was assumed to be due to source heterogeneity, whereas particle size was assumed to be responsible for the deviation of the other size fractions from the fraction < 63 μm. In order to quantify this source of error in a way that allows comparison between the mixing models, the difference between the < 63 μm and the < 20 μm fractions was calculated for every source category:

$$ {\varDelta}_{ps}=\frac{1}{n}\ {\sum}_{i=1}^n\mid {C}_{<63,i}-{C}_{<20,i}\mid $$
(3)

where n is the number of source samples sieved to < 20 μm in the respective source category, C<63,i is the predicted contribution of the source category to the fraction < 63 μm, and C<20,i is the predicted contribution to the fraction < 20 μm. The < 20 μm fraction was chosen for this analysis, as it was found to be the dominant size class of the suspended sediment samples. The ratio of the fraction < 20 μm to the fraction < 63 μm ranged from 0.69 to 0.91 with a median of 0.78 for the 27 suspended sediment samples from the Claduègne for which grain size distributions were measured.

2.6.3 Tracer non-conservativeness: biogeochemical alterations

In order to quantify this source of error, the three mixing models run with the color tracers were applied to all the samples that were immersed in the river for different durations. The difference in predicted contributions before and after immersion in the river, Δbgc, was calculated for each source and each mixing model:

$$ {\varDelta}_{bgc}=\frac{1}{n}\ {\sum}_{i=1}^n\mid {C}_{0d,i}-{C}_{1d,i}\mid $$
(4)

where n is the number of samples for each source category, C0d is the predicted contribution of the respective source to the composite sample before immersion in the river and C1d is the predicted contribution after immersion in the river for 1 day. The immersion time of 1 day was chosen because it was found that the greatest change in tracer properties occurred already after 1 day whereas they remained stable afterwards. This is also the most likely maximum time of immersion in the river given the size of the catchment and the hydrological concentration time of a few hours.

2.6.4 Testing the mixing models with the artificial mixtures

Besides their necessity for training the PLSR mixing model, the artificial mixtures were also used to test the predictive power of the three mixing models. The NNLS and the SIMMR mixing model were set up independently of the mixtures, so the models were tested on all mixtures (81 mixtures for color tracers, four mixtures and three poles for XRF tracers). For the PLSR mixing model, the third of the mixtures that was not used for model training was used for testing it. The root mean squared error of the prediction (RMSEP) was calculated for each mixing model and each source from the known and the predicted proportions of the source classes.
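The RMSEP for one source and one mixing model can be computed as in this short sketch; the known and predicted proportions are hypothetical values for four artificial mixtures.

```python
import math

def rmsep(known, predicted):
    """Root mean squared error of prediction between known and predicted proportions."""
    return math.sqrt(sum((k - p) ** 2 for k, p in zip(known, predicted)) / len(known))

# Hypothetical proportions (%) of one source in four artificial mixtures
known = [0.0, 25.0, 50.0, 100.0]
predicted = [3.0, 21.0, 50.0, 95.0]

error = rmsep(known, predicted)
print(round(error, 2))
```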

3 Results

3.1 Verification of fingerprinting assumptions

Both tracer sets were able to discriminate between the three source groups. The main discriminating tracers were L*, a*, and b* for the spectrocolorimetry and Al2O3, SiO2, CaO, and Fe2O3 for XRF (Electronic Supplementary Material S1). When linear discriminant analysis was conducted with either all color tracers or all XRF tracers, all sources were correctly classified in the cross-validation.

With some exceptions, the linear additivity of the tracers was confirmed with the artificial mixtures. The normalized RMSE of the color tracers ranged between 0.2 and 6.3%, and the one of the XRF tracers ranged between 1.1 and 16.8%. In the XRF tracer set, the concentrations of P2O5, Cu, Y, and Zr had values for nRMSE > 10%. Because of this result, these four tracers were removed from the tracer set before the application of the mixing models.

When the univariate range test was conducted for the color tracers using only the source samples sieved to < 63 μm, the two color parameters L and L* failed this range test. Thus, the univariate range test was repeated including the sources that were sieved to < 20 μm which resulted in all color tracers passing the range test. Because of the results of the univariate range test, the samples < 20 μm were included in the pairwise convex hull range test. When a small tolerance of < 5% of the range of each tracer was included in the test, all pairwise combinations passed the test.

When the 16 XRF tracers were considered, the concentrations of P2O5 and K2O did not pass the univariate range test with a tolerance of < 5% of the range. K2O was removed from the tracer set in addition to P2O5 that was already excluded after the test for linear additivity. With the remaining 14 XRF tracers, the pairwise convex hull range test was conducted. The combinations that did not pass the test were the following: Zr combined with six other concentrations and Co combined with SiO2. Thus, Zr and Co were discarded (Table 1).
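A univariate range test with a tolerance can be sketched as below. The sketch assumes that the tolerance is taken as a fraction of the range spanned by the source samples and that a tracer passes only if all sediment values fall within the widened range; the function name and the values are hypothetical.

```python
def passes_range_test(sediment_values, source_values, tolerance=0.05):
    """Check whether all sediment values lie within the source range.

    The source range is widened on both sides by `tolerance` times its width
    (an assumption of this sketch).
    """
    low, high = min(source_values), max(source_values)
    margin = tolerance * (high - low)
    return all(low - margin <= v <= high + margin for v in sediment_values)

# Hypothetical values of one tracer in source and sediment samples
sources = [35.0, 42.0, 60.0]
print(passes_range_test([36.0, 61.0], sources))  # True: 61.0 is within 60 + 1.25
print(passes_range_test([36.0, 62.0], sources))  # False: 62.0 exceeds 61.25
```

The pairwise convex hull test extends the same idea to two tracers at a time and is not covered by this sketch.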

Concerning the potential effect of particle size, the values of the L* parameter decreased with increasing particle size in a relatively constant manner for the different samples (see Electronic Supplementary Material S2). This effect could explain the fact that L* failed the range test when considering source soil particles < 63 μm while it passed when source particles < 20 μm were considered. This would suggest that the suspended sediment particles were enriched in particles < 20 μm in comparison with the source soils, which was also consistent with the particle size measurements done on some suspended sediments (Sect. 2.2). For the a* parameter, the particle size effect was not systematic, notably for the badlands where the values were relatively independent of particle size. For only three tracers, h*, u’, and x, there was hardly any effect of particle size on the tracer values.

While particle size affects some tracer values (e.g., L*, L, b, v*), others are less dependent on particle size which means that this effect might be smoothed as well as exacerbated in the final predictions performed on suspended sediment samples. Thus, the error that is introduced by this effect has to be assessed in the whole fingerprinting approach. This was quantified in Sect. 2.6.2 and taken into account in the interpretation of the results.

The in situ biogeochemical experiment allowed the influence of immersion in the river on the color tracers to be analyzed. This effect was less important than that of particle size. The changes were largest during the first day, while all tracer values remained constant for longer immersion times. Although the maximum immersion duration did not exceed 22 days, this result suggests that the longest storage durations in the river did not affect the color parameters. The changes on the first day might also be due to the loss of fine particles through the bags with a mesh size of 20 μm. However, weight loss of the bags remained very small, with values ranging from 0.5 to 3%. Weight loss did not increase with immersion time, so it occurred already during the first day. The impact of immersion on the tracers varied between samples and parameters. The basaltic samples changed most, while the impact was smallest for the badlands. The most sensitive parameters were b, u*, and v*, but none of the parameters changed by more than 10% for any sample, and the median changes were < 4% for all parameters.

3.2 Comparison of the mixing models

As a first step, the three mixing models were run with the two tracer sets on the artificial mixtures in order to calculate the contributions of the three sources: basaltic bare soils, sedimentary bare soils, and marly badlands. The models performed relatively well and reproduced the known source contributions with an RMSE below 7% source contribution, with the exception of the SIMMR model run with XRF tracers (Electronic Supplementary Material S3). This model failed to correctly reproduce the source contributions of mixtures with a high contribution of the sedimentary source, which were falsely predicted as a mixture of badlands and sedimentary sources.

When the models were applied to the 145 suspended sediment samples, this error increased (Fig. 2). Using color tracers, the SIMMR and the NNLS mixing models gave very similar results (Fig. 2b, RMSE of 4.8%). They agreed on the mean source contributions, and the correlations were high for all three sources. The results obtained with PLSR also agreed well with the other two models, with the exception of the flood event that occurred on August 19, 2014 (Fig. 2a, c, RMSE of 8.2% and 9.5% when these samples were not included). Using XRF tracers, all three mixing models agreed that the contribution of the basaltic sources to the suspended sediment samples in the Claduègne catchment was very low (< 10%). The NNLS and SIMMR mixing models further agreed that the badlands were the dominant source, and the two models correlated very well for the sedimentary and badland sources (Fig. 2e). The PLSR mixing model predicted approximately the same mean contribution of the sedimentary and the badland sources. Thus, there was a systematic difference between the results obtained with the PLSR mixing model and the other two models, in that the latter two predicted a considerably higher contribution of the badlands and a lower contribution of the sedimentary sources. There was, however, a high correlation between the results obtained with PLSR and with the other two models for the badland and sedimentary sources (Fig. 2d, f), so the within- and between-event dynamics of the source contributions were similar for all three mixing models.

Fig. 2 Comparison of the source contributions (in percent) to suspended sediment samples (n = 145 for spectrocolorimetry, n = 35 for XRF) from the Claduègne catchment predicted with color tracers (a–c) or XRF tracers (d–f) and different mixing models (NNLS, non-negative least squares; SIMMR, Bayesian stable isotope mixing model in R; PLSR, partial least squares regression). The encircled samples represent suspended sediment samples taken during the event of August 19, 2014

3.3 Comparison of the tracer sets

In order to assess the effect of the choice of tracer sets on predicted source contributions, the results obtained with the two tracer sets were first compared for the artificial mixtures and then for the suspended sediment samples of the Claduègne (Fig. 3).

Fig. 3 Correlations of predicted source contributions of the artificial mixtures (a–c) and the suspended sediment samples of the Claduègne catchment (d–f) using the same mixing model but different sets of tracers (color tracers or XRF tracers). The gray dashed line is the identity line

As the models performed well on the artificial mixtures, the tracer sets agreed on the predicted source contributions of the artificial mixtures when the NNLS and the PLSR mixing models were used (Fig. 3a, c). With the SIMMR mixing model, there were considerable differences between the two tracer sets (Fig. 3b) due to the poor performance of the SIMMR model driven with XRF tracers.

The differences in the predicted source contributions were much more pronounced when the suspended sediment samples were considered instead of the artificial mixtures. The correlations of the predicted contributions of the badlands were poor for all models and the mean RMSE was high (Fig. 3d–f). For the sedimentary sources, the correlations were also poor and the mean RMSE was also high (22%). Using the PLSR model, the predictions obtained with the XRF tracers were systematically higher than the ones obtained with the color tracers. The mean RMSE for the basaltic samples was 9%, but considering the low predicted contributions of the basaltic sources, this value was large. There was some correlation between the two tracer sets when the PLSR model was used, but also a systematic difference, in that the contributions predicted with the XRF tracers were always lower than the ones obtained with the color tracers. The poor correlations of the results obtained with the two tracer sets led to the within- and between-event dynamics being represented differently depending on which tracer set was used (Fig. 4).

Fig. 4 Time series of discharge and raw turbidity [g/l SiO2] for a flood in 2013. The colored bars represent the source contributions of the suspended sediment samples predicted with the three mixing models using either the color tracers (a) or the XRF tracers (b)

Despite the poor agreement between the color tracers and the XRF tracers for single suspended sediment samples and the different predictions of within- and between-event dynamics, the two tracer sets agreed that the badlands were the main source of suspended sediment and that the contributions of the basaltic and sedimentary sources were rather small for that specific rainfall-runoff event.

Owing to the large differences between the results obtained with the two tracer sets, particularly for the basaltic contributions (from 5 to 10% on average for color tracers, from 0 to 2% for XRF), the measurements of magnetic susceptibility were used to assess which tracer set was more trustworthy. If the sediments really originated almost exclusively from the badlands, the values of Xlf measured for these suspended sediment samples should be close to the values measured in the badlands (mean ± SD 5.39 ± 4.03 × 10−8 m3 kg−1, Electronic Supplementary Material S1). The measured values of the sediment samples of this event ranged from 19 to 88 × 10−8 m3 kg−1, with a mean and standard deviation of 57 and 19 × 10−8 m3 kg−1, respectively. Thus, they were considerably higher than the values of the badland source, slightly smaller than the values of the sedimentary sources (75.92 ± 79.40 × 10−8 m3 kg−1), and orders of magnitude smaller than those of the basaltic sources (1323 ± 551 × 10−8 m3 kg−1).

In order to assess more quantitatively the relation between the predicted source contributions obtained with the two tracer sets and the Xlf values, Xlf was calculated as \( X_{lf,calc}=\sum_{i=1}^{s}\left(X_{lf,source\ mean_i}\cdot c_i\right) \), where s is the number of sources (s = 3), \( X_{lf,source\ mean_i} \) is the mean of the measured Xlf values of source i (Table S1, Electronic Supplementary Material), and \( c_i \) is the contribution of source i predicted with the respective model. This was done with the three mixing models for the 35 sediment samples for which Xlf, XRF tracers, and color tracers were available. The calculated values were compared to the measured values (Fig. 5). When the XRF tracers were used, the measured and the calculated Xlf values were either not correlated at all (SIMMR and PLSR mixing models) or even negatively correlated (NNLS model), which is not plausible (Fig. 5b). The correlations were better when the color tracers were used (Fig. 5a), especially with the PLSR model, indicating that the relation between the measured Xlf values of the sediment samples and the source contributions predicted with the mixing models was more plausible. However, the systematic overestimation of calculated Xlf values in Fig. 5a might be due to non-conservativeness (e.g., oxidation of magnetite present in the basaltic source leading to lower measured magnetic susceptibility) or non-additivity of the tracer, or to a wrong estimation of the mean value for each source. The latter is certainly possible given the high within-source variability. This is especially pronounced for the basaltic source, where the natural variability of this parameter is of the same order of magnitude as the one resulting from the variations of concentration. A further factor is the large difference (two orders of magnitude) between the measured values for basalts and the other sources.
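The calculation of Xlf from predicted source contributions can be illustrated with a short Python sketch. The source means are the values quoted above (in 10−8 m3 kg−1); the contributions in the example are hypothetical, and the function name `calculated_xlf` is introduced here for illustration only.

```python
# Mean measured Xlf of each source in 10^-8 m3 kg-1 (values quoted in the text)
SOURCE_MEAN_XLF = {"badlands": 5.39, "sedimentary": 75.92, "basaltic": 1323.0}

def calculated_xlf(contributions):
    """Weighted sum of the source means.

    contributions: dict mapping source name to predicted contribution (0-1).
    """
    return sum(SOURCE_MEAN_XLF[source] * c for source, c in contributions.items())

# Hypothetical predicted contributions for one sediment sample
sample = {"badlands": 0.80, "sedimentary": 0.15, "basaltic": 0.05}
print(round(calculated_xlf(sample), 2))  # 81.85
```

In this hypothetical example, a basaltic contribution of only 5% accounts for about 66 of the roughly 82 × 10−8 m3 kg−1 calculated, which illustrates how sensitive the calculated Xlf is to small errors in the predicted basaltic contribution.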

Fig. 5 Calculated magnetic susceptibility (Xlf) of suspended sediment samples (n = 23) against measured ones. The calculated values were obtained using the predicted source contributions when using the color tracer set (a) or the XRF tracer set (b) and the different mixing models. The gray dashed line is the identity line

3.4 Errors of the fingerprinting approaches

Errors due to source heterogeneity, tracer conservativeness, and model structure were quantified as described in Sect. 2.6 and are summarized in Table 2. For all three groups, the errors varied strongly between tracer sets, mixing models, and sources.

Table 2 Estimates of error (in % source contribution) of the three mixing models applied with the color and XRF tracers

Source heterogeneity

The error due to source heterogeneity was higher for the XRF tracers than for the color tracers, with the exception of the basaltic sources. Comparing the sources, source heterogeneity was generally most pronounced in the sedimentary bare soils and smallest in the badland samples. The source heterogeneity of the basaltic sources varied between the mixing models. These sources could be unambiguously differentiated from the other sources, but they were also highly variable (Electronic Supplementary Material S1). The PLSR mixing model seemed to be more sensitive to this within-source variability than the other two models, as Δsh was high for both catchments and both tracer sets (Table 2).

Tracer conservativeness

The impact of immersion in the water on the predicted source contributions was small (Δbgc < 5%) with the exception of the contribution of basalt predicted with the PLSR model (Table 2). Here, the difference in the predicted source contribution between the sample immersed for 1 day and the original one was > 10%. A particular susceptibility of the basaltic samples to changes in the source prediction on immersion in the river was not confirmed by the other models, however.

The effect of particle size selectivity on the predicted source contributions was much more important than that of biogeochemical alterations upon immersion in the river, as Δps was much higher than Δbgc for all sources and all mixing models (Table 2). The mixing models did not agree on whether one source was particularly susceptible to the effect of particle size, but almost all values of Δps were > 10% and reached > 35% for the basaltic sources predicted with the PLSR mixing model. This was consistent with the need to include fine source material (< 20 μm) in the range test. Thus, knowing that the sediments are enriched in fine material, source contributions predicted for the sediment samples can be systematically over- or underestimated.

Model structure

The error of the mixing models was quantified with the RMSE of the prediction of the artificial mixtures. When the color tracers were used, all models performed well in predicting the contributions of the three sources, with RMSEP < 10% for all sources and models (Table 2). The PLSR model, which was trained on two thirds of the artificial mixtures, performed especially well on the remaining third of the data (RMSEP < 5% for all sources). Using the XRF tracers, the SIMMR model failed to correctly predict the source contributions of the mixtures, notably that of the sedimentary sources.

Errors of the NNLS and PLSR model were also evaluated by summing up the predicted source contributions and checking whether the sum was close to 100%. Using the color tracers and the NNLS model, the sum of the predicted contributions of none of the suspended sediment samples and none of the artificial mixtures exceeded 110% or was below 90%. The PLSR model performed slightly worse with 9 out of 145 suspended sediment samples summing up to 110–120%, but still, the majority of the samples summed up to values very close to 100%. Using the XRF tracers, both models performed equally well in the Claduègne catchment and on the artificial mixtures.
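The sum check described above can be sketched as follows, assuming the contributions are given in percent and using the 90–110% bounds mentioned in the text. The function name and the sample values are hypothetical.

```python
def flag_sum_outliers(samples, lower=90.0, upper=110.0):
    """Return (index, sum) for samples whose contributions sum outside the bounds.

    samples: sequence of per-sample source contributions in percent.
    """
    flagged = []
    for index, contributions in enumerate(samples):
        total = sum(contributions)
        if total < lower or total > upper:
            flagged.append((index, total))
    return flagged

# Hypothetical predictions (badlands, sedimentary, basaltic) for three samples
predictions = [(80.0, 15.0, 5.0), (90.0, 20.0, 8.0), (60.0, 20.0, 5.0)]
print(flag_sum_outliers(predictions))  # [(1, 118.0), (2, 85.0)]
```

Samples flagged in this way would warrant closer inspection, for instance with respect to missing sources or non-conservative tracers.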

4 Discussion

4.1 Performance and errors of the various fingerprinting approaches

Significant differences in predicting source contributions were put forward in this study due to the choice of tracers (Fig. 3) and models (Fig. 2). Such findings were already reported in a few studies. Concerning the choice of a tracer set, Martínez-Carreras et al. (2010a, 2010b) and Evrard et al. (2013) found that alternative tracers (e.g., color tracers and diffuse reflectance infrared Fourier transform spectroscopy) and conventional tracers did not agree on the main sediment source in all cases. Pulley et al. (2015) compared fingerprinting results obtained with magnetic tracers, geochemical tracers, radionuclides, and combinations of these groups and found very important variations in mean contributions of three sources.

Concerning the choice of mixing model, the result that the NNLS and the SIMMR models generally resembled each other while the PLSR model differed from the other two is not surprising, as PLSR has a fundamentally different model setup. The NNLS and the SIMMR mixing models are both based on a mass balance approach, seeking to solve the same overdetermined system of linear equations, while the PLSR model is based on artificial mixtures. While some studies have already compared mixing models, these comparisons were only done for approaches similar to NNLS and SIMMR. Cooper et al. (2014) and Nosrati et al. (2014) obtained considerable differences in mean source contributions and in the widths of confidence intervals using different mixing models. Haddadchi et al. (2013) compared several variants of the NNLS mixing model and observed large differences in the source contributions predicted in two catchments. Thus, the comparison done in this study, adding a third mixing model with a different approach (i.e., artificial mixtures combined with PLSR), suggests that the differences in the prediction of source contributions due to the choice of a mixing model might be more important than the differences reported in the recent literature.
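The mass balance formulation can be illustrated with a minimal non-negative least squares sketch. The solver below uses projected gradient descent in pure Python; this is a simple stand-in chosen for readability and is not necessarily the algorithm used in this study. The source signatures, the mixture, and the function name are hypothetical. As in the study, the contributions are not constrained to sum to 1.

```python
def nnls_projected_gradient(A, b, iterations=5000):
    """Minimize ||A c - b||^2 subject to c >= 0 by projected gradient descent.

    A: list of rows (one row per tracer, one column per source).
    b: tracer values measured in the sediment sample.
    """
    n_tracers, n_sources = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its inverse is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    c = [0.0] * n_sources
    for _ in range(iterations):
        residual = [sum(A[t][s] * c[s] for s in range(n_sources)) - b[t]
                    for t in range(n_tracers)]
        gradient = [sum(A[t][s] * residual[t] for t in range(n_tracers))
                    for s in range(n_sources)]
        # Gradient step followed by projection onto the non-negative orthant
        c = [max(0.0, c[s] - step * gradient[s]) for s in range(n_sources)]
    return c

# Hypothetical (scaled) mean signatures: 4 tracers (rows) x 3 sources (columns)
A = [[0.2, 0.5, 0.9],
     [0.8, 0.4, 0.1],
     [0.3, 0.9, 0.2],
     [0.6, 0.1, 0.7]]
true_contributions = [0.1, 0.2, 0.7]
# Synthetic sediment sample built as an exact linear mixture of the sources
b = [sum(a * c for a, c in zip(row, true_contributions)) for row in A]
estimate = nnls_projected_gradient(A, b)
print([round(c, 3) for c in estimate])
```

With four tracers and three sources, the system is overdetermined, and the synthetic mixture is recovered because it satisfies linear additivity exactly. With real samples, the residual and the sum of the estimated contributions indicate how well the mass balance holds.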

Among the various sources of errors considered in this study, the ones due to source heterogeneity and particle size were the most important ones. The high source heterogeneity of the sedimentary sources was an expected result as they are very heterogeneous both in terms of land use and soil type. Moreover, some soils are poorly developed and might resemble the badlands. Soils close to the basaltic plateau or the soils on pebble deposit of basaltic component might contain basaltic elements. The lower source heterogeneity in the badland samples was not surprising either, as the badlands could be clearly distinguished from the other sources and resemble each other. Δsh, i.e., the measure of error due to source heterogeneity introduced here, is an effective measure to quantify this effect regardless of the mixing model and was found to be significant.

The observation that particle size effects were more important than those of biogeochemical alterations is consistent with the results obtained by Legout et al. (2013), who also quantified both effects. Both effects were of the same order of magnitude as those reported by these authors. The error due to biogeochemical alteration during immersion in the river was considered negligible when compared to the other sources of error. The sufficiently conservative behavior of color tracers and tracers from the infrared spectrum upon immersion in another Mediterranean river was also demonstrated by Legout et al. (2013) and Poulenard et al. (2012). This is promising and justifies the application of the sediment fingerprinting approach in our study site. In larger catchments, however, where longer storage durations in the river bed have to be assumed, this source of error can be important (Vale 2016). The error of the model structure, which was quantified as the RMSE of the prediction, varied strongly between the models and the sources.

It should be stressed that the different errors estimated in this study were not completely independent from each other, e.g., the failure of the SIMMR model driven with the XRF tracers to reproduce the sedimentary sources was reflected in Δsh and in the RMSEP that were both high for this model. Thus, the different sources of error could not be summed up to obtain a cumulative error. For the majority of the models and sources the maximum estimated error was below 20%. For some models and concerning cumulative errors, however, this value could be exceeded.

Many sediment fingerprinting studies only give the mean SD or other measures for dispersion in the obtained solutions as estimations of the error. The results obtained here indicated that this value was often rather small when compared to other sources of error, so the overall error of the fingerprinting approach is likely to be underestimated. Moreover, it did not include other sources of error than model structure and source heterogeneity. Here, the most notable was the one due to particle size selectivity during erosion and sediment transport that creates systematic errors (over- or underestimation of source contributions).

These results also emphasized the importance of validating mixing models with artificial mixtures, of further addressing particle size issues, and of carefully assessing different sources of error. Another simple control procedure proposed by Poulenard et al. (2012) and successfully applied by Legout et al. (2013) and in this study is to not constrain the mixing model to sum up to 100%. This allows detecting problems associated with missing sources or uncertainty introduced during erosion processes and sediment transport. In our study, this test was reassuring as it suggested that all relevant sources were sampled and that the errors discussed above did not lead to the prediction of completely unrealistic source contributions.

As the sensitivity of the mixing models and tracer sets to the different types of error was very heterogeneous, using only one tracer set and one mixing model could give faulty results that are biased by a certain source of error. Thus, this study highlights that there is a strong interest to compare different tracer sets and models and to use multi-tracer/multi-model ensemble predictions to obtain more robust results.
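A multi-tracer/multi-model ensemble can be summarized very simply, for instance by the mean and the spread of the predictions across all tracer set and mixing model combinations. The sketch below shows one possible summary for a single source; the predicted values and the function name are hypothetical and do not reproduce the results of this study.

```python
from statistics import mean

def ensemble_summary(predictions):
    """Summarize the predictions of one source across an ensemble.

    predictions: dict mapping (tracer_set, model) to a contribution in percent.
    """
    values = list(predictions.values())
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "spread": max(values) - min(values),
    }

# Hypothetical badland contributions (%) for six tracer set/model combinations
badlands = {
    ("color", "NNLS"): 80.0, ("color", "SIMMR"): 82.0, ("color", "PLSR"): 76.0,
    ("XRF", "NNLS"): 84.0, ("XRF", "SIMMR"): 83.0, ("XRF", "PLSR"): 75.0,
}
print(ensemble_summary(badlands))
```

A large spread relative to the mean would indicate that a result depends strongly on the chosen tracer set or mixing model.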

4.2 Interests of using multi-tracer model ensemble predictions to detect main sources, within- and between-event variability in a mesoscale catchment

4.2.1 Main sources

In the Claduègne catchment all mixing models and tracer sets agreed that the badlands were the main source of suspended sediment sampled at the outlet (Fig. 6a). The contributions of this source averaged over 11 events from 2011 to 2017 ranged between 74 and 84% depending on the mixing model and the tracer set used. They also agreed that the mean contributions of the basaltic sources were small (1–8%), and the ones of the sedimentary sources ranged between 12 and 29%.

Fig. 6 Mean source contributions of the suspended sediment samples of the Claduègne (a) and Gazel (b) catchments

In order to assess to what extent the fingerprinting approach designed at the mesoscale of the 42-km2 Claduègne catchment would work correctly in a smaller subcatchment, we applied the six model/tracer combinations to the suspended sediment collected at the outlet of the Gazel (3 km2). As the Gazel subcatchment comprised no sedimentary badland areas (Fig. 1), it was expected that the sediment samples would consist of a mixture of the basaltic and the sedimentary sources. As can be seen in Fig. 6b, this was not the case at first sight. Even though the predicted mean source contributions of the sedimentary badland source were smaller than in the Claduègne catchment, the badlands remained the main predicted source for five of the six ensemble predictions. This puzzling finding was a good example of fingerprinting approaches giving results that apparently contradict physical reasoning without necessarily hinting at problems in the model setup. Indeed, the sum of contributions was close to 100% for these five ensemble predictions. Only one of them, the XRF-PLSR, predicted mainly sedimentary sources, with the sum of the mean predicted source contributions considerably exceeding 100%. Out of the 20 tested suspended sediment samples, the sum exceeded 110% for ten samples and was higher than 140% for eight samples, with a maximum of 189%. The fact that one prediction differed significantly from the others emphasized the need for multi-model and multi-tracer approaches, as they can help to detect problems in the overall fingerprinting approach.

In order to understand the perturbing finding of sedimentary badlands being predicted in the suspended sediments of the Gazel despite their absence in the catchment, the catchment had to be regarded in detail. In some reaches, the riverbed is deeply incised into the marly-calcareous rocks. The thin erodible strata of marls could represent a highly connected source of fine material. Even though this source is very small in area, it might be an important sediment source. Thus, for the Gazel catchment, new mixing models were set up with the three sources (basaltic bare topsoils, sedimentary bare topsoils and eroded riverbanks in marly-calcareous rocks). The methodology used was identical to the one of the Claduègne and the estimates of the error were in the same order as the ones reported for the Claduègne catchment. Also, the comparison of mixing models and tracer sets gave similar results.

The mean source contributions predicted by these new models are shown in Fig. 7. A first striking result is that the new predicted proportions were not very different from those predicted initially in Fig. 6b, considering that badland contributions were replaced in similar proportions by marly-calcareous eroded riverbanks. This result is consistent with the fact that the mean colorimetric signatures of eroded riverbanks (e.g., L* = 61.44, a* = 2.88, b* = 14.91) were almost identical to those of sedimentary badlands shown in Electronic Supplementary Material S1. This was also the case, albeit to a lesser extent, for XRF tracers (e.g., CaO = 23.08, Fe2O3 = 1.51). A second aspect is that there were some discrepancies in the prediction of source proportions (< 20%) between the tracer sets. With the color tracers, the eroded riverbanks were predicted to be the main source of suspended sediments, ranging from 48 to 65%. With the XRF tracers, the mean contribution of this source was predicted to be lower and similar to the mean contribution of the sedimentary sources (49–51%). The mean contribution of the basaltic sources also varied between the two tracer sets. With the color tracers, it ranged between 21 and 30%, i.e., higher than the contribution of the sedimentary sources, while it was much lower with the XRF tracers (6–9%). These absolute differences of less than 20% on average have to be considered in the interpretation of the fingerprinting results, suggesting again the need to perform ensemble predictions obtained from various tracer sets and mixing model approaches.

Fig. 7 Mean source contributions of the suspended sediment samples of the Gazel catchment

Of course, the source class of eroded riverbanks on marly-calcareous rock is also present in the Claduègne catchment. As the fingerprinting properties of this class were very similar to those of the sedimentary badlands, the two classes could not be discriminated, and the contribution of riverbanks was included in the badland source but was assumed to be of minor importance given their small extent compared to the badland areas. The finding that the badlands were the main contributing source for the Claduègne catchment despite their small area was consistent with the results of Brosinsky et al. (2014a) and Palazón et al. (2016). The latter found that the badlands, which cover less than 1% of the surface of the Barasona reservoir catchment in the Spanish Pyrenees, were the main contributing source of suspended sediments in the reservoir. Given the high erodibility and good connectivity of this source, this result was not surprising. The low contribution of the basaltic sources to the suspended sediments of the Claduègne catchment, despite the large surface of this source, suggested either a low erodibility of these soils or a lower connectivity of the erosion zones to the river network.

4.2.2 Within- and between-event variability

Figure 8 shows the mean predicted source contributions for 11 floods in the Claduègne catchment obtained with the different mixing models. The contributions of all three sources varied between events, but there was no apparent seasonal variability. The between-event variability seemed to be much higher when the XRF tracers were used than with the color tracers. This might, however, be an effect of sample size, as far fewer samples were analyzed with XRF, so within-event variability could not be evened out as much as with the color tracers. Indeed, considering only events for which the sample size was more than five led to results that were more consistent between color and XRF tracers.

Fig. 8 Between-event variability. Mean source contributions for 11 events in the Claduègne catchment predicted with the color tracers (top row), the XRF tracers (bottom row), and the three mixing models. The numbers in the upper part of the bars give the number of samples analyzed per event

The event occurring on August 19, 2014, stood out for the high predicted contribution of the sedimentary sources especially when the PLSR model or the XRF tracers were used. This might have been an indicator of distinct rainfall characteristics. It was indeed the only summer storm considered here while the other events were occurring in autumn or spring. However, this event was also the one that performed worst in the test whether the sum of contributions was close to 100%, and it was already identified as an outlier in the accordance of the mixing models (Fig. 2). This might point to problems with the PLSR mixing model driven with color tracers during this event.

The within-event variability differed strongly between events. While for some events the source contributions were very similar for all samples (May 11, 2017, in Fig. 9b or October 23, 2013, in Fig. 4), they varied considerably between samples for other events. Out of the 11 events in the Claduègne catchment considered here, 5 had a very low within-event variability, while the remaining 6 had a higher within-event variability, such as in Fig. 9a.

Fig. 9 Within-event dynamics of the source contributions predicted with the color tracers during two floods in the Claduègne catchment

Differences in within-event variability of source contributions were also observed in other studies that conducted sediment fingerprinting at a high resolution. Brosinsky et al. (2014a) found a high within-event variability for one out of four events in a catchment in the Spanish Pyrenees (445 km2). Legout et al. (2013) classified 23 rain events in a mesoscale Mediterranean mountainous catchment (22 km2) according to source contribution variability and found that more than half of the events could be considered as highly variable. Possible factors that influence the time the eroded sediments need to arrive at the outlet, and thus the within-event variability, are the spatial distribution of the sources within the catchment and the characteristics of the rain event. The latter include the intensity and duration of the rain event as well as rainfall variability (highly localized vs. homogeneous rain) and the displacement of the rain cells or fronts over the catchment. Variability in source contribution within an event may therefore act as a tracer for rainfall–runoff processes in the catchment. In this way, sediment fingerprinting at a high resolution could help to understand hydrosedimentary processes in the critical zone.

Within- and between-event variability also emphasized the importance of considering sediment samples taken at a high resolution, or integrated samples, when mean source contributions from a catchment are to be determined. When few instantaneous samples are considered, the results might be very sensitive to the time of sampling, and “true” source contributions might be considerably over- or underestimated.

5 Conclusions

The alternative sediment fingerprinting techniques tested in this study, i.e., based on spectrocolorimetry and XRF measurements, could discriminate between three sources of suspended sediment (sedimentary badlands, bare topsoils on basaltic geology, and bare topsoils on sedimentary geology). We investigated the different sources of error in sediment fingerprinting studies and examined the differences in predicted source contributions when different tracer sets or different mixing models were applied. We showed that the main source in the Mediterranean headwater catchment of the Claduègne (42 km2) was sedimentary badlands. Despite their low proportion of the catchment's surface (< 1%), and whatever the mixing model and the tracer set used, marly badlands contributed on average more than 70% to the suspended sediments sampled at the outlet.

At this study site, which has a contrasting geology, both low-cost fingerprinting methods, i.e., spectrocolorimetry and XRF, were valid tools to conduct sediment fingerprinting at a high temporal resolution. Nonetheless, considerable uncertainties remained. These were mainly due to particle size selectivity, source heterogeneity, and the choice of fingerprinting properties. During erosion and sediment transport, the sediments were enriched in smaller particle size fractions, which were shown to have a different fingerprinting signature than coarser particles. This challenges the assumption of conservative behavior of the tracers and led to errors that ranged between 9 and 35% depending on the source and the mixing model. Source heterogeneity was another major source of error, which might lead to a wrong characterization of the sources’ fingerprints and thus to false source predictions. It was quantified here as the error of predicted source contributions of the soil samples, ranging from < 5 to 18%.

Our results show that the choice of the tracer set was more important than the choice of the mixing model, as different results were obtained using color or XRF tracers. This is a drawback of the two low-cost methods tested in this study, as the two tracer sets do not give unambiguous results. Notably, the mean source contribution of the basaltic soils was predicted differently with the two methods, and the correlation between predictions obtained with the two tracer sets was poor. The use of a third low-cost tracer set (i.e., magnetic susceptibility) suggests that the color tracer set led to the more plausible results. Because of the small contribution of the basaltic source (1 to 8%) and its magnetic variability, the magnetic susceptibility measurements were not able to quantitatively predict the sources. These results do not call into question the significance of magnetic susceptibility as an indicator of sediment sources. In the present case study, this tracer should therefore be considered using another approach, and further experiments such as the study of laboratory magnetizations are needed. Thus, a major result of this study was that there is a strong need to use multiple tracer sets to justify the results of suspended sediment fingerprinting studies and to obtain reliable estimates of source contributions with multi-tracer ensemble predictions. Another reason for the need for multi-tracer/multi-model ensemble predictions is that the sensitivity of the three mixing models and the two tracer sets to several sources of error varied considerably. Thus, the results obtained when only one tracer set and one mixing model are used might be biased considerably by one kind of error. In contrast, such a bias can be detected and mitigated by applying various mixing models run with different tracer sets.
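
As an illustration of what a multi-tracer/multi-model ensemble prediction can look like, the following minimal sketch summarizes the predicted contribution of one source across several combinations of tracer set and mixing model. The function name, the model labels, and all values are hypothetical and do not reproduce the models or results of this study; the unweighted mean is only one possible way to combine ensemble members.

```python
from statistics import mean


def ensemble_summary(predictions):
    """Summarize predictions of one source's contribution.

    predictions: dict mapping (tracer_set, mixing_model) to a predicted
    contribution (0 to 1). Returns the unweighted ensemble mean, the
    spread (max minus min), and the mean per tracer set.
    """
    values = list(predictions.values())
    by_tracer = {}
    for (tracer_set, _model), value in predictions.items():
        by_tracer.setdefault(tracer_set, []).append(value)
    return {
        "mean": mean(values),
        "spread": max(values) - min(values),
        "by_tracer": {t: mean(v) for t, v in by_tracer.items()},
    }


# Hypothetical predictions for one source (NOT results of the study).
predictions = {
    ("color", "model_A"): 0.80,
    ("color", "model_B"): 0.76,
    ("xrf", "model_A"): 0.70,
    ("xrf", "model_B"): 0.74,
}

summary = ensemble_summary(predictions)
print(summary)
```

A large spread, or a marked difference between the per-tracer means, would flag a case in which a single tracer set or mixing model could give a biased estimate.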

Another main finding of the study in the mesoscale catchment was the considerable within- and between-event variability of source contributions. This highlighted the importance of high-resolution sampling and fingerprinting of suspended sediments to obtain reliable estimates of the main source contributions. It is also important for process understanding: high-resolution data on sediment sources have a high potential to provide a more distributed picture of rainfall–runoff–erosion–sediment transport processes, because the sediments act as tracers of the governing hydrosedimentary processes in the catchment.