1 Introduction

Sediment connectivity relates to the physical transfer of sediment through a drainage basin (Bracken and Croke 2007). The identification of sediment source areas and the way they connect to the channel network are essential for environmental management (Reid et al. 2007), especially where high erosion and sediment delivery rates cause severe on- and off-site effects. An off-site effect of world-wide importance is the sedimentation of reservoirs and the corresponding loss in water storage capacity (Verstraeten et al. 2006), with an estimated annual loss in storage capacity of the world’s reservoirs of around 0.5–1 % and for individual reservoirs of 4–5 % (WCD 2000).

Connectivity is mainly determined by the spatial organization of the catchment’s heterogeneity (Van Nieuwenhuyse et al. 2011), where topography, surface roughness, anthropogenic structures and vegetation cover (and its spatial arrangement and temporal dynamics) play vital roles in the redistribution of water and sediment resources. In particular, dryland areas are characterized by heterogeneous vegetation cover with seasonal to long-term changes as a consequence of agricultural management, fire, land abandonment, climate change and other factors.

While most studies on flows over shrubland are conducted at small scales, often based on field experiments, connectivity has rarely been investigated at the landscape scale (Turnbull et al. 2008) and is still often not described sufficiently in hydrological catchment models (De Vente et al. 2006). However, observed ecohydrological interactions at patch/inter-patch scales have profound effects and management implications at the catchment scale (Ludwig et al. 2005). Here, remote sensing may provide adequate, spatially explicit surface information at a scale relevant for land management. Several authors stress the potential of remotely sensed data for understanding the patterns and processes of connectivity (King et al. 2005; Vrieling 2006; Bracken et al. 2013), which has not yet been fully exploited. In recent years, earth observation technology has made tremendous progress. This opens up new opportunities for retrieving quantitative surface information at a spatial resolution allowing for the characterization of relevant landscape patterns, a temporal resolution adequate to capture landscape dynamics and a spectral resolution suitable to quantify relevant surface covers. The latter is provided by hyperspectral sensors or imaging spectrometers that record the light reflected from the ground in many narrow, contiguous bands. The concept of imaging spectrometers originated in the 1980s with the first airborne sensors and has since improved and been increasingly employed for earth science applications. Today, hyperspectral data are increasingly available from a rising number of airborne imaging spectrometers and a few spaceborne exploration missions. However, the lack of spatial and temporal continuity in airborne and spaceborne imaging spectrometer data, as well as the demanding processing of these complex data, is limiting their widespread use (Plaza et al. 2009; Schaepman et al. 2009).

Imaging spectroscopy has been used for various soil mapping and soil degradation studies over the past few years (Ben-Dor et al. 2009) based on its potential to identify surface materials and to quantify surface properties. Furthermore, hyperspectral data allow relative abundances of material components on the surface to be derived by unmixing pixel spectra (Goetz 2009). Spectral mixture analysis has proven to be a promising tool for retrieving subpixel information on vegetation and soil surfaces, especially for the heterogeneous patterns of dry and green vegetation and soil patches that are typically found in dryland areas (Okin et al. 2001; Ustin et al. 2004). Another recent development in remote sensing that facilitates sediment connectivity research is the increasing availability of multi-sensor data, i.e. data simultaneously collected with different sensors, such as hyperspectral and Light Detection and Ranging (LiDAR) data. Thus, concurrent spatial information on several of the factors driving sediment connectivity can be retrieved.

Spatially explicit quantitative information obtained from remotely sensed data facilitates the use of connectivity indices. In recent years, a large number of these indices have been developed in order to quantitatively evaluate the connectivity of hydrological systems (Antoine et al. 2009). They aim at supporting a better understanding of water and sediment redistribution processes, allowing for the identification of hot spot areas of erosion and the study of the effects of erosion control measures and different land management scenarios. These indices are a simplified surrogate for hydrological functioning and have different abilities to reflect complex interactions, while emphasizing different factors as dominant drivers. Bracken et al. (2013) provide an overview of proposed hydrological indices. Among these, the index of connectivity, originally introduced by Borselli et al. (2008), has already been applied for different regions and scales (Sougnez et al. 2011; Cavalli et al. 2013; López-Vicente et al. 2013) and was successfully used to improve prediction of sediment yields in a semi-lumped catchment model (Vigiak et al. 2012). The index of connectivity provides an estimate of the potential connection between the sediments eroded from hillslopes and the stream system, while taking into account land surface and topographic characteristics (Borselli et al. 2008).

In this work, we propose an approach to exploit high-resolution airborne data for overland flow sediment connectivity estimation. More specifically, we investigate the potential of hyperspectral and LiDAR data for assessing sediment connectivity at the hillslope to subcatchment scale for a mesoscale catchment using the index of connectivity. The study catchment in the Spanish Pyrenees experiences high erosion and sediment delivery rates, while badlands are considered to contribute a major proportion of the sediments to the channel network.

2 Study area and data

2.1 Study area

The study area encompasses the Villacarli (42 km²) and Carrasquero (25 km²) subcatchments of the mesoscale, semi-humid Isábena catchment (445 km²) located in the southern Pyrenees in north-eastern Spain (Fig. 1). The catchment is characterized by a rough terrain (650 m above sea level (a.s.l.) in the south to 2,600 m a.s.l. in the north), resulting in a pronounced climatic and land cover gradient. Strong inter-annual and seasonal variability of precipitation, temperature and local growth conditions (e.g. due to relief, lithology and land use) create a highly heterogeneous landscape. High altitudes are dominated by shrubland, meadow, woodlands and bare soil/rock, while valley bottoms are mainly used for agriculture. The wide abundance of Miocene marls leads to the formation of badlands (i.e. areas of unconsolidated sediments or poorly consolidated bedrock with little or no vegetation; Gallart et al. 2002). Contemporary geomorphic processes are mainly dominated by fluvial erosion on slopes and in the badlands during floods typically occurring in spring and in late summer and autumn (López-Tarazón et al. 2009). The Isábena River is characterized by large sediment yields indicating high connectivity between the source areas and the fluvial network (López-Tarazón et al. 2012). Apart from the badlands, arable land and shrubland are seen as major sources of sediment delivered to the Barasona reservoir at the outlet of the Isábena catchment. In consequence, the initial capacity of the reservoir of 92 hm³ has been considerably reduced by siltation over the past several decades (Valero-Garcés et al. 1999).

Fig. 1

Location of the Isábena catchment in Spain (a) and the Villacarli and Carrasquero subcatchments in the north-western part of the Isábena catchment (b)

2.2 Hyperspectral data

Airborne Imaging Spectrometer for Application (AISA) Eagle and Hawk imaging spectrometer data (Specim Ltd., Oulu, Finland) were acquired at an altitude of 4,200 m on April 2 and August 9, 2011, with a ground sampling distance (GSD) of 4 m in 12 and 15 flight lines, respectively. AISA records reflected solar radiation from the visible (VIS) to the shortwave infrared spectral region (SWIR) (400 to 2,500 nm). Data acquisition and radiometric correction were conducted by Natural Environment Research Council (NERC, UK). Subsequent geocorrection was performed using in-house software developed at the German Research Centre for Geosciences (GFZ). Atmospheric correction was done using Atmospheric/Topographic Correction for Airborne Imagery (ATCOR-4) (Richter and Schlaepfer 2002). Mosaicking of the flight lines was realized in ENVI 4.8 (Exelis Visual Information Solutions, Boulder, Colorado, USA). Subsequently, refined georegistration of the image mosaics was performed based on orthophotos provided by the Spanish National Centre for Geographic Information (CNIG). Final geometric accuracy varied between 0 and 2 image pixels (i.e. 0 and 8 m), with the largest deviations in the mountainous north. To further adjust the surface reflectance of the image mosaics, empirical line correction was performed using field spectra collected during the airborne campaigns. Additionally, the image mosaics were optimized by removing the water absorption features (Painter et al. 1998, Roberts et al. 1998b), filtering the spectra using a Savitzky-Golay filter (Savitzky and Golay 1964) and removing saturated (>90 % reflectance) and negative (not physically meaningful) pixels. For final analysis, 380 spectral bands remained, and 11.1 % of the April and 5.6 % of the August image pixels were excluded.

2.3 Field data collection

In two field campaigns concurrent with the airborne image acquisitions, fractional cover of green vegetation (GV), dry vegetation assumed to be photosynthetically non-active (NPV), bare soil and rock were visually estimated for 60 (April) and 53 (August) transects of 20-m length (Fig. 1). Visual estimation was carried out in 10 % steps for 1 m × 1 m plots every 2 m along the transects using the quadrat sampling method (Kreeb 1983; Coulloudon et al. 1999; Kercher et al. 2003). Estimates were averaged for each transect. Nadir photographs of each estimation site were taken, the position was measured using a hand-held GPS, the vegetation height was measured, and the land use type was recorded.

These field data were subsequently used to validate the image analysis results on the level of cover fractions and, after determining C factors from ground reference data (Sect. 3.3), on the level of C factors; the C factor is the cover and management factor in the Universal Soil Loss Equation (USLE) reflecting the effect of ground cover and management practices on erosion rates. For validation, transect averages were compared with the image analysis results for the corresponding image pixels.

2.4 LiDAR data

Airborne LiDAR data were acquired by NERC with a Leica ALS50 instrument in single-pulse mode (maximum of four returns per given pulse recorded) in August 2011 concurrent with hyperspectral data acquisition. The average flight altitude of 4,200 m resulted in an average point density of 0.7 hits per square metre. The mean error magnitude was 3.3 cm with a standard deviation of 4.1 cm for 2,500 m altitude, with an additional maximum error of 10–15 cm at the edges of the swath due to a systematic roll boresight bias (NERC 2011).

Preprocessing of the LiDAR point clouds was carried out by the Institute for Earth and Environmental Sciences at the University of Potsdam (Bauer 2013) applying LAStools (Martin Isenburg, rapidlasso GmbH, rapidlasso.com). It comprised the classification of the point cloud into ground and non-ground points and the generation of a digital elevation model (DEM; including only ground points) as well as a vegetation height map, both with 4-m spatial resolution. In a further step, the DEM was hydrologically corrected for local pits using Terrain Analysis Using Digital Elevation Models (TauDEM 5.0, hydrology.uwrl.usu.edu/taudem/taudem5.0/index.html).

3 Methods

3.1 Multiple endmember spectral mixture analysis

Spectral mixture analysis (SMA) models the apparent surface reflectance P of an image pixel i as the linear sum of N endmembers weighted by the fraction \( f_{ik} \) of each endmember k within the instantaneous field of view of pixel i (e.g. Adams et al. 1993; Roberts et al. 1998a). That is, for a given wavelength λ:

$$ {P}_{i\lambda }={\displaystyle {\sum}_{k=1}^N{f}_{ik}\ast {P}_{\lambda k}+{e}_{i\lambda }} $$
(1)

The fit of the model is assessed by an error metric based on the residual term \( e_{i\lambda} \), indicating the error between the measured and modelled spectra. The standard error metric for SMA is the root-mean-square error (RMSE) of the residuals for each pixel across all M bands, given by the following:

$$ {\mathrm{RMSE}}_i={\left(\frac{1}{M}{\displaystyle \sum_{\lambda =1}^M{\left({e}_{i\lambda}\right)}^2}\right)}^{1/2} $$
(2)

The modelled fractions are typically constrained by assuming that the physical abundance of the materials present in each pixel sums up to a total of 100 % (Okin et al. 2001):

$$ {\displaystyle {\sum}_{k=1}^N{f}_{ik}=1} $$
(3)
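As a minimal sketch of Eqs. 1 to 3, the following standard-library Python example unmixes a pixel with two endmembers. It is an illustration under stated assumptions, not the implementation used in the study: the function names (`unmix_two`, `rmse`) and the four-band spectra are hypothetical, and the sum-to-one constraint is enforced by substituting the second fraction as one minus the first.

```python
import math


def unmix_two(pixel, em_a, em_b):
    """Least-squares fractions for a two-endmember linear mixture (Eq. 1).

    The sum-to-one constraint (Eq. 3) is enforced by substituting
    f_b = 1 - f_a, which leaves a one-dimensional least-squares problem.
    """
    diff = [a - b for a, b in zip(em_a, em_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("endmembers are identical")
    f_a = sum((p - b) * d for p, b, d in zip(pixel, em_b, diff)) / denom
    return f_a, 1.0 - f_a


def rmse(pixel, endmembers, fractions):
    """Root-mean-square residual over all bands of one pixel (Eq. 2)."""
    total = 0.0
    for band, measured in enumerate(pixel):
        modelled = sum(f * em[band] for f, em in zip(fractions, endmembers))
        total += (measured - modelled) ** 2
    return math.sqrt(total / len(pixel))


# Hypothetical four-band reflectance spectra (illustrative values only).
soil = [0.20, 0.25, 0.30, 0.35]
veg = [0.05, 0.08, 0.45, 0.30]

# Synthetic pixel mixed from 70 % soil and 30 % vegetation, without noise.
pixel = [0.7 * s + 0.3 * v for s, v in zip(soil, veg)]

f_soil, f_veg = unmix_two(pixel, soil, veg)
error = rmse(pixel, [soil, veg], [f_soil, f_veg])
```

Because the synthetic pixel is an exact mixture, the recovered fractions match the mixing proportions and the residual is numerically zero; real image spectra would leave a non-zero RMSE.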

Spectral mixture analysis techniques have been successfully applied for quantifying vegetation cover in dryland areas (e.g. Elmore et al. 2000; McGwire et al. 2000; Asner and Heidebrecht 2002; Peterson and Stow 2003; Bachmann 2007; Gill and Phinn 2009; Numata et al. 2007). In standard SMA approaches, a fixed number of representative endmembers is selected which may not effectively model all elements in the image, or pixels may be modelled by endmembers that do not correspond to the material located in the field of view. As a result, accuracy of the estimated fractions is low (Sabol et al. 1992). The limitations of the SMA approach are particularly problematic in highly heterogeneous landscapes, such as in the Mediterranean, especially at fine spatial scales. A technique that addresses these restrictions is multiple endmember spectral mixture analysis (MESMA), which allows the number and type of endmembers to be varied on a per pixel basis (Roberts et al. 1998b) and thus accounts better for in-class variability. In this study, MESMA was applied to the hyperspectral AISA images to estimate fractional cover for GV, NPV, soil and rock. All endmember spectra were derived from the image data sets. The main advantage of using image endmembers (rather than field or lab spectra) is that they are collected at image scale and are thus easier to correlate with image features (Rashed et al. 2003). The spectral endmember library was set up using VIPER tools (ENVI add-on; www.vipertools.org). The MESMA library for the April data set used in this study included 10 endmembers for the GV class, eight for NPV, five for bare soil and two for rock. For the August data set, 11 endmembers for GV were distinguished, six for NPV, six for bare soil and two for rock (Fig. 2).

Fig. 2

Endmember library setup for Multiple Endmember Spectral Mixture Analysis (MESMA) for the August image mosaic including 11 endmembers for the green vegetation (GV) class (a), six for non-photosynthetic vegetation (NPV) (b), six for bare soil (c) and two for rock (d). Dashed lines indicate mean and dotted lines standard deviation. Blue bars indicate water absorption bands that cannot be used in the analysis

The endmember library was used to estimate the fractional abundance of each class for each pixel in the image. Two-, three- and four-endmember models were applied. To account for variations in illumination and in spectral albedo, a shade endmember was included (i.e. a spectrum with a reflectance of zero in all bands) (Dennison and Roberts 2003). Multiple endmember spectral mixture analysis was run in a partially constrained mode with the following constraints: (a) the minimum and maximum allowable fraction values range between −0.05 and 1.05, meaning that slightly negative fractions and fractions slightly larger than 100 % are acceptable, (b) the shade fraction values have a maximum allowable fraction of 80 % to prevent exclusion of well-fitting models despite a high shade component in the pixel, and (c) a commonly accepted RMSE threshold of 0.025 must be met, following Dennison and Roberts (2003). Each two-, three- or four-endmember model meeting the constraints was evaluated for every single image pixel, selecting the model with the minimum RMSE value (Painter et al. 1998). If no model met the constraints, the pixel was left unmodelled. As a result, an image containing the best-fit model per pixel and the corresponding fractional value of each endmember (i.e. GV, NPV, soil and rock) was produced. Since shade was not considered as a land cover component, the estimated fractions of each pixel were shade-normalized following the procedure of Adams et al. (1993). The modelled fractions were rescaled to range between 0 and 100 % by assuming that the physical abundance of the materials present in each pixel sums up to a total of 100 %.
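The per-pixel model selection described above can be sketched as follows. This is a simplified illustration, not the VIPER tools implementation: it is restricted to two-endmember models (one material plus shade), the function names and the small spectral library are hypothetical, and shade normalization is shown as a simple rescaling of the non-shade fractions, which is an assumption about the procedure.

```python
import math


def fit_with_shade(pixel, endmember):
    """Fit pixel = f * endmember + (1 - f) * shade, with shade = zero spectrum.

    Because shade contributes zero reflectance, the sum-to-one least-squares
    solution reduces to a projection of the pixel onto the endmember.
    """
    f = sum(p * e for p, e in zip(pixel, endmember)) / sum(e * e for e in endmember)
    residual = [p - f * e for p, e in zip(pixel, endmember)]
    rmse = math.sqrt(sum(r * r for r in residual) / len(pixel))
    return f, rmse


def select_model(pixel, library, min_frac=-0.05, max_frac=1.05,
                 max_shade=0.80, max_rmse=0.025):
    """Return (label, fraction, rmse) of the valid model with minimum RMSE.

    Returns None if no model meets the constraints (pixel left unmodelled).
    """
    best = None
    for label, spectrum in library:
        f, rmse = fit_with_shade(pixel, spectrum)
        shade = 1.0 - f
        if not (min_frac <= f <= max_frac):
            continue  # constraint (a): fraction outside the allowed range
        if shade > max_shade or rmse > max_rmse:
            continue  # constraints (b) and (c)
        if best is None or rmse < best[2]:
            best = (label, f, rmse)
    return best


def shade_normalize(fractions):
    """Rescale non-shade fractions so that they sum to one."""
    total = sum(fractions.values())
    return {name: value / total for name, value in fractions.items()}


# Hypothetical four-band library (illustrative values only).
gv = [0.04, 0.08, 0.45, 0.25]
soil = [0.20, 0.25, 0.30, 0.35]
library = [("GV", gv), ("soil", soil)]

shaded_gv = [0.6 * x for x in gv]   # GV pixel with 40 % shade
too_dark = [0.1 * x for x in gv]    # 90 % shade, exceeds the shade limit

best = select_model(shaded_gv, library)
unmodelled = select_model(too_dark, library)
normalized = shade_normalize({"GV": 0.3, "soil": 0.3})
```

In this sketch the shaded GV pixel is assigned to the GV model because the soil model exceeds the RMSE threshold, while the very dark pixel is rejected by the shade constraint and stays unmodelled.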

3.2 Land use classification

Multiple endmember spectral mixture analysis results provide estimates of the cover fractions on a pixel-by-pixel basis independent of the land use class that the respective pixels belong to. In the subsequent C factor estimation, however, a land use class-wise procedure was applied. Therefore, in addition to the MESMA approach, supervised land use classification was performed using the support vector machine (SVM) classifier Image SVM 2.1 (Rabe et al. 2010). SVM classification has shown to be particularly suitable for high-dimensional multi-collinear image data and has the advantage of requiring only small training data sets (Foody et al. 2006; Plaza et al. 2009).

The SVM classification was based on the fused April and August image data sets to account for seasonal vegetation cover changes and hence improve classification performance. Furthermore, prior to SVM classification, principal component analysis (PCA) was applied on the spectral data to reduce noise and increase data variance (Richards 1999). The PCA was applied on the original reflectance data as well as on spectra that were normalized by continuum removal (CR) (Clark and Roush 1984), resulting in a final reduction of the number of bands from 380 to 235 for the reflectance data and 380 to 45 for the CR data. SVM classification was then performed using different input data sets, namely, (i) the reflectance data only, (ii) the CR data only and (iii) reflectance and CR data combined. Subsequently, accuracy was tested for each case. To train the classifier, ~3,000 pixels, corresponding to 0.1 % of the entire data set, were used based on ground reference data representing a total of eight land use classes. Class selection was based on Mueller et al. (2009) in order to compare to previous studies. Classification accuracy was assessed based on ground truth data of approximately 8,000 pixels.

During post-classification, the raster-based land use information was aggregated by majority filtering (kernel size = 7 × 7; weight = 5) and elimination of areas <2,000 m² to create coherent land use classes.

3.3 Cover and management (C) factor mapping based on Revised Universal Soil Loss Equation

The USLE and its modified version, the Revised USLE (RUSLE), are widely used empirical models for assessing long-term averages of soil loss based on the product of six erosion risk factors, namely, the rainfall and runoff factor (R), the soil erodibility factor (K), the slope-length factor (L), the slope-steepness factor (S), the cover and management factor (C) and the support practice factor (P) (Wischmeier and Smith 1978). The C factor is calculated as follows:

$$ C={\displaystyle {\sum}_{i=1}^n\frac{{\mathrm{SLR}}_i\ast {\mathrm{EI}}_i}{\mathrm{EI}}} $$
(4)

where \( {\mathrm{SLR}}_i \) describes the soil loss ratio for time period i, \( {\mathrm{EI}}_i \) represents the rainfall and runoff erosivity during period i, and n is the total number of periods. Thus, each \( {\mathrm{SLR}}_i \) value is weighted by the fraction of rainfall and runoff erosivity (EI) associated with the corresponding time period, and these weighted values are combined into an overall C factor value. In this study, the focus is placed on the spatial and temporal surface cover dynamics and their effect on C factor estimation, whereas the dynamics of rainfall and runoff erosivity were purposely not considered by giving equal weights to all time periods. Therefore, C factor values were estimated by calculating SLR without taking changes in EI into account.
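As a brief worked step under the equal-weights assumption stated above (one reading of the text, in which each of the n periods contributes the same share of erosivity, i.e. \( {\mathrm{EI}}_i/\mathrm{EI}=1/n \)), Eq. 4 reduces to the arithmetic mean of the soil loss ratios:

$$ C=\sum_{i=1}^n{\mathrm{SLR}}_i\ast \frac{1}{n}=\frac{1}{n}\sum_{i=1}^n{\mathrm{SLR}}_i $$

For a single acquisition date treated on its own (n = 1), this gives C = SLR for that date.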

For calculating the individual SLR values, a subfactor approach is introduced in RUSLE that considers several surface characteristics related to surface cover and land use (Renard et al. 1997) based on the work of Laflen et al. (1985) and Weltz et al. (1987). An individual \( {\mathrm{SLR}}_i \) value (0–1) is thus calculated for each time period i as follows:

$$ {\mathrm{SLR}}_i={\mathrm{PLU}}_i\ast {\mathrm{CC}}_i\ast {\mathrm{SC}}_i\ast {\mathrm{SR}}_i\ast {\mathrm{SM}}_i $$
(5)

where \( {\mathrm{SLR}}_i \) is the soil loss ratio for the given conditions, \( {\mathrm{PLU}}_i \) the prior land use subfactor, \( {\mathrm{CC}}_i \) the canopy cover subfactor, \( {\mathrm{SC}}_i \) the surface cover subfactor, \( {\mathrm{SR}}_i \) the surface roughness subfactor and \( {\mathrm{SM}}_i \) the soil moisture subfactor.

The CC subfactor is a function of the fraction of the land covered by canopy \( F_{\mathrm{c}} \) and the effective fall height of raindrops H and is calculated as follows:

$$ \mathrm{C}\mathrm{C}=1-{F}_{\mathrm{c}}\ast {e}^{\left(-0.1\ast H\right)} $$
(6)

The SC subfactor is calculated as follows:

$$ \mathrm{S}\mathrm{C}={e}^{\left[-b\ast {S}_{\mathrm{p}}\ast {\left(\frac{0.24}{R_{\mathrm{u}}}\right)}^{0.08}\right]} $$
(7)

where \( S_{\mathrm{p}} \) is the percentage of land area covered by surface cover, b is the effectiveness of surface cover in reducing soil erosion (empirical coefficient) and \( R_{\mathrm{u}} \) is the random roughness. The SR subfactor is calculated as follows:

$$ \mathrm{S}\mathrm{R}={e}^{\left[-0.66\left({R}_{\mathrm{u}}-0.24\right)\right]} $$
(8)
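Equations 5 to 8 can be combined into a small calculation, sketched below in standard-library Python. The function names and the example inputs (including b = 0.035) are hypothetical and chosen for illustration only. Units are not stated in this section; the sketch assumes that the empirical constants follow the RUSLE handbook convention (Renard et al. 1997), which should be checked before any real use.

```python
import math


def canopy_cover_subfactor(f_c, height):
    """Eq. 6: CC = 1 - F_c * exp(-0.1 * H), with F_c as a fraction (0-1)."""
    return 1.0 - f_c * math.exp(-0.1 * height)


def surface_cover_subfactor(s_p, b, r_u):
    """Eq. 7: SC = exp(-b * S_p * (0.24 / R_u) ** 0.08), S_p in percent."""
    return math.exp(-b * s_p * (0.24 / r_u) ** 0.08)


def surface_roughness_subfactor(r_u):
    """Eq. 8: SR = exp(-0.66 * (R_u - 0.24))."""
    return math.exp(-0.66 * (r_u - 0.24))


def soil_loss_ratio(f_c, height, s_p, b, r_u, plu=1.0, sm=1.0):
    """Eq. 5: SLR = PLU * CC * SC * SR * SM (PLU and SM default to 1)."""
    cc = canopy_cover_subfactor(f_c, height)
    sc = surface_cover_subfactor(s_p, b, r_u)
    sr = surface_roughness_subfactor(r_u)
    return plu * cc * sc * sr * sm


# Hypothetical inputs: bare ground versus a partly covered surface.
slr_bare = soil_loss_ratio(f_c=0.0, height=0.0, s_p=0.0, b=0.035, r_u=0.24)
slr_covered = soil_loss_ratio(f_c=0.4, height=1.5, s_p=40.0, b=0.035, r_u=0.24)
```

With no canopy, no surface cover and the reference roughness of 0.24, every subfactor equals 1 and so does the SLR; adding cover lowers the ratio, consistent with the published range of C factors between 0 and 1.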

Published values of C factors can vary from 0, for example, for woodlands with 100 % ground cover, to 1 for bare soil areas (Pierce et al. 1986).

The C factors were estimated for each ground reference site based on the field data collected and on literature values. Surface and canopy covers as well as vegetation height obtained in the field campaigns were used, and random roughness was estimated based on reference photographs (Renard et al. 1997); PLU and SM were set to 1 (Verstraeten et al. 2002; Schiettecatte et al. 2008), and b was set as a land cover-dependent constant (Renard et al. 1997). This way, the C factor estimation for the ground reference sites used for accuracy assessment was completely independent of the subsequent C factor estimation based on remotely sensed data.

In the remote sensing approach, spatially explicit C factor values were estimated for the land use classes shrubland, arable land and badland (obtained from land use classification; Sect. 3.2), which make up a large part of the study area and are assumed to contribute the largest proportion of sediments to the channel network. For C factor mapping, the estimated ground cover fractions (obtained from MESMA; Sect. 3.1) were linked to the variables of the RUSLE subfactors CC and SC for both dates separately. The fractional ground cover obtained from hyperspectral image analysis does not account for the vertical vegetation distribution, since spectral pixel information is simultaneously affected by the spectral characteristics of canopy and surface vegetation (Guyot et al. 1989). Thus, we linked the MESMA-derived abundances of GV, NPV and rock with the subfactors in three different ways and tested the overall accuracy for each case: (i) GV is assigned to canopy cover (\( F_{\mathrm{c}} \)), NPV and rock to surface cover (\( S_{\mathrm{p}} \)); (ii) GV and NPV are assigned to \( F_{\mathrm{c}} \) and rock to \( S_{\mathrm{p}} \); and (iii) GV is assigned to \( F_{\mathrm{c}} \), NPV to both \( F_{\mathrm{c}} \) and \( S_{\mathrm{p}} \), and rock to \( S_{\mathrm{p}} \). As suggested in Dissmeyer and Foster (1981), we assumed that rock has a positive effect on reducing soil erosion and is therefore assigned to surface cover (\( S_{\mathrm{p}} \)). Vegetation height H is based on the LiDAR-derived height map (Sect. 2.4). The PLU and SM were again set to 1, while \( R_{\mathrm{u}} \) and b were set as land cover-dependent constants based on Renard et al. (1997).
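The three assignment variants can be written down compactly, as in the sketch below. The function name `assign_cover` and the example fractions are hypothetical; the conversion of the surface cover to percent is an assumption made here because Eq. 7 defines the surface cover as a percentage.

```python
def assign_cover(gv, npv, rock, variant):
    """Map MESMA fractions (0-1) to canopy cover F_c (fraction) and
    surface cover S_p (percent) for the three tested variants."""
    if variant == "i":       # GV -> F_c; NPV and rock -> S_p
        f_c, s_p = gv, npv + rock
    elif variant == "ii":    # GV and NPV -> F_c; rock -> S_p
        f_c, s_p = gv + npv, rock
    elif variant == "iii":   # GV -> F_c; NPV -> F_c and S_p; rock -> S_p
        f_c, s_p = gv + npv, npv + rock
    else:
        raise ValueError("variant must be 'i', 'ii' or 'iii'")
    return f_c, 100.0 * s_p


# Hypothetical pixel: 40 % GV, 30 % NPV, 10 % rock (remainder bare soil).
results = {v: assign_cover(0.4, 0.3, 0.1, v) for v in ("i", "ii", "iii")}
```

The resulting pairs can then be passed to the canopy cover and surface cover subfactors, together with the LiDAR-derived vegetation height and the land cover-dependent constants.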

As a result, shrubland, arable and badland areas were mapped by continuous C factor values, while constant C factors adopted from Antronico et al. (2005) and Mueller et al. (2009) were assigned to the remaining land use classes that are assumed to exhibit much less variability across space and time (Table 1). Data gaps remaining after data preprocessing and MESMA were filled by constant C factors per land use class based on MARM (2008).

Table 1 Constant C factors assigned to remaining areas not mapped by continuous C factors

3.4 Index of connectivity

Sediment connectivity was assessed using the index of connectivity (IC) proposed by Borselli et al. (2008) and further adapted to the use of high-resolution digital elevation models by Cavalli et al. (2013). For each cell in the catchment, the IC estimates the upslope component \( D_{\mathrm{up}} \) and the downslope component \( D_{\mathrm{dn}} \) (Fig. 3). The parameter \( D_{\mathrm{up}} \) represents the characteristics of the upslope contributing area and thereby summarizes the potential for downwards routing of the sediment produced upstream. The parameter \( D_{\mathrm{dn}} \) accounts for the characteristics of the flow path from a specific cell to the stream network and hence expresses the probability that sediment arrives at a sink along a flow line. This way, IC provides an estimate of the potential of sediment eroded from the hillslope to be connected to the stream system (López-Vicente et al. 2013), and IC is calculated as follows:

Fig. 3

Definition of upslope and downslope components of the index of connectivity (from Borselli et al. 2008)

$$ {\mathrm{IC}}_k={\log}_{10}\left(\frac{D_{\mathrm{up},k}}{D_{\mathrm{dn},k}}\right)={\log}_{10}\left(\frac{{\overline{W}}_k{\overline{S}}_k\sqrt{A_k}}{{\displaystyle \sum_{i=k}^{n_k}\frac{d_i}{W_i{S}_i}}}\right) $$

where \( \overline{W} \) is the average weighting factor for the upslope contributing area (−), \( \overline{S} \) is the average slope gradient for the upslope contributing area (m m⁻¹), A is the upslope contributing area (m²), \( d_i \) is the length of the ith cell along the downslope path to the sink (m), \( W_i \) is the weight of the ith cell (−), and \( S_i \) is the slope gradient of the ith cell (m m⁻¹). The subscript k indicates that each cell has its own IC value. The IC is dimensionless and defined in the range [−∞; +∞] with connectivity increasing as IC approaches +∞.
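For a single cell, the index can be evaluated as in the following sketch. It is a simplified illustration rather than the workflow used in the study: the function name and all input values are hypothetical, the upslope averages are assumed to be precomputed (for example from a flow accumulation step), and the slope bounds of 0.005 and 1 m m⁻¹ follow the constraint attributed to Cavalli et al. (2013) in this section.

```python
import math


def index_of_connectivity(w_mean, s_mean, area, path_lengths, path_weights,
                          path_slopes, s_min=0.005, s_max=1.0):
    """IC for one cell: log10(D_up / D_dn).

    w_mean, s_mean: mean weight and mean slope of the upslope area
                    (s_mean assumed to be derived from bounded slopes).
    area:           upslope contributing area in square metres.
    path_*:         per-cell length (m), weight and slope along the
                    downslope path from the cell to the sink.
    Weights must be greater than zero to avoid division by zero.
    """
    d_up = w_mean * s_mean * math.sqrt(area)
    d_dn = 0.0
    for d, w, s in zip(path_lengths, path_weights, path_slopes):
        s_bounded = min(max(s, s_min), s_max)  # constrain slope values
        d_dn += d / (w * s_bounded)
    return math.log10(d_up / d_dn)


# Hypothetical cell: 100 upslope cells of 4 m x 4 m, two cells to the channel.
ic = index_of_connectivity(
    w_mean=0.1, s_mean=0.2, area=1600.0,
    path_lengths=[4.0, 4.0],
    path_weights=[0.1, 0.2],
    path_slopes=[0.2, 0.001],  # second slope is raised to the lower bound
)
```

The lower slope bound keeps nearly flat path cells from producing arbitrarily large downslope terms, which would otherwise push the index towards −∞.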

The weighting factor represents the impedance of runoff and sediment fluxes due to ground cover and surface roughness. Borselli et al. (2008) proposed using the C factor as the weighting factor because it is a widely applied parameter that can be explicitly related to observable and measurable characteristics of land use and management. In this study, spatially explicit C factor maps were derived from remotely sensed data (Sect. 3.3) as input in the IC estimation. Furthermore, the LiDAR-derived DEM was used as input to the IC calculation. As proposed by Cavalli et al. (2013), we constrained the slope values between 0.005 and 1 m m⁻¹ and used the multiple flow D-infinity approach (Tarboton 1997), since this approach is better suited to represent divergent flow over hillslopes than the single-flow algorithm applied in the original IC model. Cavalli et al. (2013) introduced two different scenarios for the application of the index, namely, analyzing sediment connectivity across the whole catchment between hillslopes and catchment outlet (“IC outlet”) and analyzing sediment connectivity between hillslopes and main channels (“IC channel”). In this study, IC values were calculated with regard to the main channels, assuming that redistribution processes from the hillslopes to the channels are highly relevant for the overall sediment yield of the catchment and that these are the areas where effective erosion control measures can be applied.

4 Results

4.1 Multiple endmember spectral mixture analysis

More than 95 % of the image pixels for both data sets were successfully modelled. Undefined pixels (1.1 % April; 4.3 % August) resulted from differences between reference and modelled spectra. Four-endmember models were chosen for 77 % (April) and 80 % (August) of the images, including most of the shrublands. Shrubland areas are characterized by a mosaic of patches of green and dry vegetation as well as bare soil and rock smaller than the 4-m GSD of the images. The resulting high spectral variability led to the preference of four-endmember models. Three-endmember models, however, were predominantly chosen by the algorithm for more homogeneous land use types, such as arable land and badlands.

Figure 4 shows a subset of the August image with the cover fractions derived using MESMA for GV, NPV, soil and rock. The vegetation fraction (GV and NPV) makes up the largest part of the study area. The fractions of NPV appear scattered, notably on shrubland and meadow areas. Shrubland areas are dominated by NPV in April (40 %) followed by GV (23 %), whereas in August, GV dominates (also 40 %) followed by NPV (27 %). For meadows, the fractional cover of NPV is similar in April and August (33 and 29 %, respectively), while it increases for GV (47 to 58 %). The arable land is mainly covered by GV in April (61 %), while fractional covers of bare soil and NPV (15 and 61 %, respectively) dominate in August, indicating residue cover after harvest. Coniferous forests in the north are modelled with high abundances of GV for both dates (71 and 79 %). High abundances of NPV in deciduous forests in April (73 %) can be explained by dry leaves, while in August, the canopies turn green, and hence, the fractional cover of GV dominates (74 %).

Fig. 4

Fractional cover of green vegetation (GV) (a), non-photosynthetic vegetation (NPV) (b), soil fraction (c) and rock (d) for a subset of the August image mosaic. High abundances of the cover classes are indicated by dark shades and low abundances by brighter shades, while black pixels indicate that the cover class is not present. Additionally, selected model complexity of Multiple Endmember Spectral Mixture Analysis (MESMA) (e), land cover resulting from support vector machine (SVM) classification (f) and the original image in true colours (g) are shown for the same subset

Accuracy was assessed in two ways based on field estimates, firstly on the estimated dominant ground cover class per pixel and secondly on the estimated fractional cover per pixel for both image mosaics. For the April image mosaic, the dominant ground cover fraction per pixel resulted in an overall accuracy of 65.2 %, with the best results obtained for GV (83.3 %) and the poorest for rock (16.7 %). The latter was mainly confused with the soil cover fraction. Estimated GV abundances provided the best results (R² = 0.70, RMSE = 0.16); soil and rock cover fraction prediction was poor (R² = 0.22, RMSE = 0.26 and R² = 0.23, RMSE = 0.19, respectively). Generally, accuracy for land use classes with high vegetation cover abundances is high for GV and NPV and low for soil and rock cover fractions (Fig. 5). Furthermore, if the soil cover fraction is underestimated, NPV is overestimated. However, land use classes with high fractional abundances of bare soil or rock showed a reversed behaviour.
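The two accuracy measures can be illustrated with the sketch below. The function names and the reference and estimated values are hypothetical, and R² is computed here as the squared Pearson correlation, which is an assumption because the exact definition is not stated in this section.

```python
import math


def dominant_class(fractions):
    """Cover class with the largest fraction in one pixel or transect."""
    return max(fractions, key=fractions.get)


def overall_accuracy(reference, estimated):
    """Share of sites where the dominant reference and estimated classes agree."""
    hits = sum(dominant_class(r) == dominant_class(e)
               for r, e in zip(reference, estimated))
    return hits / len(reference)


def rmse(ref, est):
    """Root-mean-square error between reference and estimated fractions."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))


def r_squared(ref, est):
    """Squared Pearson correlation (assumed definition of R squared)."""
    n = len(ref)
    mean_r, mean_e = sum(ref) / n, sum(est) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(ref, est))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov * cov / (var_r * var_e)


# Hypothetical transect averages (reference) and pixel estimates.
reference = [{"GV": 0.6, "soil": 0.4}, {"GV": 0.2, "soil": 0.8}]
estimated = [{"GV": 0.7, "soil": 0.3}, {"GV": 0.55, "soil": 0.45}]

accuracy = overall_accuracy(reference, estimated)
gv_rmse = rmse([r["GV"] for r in reference], [e["GV"] for e in estimated])
```

In this toy case the dominant class agrees at one of the two sites, so the overall accuracy is 50 %, while the RMSE summarizes the size of the fraction errors regardless of which class dominates.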

Fig. 5

Reference cover fractions vs estimated cover fractions using Multiple Endmember Spectral Mixture Analysis (MESMA) for the following: green vegetation (GV) (a/e), non-photosynthetic vegetation (NPV) (b/f), soil (c/g) and rock (d/h) for April (left column) and August (right column). Solid lines indicate 1:1 line, dashed lines 10 % deviation and dotted lines 20 % deviation

For the August image mosaic, overall accuracy of the estimated dominant ground cover fraction per pixel is slightly lower (57.9 %). The best results were obtained for the soil cover fraction (100 %), the poorest again for the rock fraction (16.7 %). Class confusion appeared mainly between the covered fractions (GV and NPV) and the uncovered fractions (soil and rock). Estimated GV abundances for all reference data provided the best results (R² = 0.63, RMSE = 0.20), while rock cover fraction prediction was poor (R² = 0.19, RMSE = 0.23). Overall accuracy for all land use classes, except shrubland, over all ground cover fractions is good (mean error <20 %) (Fig. 5).
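The two accuracy measures reported above can be sketched in a few lines of Python. This is only an illustration: the function names and the four sample plots are hypothetical, and the use of the squared Pearson correlation for R² is an assumption, since the exact definition is not stated in this section.

```python
import math

CLASSES = ("GV", "NPV", "soil", "rock")

def dominant(fractions):
    """Return the cover class with the largest fraction for one pixel."""
    return CLASSES[max(range(len(CLASSES)), key=lambda i: fractions[i])]

def overall_accuracy(reference, estimated):
    """Share of pixels whose dominant class matches the reference."""
    hits = sum(dominant(r) == dominant(e) for r, e in zip(reference, estimated))
    return hits / len(reference)

def rmse(ref, est):
    """Root-mean-square error between reference and estimated fractions."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))

def r_squared(ref, est):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(ref)
    mean_r, mean_e = sum(ref) / n, sum(est) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(ref, est))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov ** 2 / (var_r * var_e)

# Hypothetical field plots: (GV, NPV, soil, rock) fractions per pixel.
reference = [(0.6, 0.2, 0.1, 0.1), (0.1, 0.5, 0.3, 0.1),
             (0.0, 0.1, 0.3, 0.6), (0.2, 0.2, 0.5, 0.1)]
estimated = [(0.5, 0.3, 0.1, 0.1), (0.2, 0.4, 0.3, 0.1),
             (0.1, 0.1, 0.5, 0.3), (0.1, 0.3, 0.5, 0.1)]

gv_ref = [p[0] for p in reference]
gv_est = [p[0] for p in estimated]
print("overall accuracy:", overall_accuracy(reference, estimated))
print("GV RMSE:", rmse(gv_ref, gv_est), "GV R2:", r_squared(gv_ref, gv_est))
```

In this made-up sample, the rock-dominated plot is estimated as soil-dominated, which mirrors the kind of rock/soil confusion described above.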

4.2 Land use classification

Eight classes were distinguished in the land use classification: shrubland, arable land, rock, bare soil, deciduous forest, coniferous forest, meadow and badland (Fig. 4f). The best overall accuracy of 88 % was obtained using a combination of reflectance and CR spectra in the SVM classification. Land use classes with expected high vegetation cover provided high accuracies (84–94 %), while land use classes with high abundances of bare soil or rock tended to get confused with other classes (63–74 %), particularly with shrubland. Shrubland and coniferous forests constitute the dominant land use types (49 and 28 %, respectively) in the study area, while badlands and bare soil areas make up 2 and 1 %, respectively. Meadow/pasture (8 %), deciduous forest (8 %) and arable land (5 %) have approximately equal shares.

4.3 C factor mapping based on Revised Universal Soil Loss Equation

Estimated ground cover fractions were assigned to the variables of the RUSLE subfactors. The best results were achieved when assigning GV to canopy cover (F_c), and NPV and rock together to surface cover (S_p). However, accuracy was higher for land use classes with high vegetation cover (i.e. shrubland/arable land) than for land use classes with low vegetation cover (i.e. badland).
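The assignment described above can be illustrated with a minimal Python sketch. It assumes the commonly cited RUSLE subfactor forms from Renard et al. (1997), a canopy cover subfactor CC = 1 − F_c · exp(−0.1 · H) and a surface cover subfactor SC = exp(−b · S_p · (0.24 / R_u)^0.08), and it sets the remaining subfactors (prior land use, surface roughness, soil moisture) to 1. The function name, the default parameter values and the sample fractions are hypothetical and are not the values used in this study.

```python
import math

def c_factor(gv, npv, rock, fall_height_ft=1.6, b=0.035, roughness_in=0.24):
    """Approximate RUSLE C factor for one pixel from cover fractions (0-1).

    Assumptions (illustrative only):
    - GV is used as canopy cover, NPV + rock as surface cover.
    - fall_height_ft (H, feet), b and roughness_in (R_u, inches) are
      placeholder values, not calibrated for any study area.
    - Prior land use, roughness and soil moisture subfactors are set to 1.
    """
    f_c = gv                    # canopy cover as a fraction
    s_p = 100.0 * (npv + rock)  # surface cover in percent
    cc = 1.0 - f_c * math.exp(-0.1 * fall_height_ft)
    sc = math.exp(-b * s_p * (0.24 / roughness_in) ** 0.08)
    return cc * sc

# Hypothetical pixels: a densely covered one and a sparsely covered one.
print("dense cover :", round(c_factor(0.5, 0.3, 0.1), 3))
print("sparse cover:", round(c_factor(0.1, 0.1, 0.0), 3))
print("bare        :", c_factor(0.0, 0.0, 0.0))
```

Because surface cover enters through an exponential, NPV and rock reduce the C factor more strongly than the same fraction of canopy cover in this formulation, which is one reason the choice of assignment matters.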

The obtained C factor maps for April and August are presented in Figs. 6 (subset) and 7 (entire study area). Shrubland, arable land and badlands are mapped by continuous C factor values derived from the proposed remotely sensed approach, while spatially and temporally constant C factors (Table 1) were assigned to the remaining land use classes as well as to pixels excluded during the image analysis process. Badlands and bare soil areas exhibit the highest C factor values for both dates. There is no change or only a slight increase in C factor values for most areas (mainly shrubland and badland areas) between April and August, indicating an increase in erodibility, whereas for some areas (mainly arable land), a slight decrease in C factor values was found, indicating a lower erodibility. On average, C factors slightly increase from 0.11 (April) to 0.14 (August) for Villacarli and 0.09 (April) to 0.10 (August) for Carrasquero (Table 2).

Fig. 6

C factor map for subset (same as in Fig. 4) for April (a) and August (b)

Fig. 7

C factor map for the entire study area for April (a) and August (b) and the change from April to August (c)

Table 2 Statistics on the C factors and index of connectivity (IC) values obtained for the Villacarli and Carrasquero subcatchments

These observations are in line with the distribution curves of C factors per land use class (shrubland, arable land, badland) and date (April, August) in Fig. 8. The distribution curves differ in their range and shape, representing the spatial and temporal variability of C factor values within these three land use classes. The C factor distribution for badland shows two frequency maxima near 0.02 and 0.7 and a flat curve shape, whereas arable land and shrubland are characterized by steep distribution curves with frequency maxima between 0.01 and 0.1. Mean C factor values for badland for both dates are higher (0.35) in comparison to the land use classes arable land and shrubland (0.15 each).

Fig. 8

Distribution curves of the C factors by land use type for April and August

The correlation between reference and modelled C factors was high for the August image mosaic (Fig. 9) (R² = 0.71). The low mean absolute error (MAE = 0.09) and root-mean-square error (RMSE = 0.11) indicate good model prediction. In contrast, correlation was poor (R² = 0.04) for the April image mosaic. However, overall accuracy is acceptable as the mean error is <20 %. With increasing C factors, the modelled C factors are consistently underestimated relative to the reference C factors, particularly in the land use classes arable land and badland.

Fig. 9

Reference C factors vs estimated C factors for the land use classes arable land, shrubland and badland for April (a) and August (b)

4.4 Index of connectivity

The spatially explicit IC was calculated for the entire Villacarli and Carrasquero subcatchments and gives an estimate of how sediment sources and the stream network are connected and how connectivity changes between April and August (Fig. 10). Differences between the two dates can be attributed to different input values of the weighting factor W (C factor), whereas the other input data remain the same for both dates. A change in one cell will have an effect on flow path values for all upstream IC calculations and on the contributing area values for all downstream IC calculations, and hence, there are only a few areas (mainly forested areas in the upstream parts of the catchments) that show no changes in IC between April and August.
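How a change of W in a single cell propagates can be illustrated on one flow path. The sketch below assumes the commonly cited form of the index of Borselli et al. (2008), IC = log10(D_up / D_dn), with D_up = mean(W) · mean(S) · sqrt(A) over the upslope contributing area and D_dn = Σ d_i / (W_i · S_i) along the downslope path. The one-dimensional hillslope, the cell size, the slope and W values, and the convention that a cell counts towards both its own upslope and downslope component are illustrative assumptions; a real application would rely on flow routing over a terrain model.

```python
import math

def index_of_connectivity(W, S, cell_size=4.0):
    """IC for every cell of a 1-D hillslope draining from index 0 to the channel.

    W: weighting factor per cell (here: hypothetical C factors)
    S: slope per cell in m/m
    Simplification: cell k belongs to both its upslope area and its downslope path.
    """
    n = len(W)
    ic = []
    for k in range(n):
        mean_w = sum(W[:k + 1]) / (k + 1)
        mean_s = sum(S[:k + 1]) / (k + 1)
        area = (k + 1) * cell_size ** 2           # upslope contributing area (m2)
        d_up = mean_w * mean_s * math.sqrt(area)  # upslope component
        d_dn = sum(cell_size / (W[i] * S[i]) for i in range(k, n))  # downslope component
        ic.append(math.log10(d_up / d_dn))
    return ic

slope = [0.2] * 5
w_april = [0.02, 0.02, 0.05, 0.30, 0.30]  # hypothetical values
w_august = list(w_april)
w_august[2] = 0.10                        # cover loss in the middle cell only

ic_april = index_of_connectivity(w_april, slope)
ic_august = index_of_connectivity(w_august, slope)
for k, (a, b) in enumerate(zip(ic_april, ic_august)):
    print(f"cell {k}: April {a:.2f}, August {b:.2f}, change {b - a:+.2f}")
```

Although W changes in only one cell, every cell on the path shows a different IC: cells above it through the downslope component, cells below it through the upslope component.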

Fig. 10

Connectivity map for the entire study area for April (a) and August (b) and the change from April to August (c)

As expected, the highest connectivity values are found close to the channels and in areas with sparse vegetation, while there are also some parts of the catchments that seem to be hardly connected to the channel network. Most parts of the catchments show an increase in connectivity from April to August with the largest changes in badland and shrubland areas. Areas experiencing a decrease in connectivity are mainly related to arable land and meadow/pasture, for example, in the north-western part of Villacarli. The general increase in connectivity from April to August is also reflected in the average IC values for both catchments (Table 2). When comparing both catchments, Villacarli is characterized by a higher average connectivity that can be attributed to the topographic characteristics and the distribution of C factors. The distribution curves of the IC values by subcatchment and date (Fig. 11) exhibit higher frequencies between −10 and −8 as well as between −4 and 0 for Villacarli as compared to Carrasquero, whereas higher frequencies are observed for Carrasquero for the range −8 to −4. This pattern is found for April as well as for August, indicating that catchment topography and land cover characteristics have a greater influence on the IC value distribution than seasonal differences in vegetation cover. For both dates, IC values between −4 and 0 are related to badland and bare soil areas close to river channels, whereas IC values between −10 and −8 are related to forests.

Fig. 11

Distribution curves of the index of connectivity (IC) values by subcatchment for April and August

5 Discussion

Many authors have shown that not only the extent of vegetation, but also the spatial configuration of vegetated and bare areas, affects the redistribution of resources in semi-arid areas (Ludwig et al. 2005; Puigdefábregas 2005; Turnbull et al. 2008). Vegetation patterns can be regarded as a structural factor remaining static during a storm event (Reaney et al. 2014). Over longer time periods, however, vegetation density and its spatial distribution may change, resulting from disturbances such as grazing, fire or deforestation, but also in response to resource flows creating patches or banded vegetation typical for many semi-arid hillslopes. In consequence, the long-term effectiveness of vegetation patches to obstruct flows and retain water and soil resources within semi-arid landscapes may also change (Ludwig et al. 2005). Apart from long-term changes in vegetation density and patterns, seasonal changes in vegetation cover may also affect the redistribution of resources and connectivity at hillslopes during the course of a year. In this study, information on the spatial patterns and temporal changes of vegetation cover was derived from airborne hyperspectral data acquired in April and August 2011 in two subcatchments having an overall size of 70 km². Unlike broadband sensors, the hyperspectral sensors used in this study, such as AISA, record spectral information in many narrow contiguous bands and thus allow relative abundances of material components on the surface to be derived by unmixing pixel spectra (Goetz 2009). The MESMA approach applied in this study was found to be particularly suitable for deriving abundances of vegetation and soil in heterogeneous Mediterranean landscapes predominantly covered by shrublands that are characterized by high spectral variability within the surface classes (Elmore et al. 2000; McGwire et al. 2000; Bachmann 2007). Shrublands make up nearly half of the study area (49 %, Sect. 4.2) and change across the area from nearly complete to patchy coverage. Vegetation patches in Mediterranean shrublands are typically in the order of 1 m² or less in area, and since standard aerial photographs and high-resolution satellite images are also in this order of spatial resolution, they may be indispensable for characterizing vegetation patterns in sufficient detail for describing ecohydrological processes (Lesschen et al. 2008; Muñoz-Robles et al. 2012). Yet, standard aerial photographs and high-resolution satellite images without spectral information in the shortwave infrared range do not allow discrimination among photosynthetically non-active (i.e. dry) vegetation, GV components and bare soil; hence, mapping of total plant cover is limited. However, dry vegetation components make up a comparably large proportion of the overall vegetation cover in Mediterranean landscapes and thus have an influence on water and soil fluxes that should not be neglected (De Jong and Epema 2006). The MESMA approach proposed in this study accounts for subpixel heterogeneity by unmixing the spectral pixel information. The resulting fraction cover map does not, however, provide the exact location of vegetation patches and inter-patches within a pixel, but it gives the relative abundances of green and dry vegetation, bare soil and rock and achieves accuracies similar to comparable studies (e.g. Bachmann 2007). Under- or overestimation of fractions can mainly be explained by errors in reference data estimation and locational inaccuracies, by the choice of insufficiently pure endmembers and incorrect unmixing model parameterization, by non-linear mixing effects not captured by the linear assumption of SMA and by the challenging study area (high spectral variability and rough terrain).
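The principle behind MESMA, testing several endmember combinations per pixel under a linear mixing assumption and keeping the best-fitting plausible model, can be sketched as follows. This is a strongly simplified illustration restricted to two-endmember models with a sum-to-one constraint. The four-band spectra, the library and the function names are hypothetical, and the sketch does not reproduce the model complexity, shade handling or thresholds used in this study.

```python
import math

def unmix_two(pixel, em_a, em_b):
    """Least-squares fraction of em_a in pixel = f*em_a + (1 - f)*em_b."""
    diff = [a - b for a, b in zip(em_a, em_b)]
    frac_a = sum((p - b) * d for p, b, d in zip(pixel, em_b, diff)) / sum(d * d for d in diff)
    model = [frac_a * a + (1.0 - frac_a) * b for a, b in zip(em_a, em_b)]
    rmse = math.sqrt(sum((p - m) ** 2 for p, m in zip(pixel, model)) / len(pixel))
    return frac_a, rmse

def mesma_two_endmember(pixel, library):
    """Try all endmember pairs from different classes; keep the lowest-RMSE
    model whose fractions lie between 0 and 1."""
    best = None
    for i, (class_a, em_a) in enumerate(library):
        for class_b, em_b in library[i + 1:]:
            if class_a == class_b:
                continue
            frac_a, rmse = unmix_two(pixel, em_a, em_b)
            if 0.0 <= frac_a <= 1.0 and (best is None or rmse < best[0]):
                best = (rmse, {class_a: frac_a, class_b: 1.0 - frac_a})
    return best

# Hypothetical four-band reflectance spectra, several per class where available.
library = [
    ("GV",   [0.05, 0.08, 0.45, 0.20]),
    ("GV",   [0.04, 0.10, 0.55, 0.25]),
    ("NPV",  [0.15, 0.25, 0.35, 0.40]),
    ("soil", [0.20, 0.30, 0.38, 0.45]),
]

# Synthetic pixel: 70 % of the first GV spectrum mixed with 30 % soil.
pixel = [0.7 * g + 0.3 * s for g, s in zip(library[0][1], library[3][1])]
rmse, fractions = mesma_two_endmember(pixel, library)
print(fractions, "RMSE:", rmse)
```

The result gives fractions per pixel but no information on where within the pixel each component lies, which is the limitation noted above.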

Apart from the subpixel derivation of fraction cover using MESMA, land use classification using SVM was performed on the same hyperspectral bi-temporal image mosaics. The SVM is particularly suited to high-dimensional imagery with limited training data (Plaza et al. 2009) and resulted in high overall classification accuracy (88 %). The shrubland class was often confused with other classes due to its high spectral variability with vegetation patches smaller than the image pixel size of 4 m. The land use classification result was used in the subsequent class-wise estimation of C factors based on the RUSLE approach.

The USLE/RUSLE is an empirical model for assessing long-term averages of sheet and rill erosion originally developed for agricultural land in the USA. It does not explicitly consider runoff or individual erosion processes of detachment, transport and deposition. Despite the empirical character and partly erroneous results, the model is widely applied for soil loss estimation. In this study, we solely used the cover and management factor of RUSLE that is based on subfactors for explicitly incorporating quantitative information on cover fractions and land management practices (Renard et al. 1997). It also allows for the differentiation of time-variant and time-invariant C factors, depending on the application and study area (pasture/rangeland vs agricultural land). However, most studies on catchment-wide soil erosion mapping still use annually and spatially averaged C factors per land use class based on published values, which do not reflect the large spatial variability (e.g. shrubland) or seasonal change (e.g. arable land) in the cover and management factor. To account for this spatial and temporal variability, remote sensing data are increasingly employed to estimate C factor values. Often, spectral ratios such as the normalized difference vegetation index (NDVI) are used as indicators of photosynthetically active vegetation (e.g. Wu et al. 2004; Kouli et al. 2009), while there are only a few studies on mapping erosion potential for mesoscale catchments that consider seasonal changes in land cover. Some recent studies relate cover fractions derived from remote sensing analyses to C factor values (De Asis and Omasa 2007; Meusburger et al. 2010). In this study, we proposed spatially explicit C factor mapping based on cover fractions linked to RUSLE’s canopy (F_c) or surface cover (S_p) subfactors, which also takes non-photosynthetically active vegetation as well as other factors (e.g. vegetation height) into account. 
The MESMA-derived cover fractions were assigned to the subfactors in three different ways, and overall accuracy was tested for all three approaches. In our study, highest accuracy was obtained when assigning GV to canopy cover (F_c), and NPV and rock together to surface cover (S_p), and hence, we applied this approach to all land use types. Accuracy was higher for land use classes with high vegetation cover (i.e. shrubland/arable land) than for land use classes with low vegetation cover (i.e. badland), indicating that the assignment is not universally transferable among study areas but needs adaptation depending on the type and distribution of vegetation cover present in the area. Alternatively, a land use-dependent assignment could be used. In this way, C factors were mapped in a spatially explicit manner for the three land use types (shrubland, arable land and badland) that together have a 56 % share (Sect. 4.2) of the study area and exhibit the greatest spatial and seasonal dynamics. They are also expected to contribute the largest amount of sediment to the channel network, which is underpinned by the results of a spectral fingerprinting of sources of suspended sediments (Brosinsky et al. 2014). They found for the same study area that badlands were always the major sources; forests and grasslands contributed little and other sources (not further determinable, including arable land and shrubland) up to 40 %. For the land use types shrubland, arable land and badland, C factors between April and August changed in different ways (Sect. 4.3), justifying the use of time-varying C factors as compared to annual averages. Other land uses, such as pasture, change very slowly with time, and hence, annual average C factors may be adequate (Renard et al. 1997). The spatially and temporally averaged constant C factors taken from the literature (Table 1) fit the spatially explicit C factors derived from the image analysis to different extents (Fig. 8). 
For badland, a constant C factor of 1 is assumed, whereas C factors derived from the image analysis vary between 0.0 and 0.9, with the majority of values at 0.02 in April and 0.7 in August, indicating a decrease in vegetation cover from April to August. Similarly, C factors for shrublands show a slight increase from April to August with the majority of values at 0.02 and 0.04, respectively, while the constant value derived from the literature for shrublands in similar studies is in the same range (0.031). The constant C factor for arable land (0.25) obtained from the literature seems to be an average annual value representative of the seasonal changes in C factors in arable land. C factors for arable land derived from the image analysis vary between 0.0 and 0.9, with frequency peaks at 0.01/0.08 in April and at 0.04 in August. Although most fields are harvested before August and should therefore be expected to exhibit high C factors at that time, for most arable land the C factor seems to decrease from April to August. This indicates that total vegetation cover increases, which can be explained by crop residues left after harvest that protect against soil erosion (López-Vicente et al. 2008).

For the calculation of IC, a weighting factor represents how water and sediment fluxes are obstructed by ground cover and surface roughness. The weighting factor should be chosen depending on the characteristics of the study area. While Cavalli et al. (2013) propose using a Roughness Index based on a digital terrain model for their alpine (largely unvegetated) study area where fluxes mainly depend on topography, Borselli et al. (2008) propose using the C factor of RUSLE for regions where vegetation cover and land use management play an important role for sediment fluxes, such as in our study area. One has to bear in mind that in this study, a temporal change in the rainfall and runoff erosivity was purposely not considered, so as to focus on the C factor changes resulting from surface cover dynamics. The increase in potential erosion risk from the increase in connectivity in August could, however, be counterbalanced by the decrease in rainfall erosivity in the summer months, since precipitation maxima generally occur in spring and autumn in this region.

The resulting connectivity map shows that areas behave very differently with regard to connectivity, depending not only on the land cover but also on the spatial distribution of vegetation abundances and topographic barriers. Most parts of the catchment show higher connectivity values in August as compared to April (Sect. 4.4), indicating a generally lower vegetation cover in August and hence higher C factors and higher erosion potential. However, some areas are characterized by a decrease in connectivity, which can often be related to an increase in total vegetation cover from April to August on arable land. The two studied subcatchments have slightly different connectivity behaviour (Figs. 10 and 11) that mainly reflects the different topography and land cover proportions and their spatial configuration. This is in line with results from suspended sediment measurements (Francke et al. 2014) and spectral fingerprinting (Brosinsky et al. 2014) showing how sediment yields and sediment sources differ between the subcatchments.

6 Conclusions

This work has demonstrated the potential of high spectral resolution imagery for a catchment-wide bi-temporal mapping of vegetation abundance on a subpixel basis. Different from broadband imagery, this approach enabled the discrimination of both dry and green vegetation components, which together influence soil erosion processes and sediment fluxes. It is expected that this information can improve erosion model parameterization, which often builds on annually and spatially averaged empirical values. In this work, we derived spatially explicit RUSLE C factors based on airborne hyperspectral and LiDAR data as input in a connectivity assessment.

Knowledge of the spatial pattern of connectivity and its change over time is essential for sound land and water resource management and for understanding the potential environmental effects of induced changes (Lexartza-Artza and Wainwright 2009). For Mediterranean landscapes with heterogeneous vegetation cover, soil erosion potential will be better represented if connectivity, and hence the spatial distribution of sediment generation and transport, is taken into account (Sougnez et al. 2011). The IC (Borselli et al. 2008) applied in this work accounts for the topographical sequence of landscape properties and barriers. It is based on the ratio of hydrological distance to the stream network and the potential occurrence of upstream runoff. The index (i) may support the identification of hot spot connectivity areas in order to take actions to reduce or favour connectivity; (ii) may support assessing the effect of land use changes (e.g. due to land abandonment), land management practices and erosion control measures on soil erosion and sediment transport; and (iii) may improve understanding of the consequences of varying types of connectivity by incorporating connectivity information in soil erosion models.

The Isábena catchment was chosen as a study area because it experiences high erosion and sediment delivery rates, while connectivity effects are assumed to play an important role. Badlands are mainly found in the Villacarli and Carrasquero subcatchments (6 and 2 % of their total area, respectively) and, to a lesser degree, in the other three subcatchments of the Isábena basin (López-Tarazón et al. 2012). Furthermore, the Isábena catchment has been intensively monitored and studied during the past 10 years, including modelling of water and sediment transport using the process-based, spatially semi-distributed modelling framework WASA-SED (Mueller et al. 2010; Bronstert et al. 2014). Future work will include the incorporation of sediment connectivity information in the model to better reflect connectivity processes.

While this study is based on bi-temporal airborne data, advances in satellite remote sensing hold the prospect of quantitative, spatially explicit, catchment-wide derivation of surface information useful for connectivity analysis. These advances include a continuous increase in spatial image resolution to cover processes at the patch/inter-patch scale, an increase in temporal resolution to cover seasonal and long-term changes, and new multi-sensor missions enabling the simultaneous retrieval of various surface properties. Furthermore, upcoming hyperspectral satellite sensors, such as EnMAP, will provide high spectral resolution observations on a frequent and global basis that will allow the retrieval of biophysical surface parameters as input for hydrological catchment models.