1 Introduction

Approximately one million ha of water and soil conservation practices are implemented and maintained annually within the framework of the Watershed Management Program in Iran. Mechanical measures in waterways include the construction of many check dams and small dams for stabilizing the longitudinal slope of tributaries, flood control, water supply, and aquifer recharge, as well as for increasing the discharge of Qanats (systems that transport water from an aquifer or water well to the surface through an underground aqueduct). These dams serve as sites where sediments (bed load and suspended load transported from upstream) are captured and subsequently accumulate.

According to the Guidelines of Watershed Management Studies in Iran, sediment production in small watersheds and tributaries is estimated with the PSIAC semi-quantitative scoring model (PSIAC 1968), which involves nine factors that affect erosion and sediment production (de Vente 2009). Arabkhedri et al. (2018) conducted a national survey of reservoir sedimentation in 74 small dams and found conflicting results between the observed and PSIAC-estimated amounts, even in relatively similar adjacent drainage basins. For example, in small adjacent basins in the Loess area of Golestan Province (having similar rainfall, soil, land use, vegetation, and topography, and therefore similar erosion potential), the survey found more than 10-fold differences in the area-specific sedimentation of the reservoirs, whereas the corresponding PSIAC estimates differed by only 2.7-fold. This highlights the need to pay closer attention to the uncertainties associated with the routing of sediment through watersheds (Owens 2020) and with the sediment delivery ratio (SDR) component of models that estimate sediment yield.

The SDR is obtained by dividing the catchment sediment yield by the total upstream gross erosion (GE) and represents the fraction of eroded material delivered to the basin outlet. Despite the theoretical elegance of the SDR, experimental SDR models have not been very successful so far, for a variety of reasons. In particular, they have a lumped structure and a black box nature (Walling 1983), and the conventional experimental models such as USLE (Wischmeier and Smith 1978) that were used in some previous studies (e.g., Williams 1977; Diodato and Grauso 2009; Mirakhorlo and Rahimzadegan 2020) are unable to quantify the actual gross erosion components within a watershed, including sheet, rill, gully, river bank, and mass movements. The widespread use of remote sensing and geographic information system techniques (Borselli et al. 2008; Foerster et al. 2014) has led to the development of some distributed SDR models (Ferro and Porto 2000).

The term “hydrologic connectivity”, introduced in recent years (Borselli et al. 2008), is an emerging concept in hydrology and geomorphology for quantifying the status and spatial distribution of runoff and sediment flow from upstream to downstream (Bracken et al. 2013; Sidle et al. 2017). It can be described at multiple scales, including hillslope connectivity, hillslope-channel connectivity, and connectivity between multiple river systems, among others (Bracken et al. 2015). The impact of any barrier to runoff and sediment flow varies with its size and location in the watershed (Fryirs et al. 2007). As a whole, multiple hydrogeomorphic processes occurring within a catchment affect its connectivity. For example, as the watershed scale increases, the main sink area shifts from the foothills toward the flood plains, which in turn affects the amount of sediment exported from the watershed (de Vente et al. 2005; Sidle et al. 2017). Therefore, given the variability of the hydrogeomorphic parameters throughout a catchment, a spatially distributed connectivity approach may provide a more accurate SDR than lumped models. As the review by Najafi et al. (2021) showed, only a few studies (five out of 117 reviewed papers), such as Vigiak et al. (2012) and Heckmann and Vericat (2018), have used the sediment connectivity concept to estimate the SDR. It has been emphasized that applying a sediment connectivity index reduces the complexity and data requirements of current erosion and sedimentation models (Najafi et al. 2021).

In order to quantify connectivity at the watershed scale, Borselli et al. (2008) proposed a quantitative, distributed index of connectivity (IC) computed in a GIS environment. Drawing upon topographic, land cover, and landscape data, they inferred that this index enables researchers to determine the erosion hot spots within a watershed. Sougnez et al. (2011) studied the correlation between the area-specific sediment yield (SSY) of basins and two different metrics of IC. The area-specific number of connected pixels showed a significant positive relationship (P < 0.05), while IC at the outlet did not show a meaningful correlation with the SSY.

Cavalli et al. (2013) modified the IC by applying high-resolution topographic data from LiDAR. Higher spatial resolution provides detailed information on catchment characteristics and improves the accuracy of the generated map (Borselli et al. 2008; Najafi et al. 2017). Others, however, found that a very high-resolution DEM is not suitable because of numerous disconnections (Lisenby and Fryirs 2017). Despite the attractiveness of a high-resolution spatial IC, its computational processing time should be evaluated when studying large river basins (Heckmann and Vericat 2018). Moreover, the absence of such data in many areas (Najafi et al. 2021) largely limits its use. The above literature review indicates that, notwithstanding numerous attempts to improve the algorithm, one of the existing gaps is the paucity of comparisons between the connectivity index and the observed sediment yield at the basin outlet. To shed more light on the delivery problem, and since many of the factors that affect the SDR are analogous to those that affect sediment yield (de Vente et al. 2007), it is reasonable to examine the relationship between sediment yield and sediment connectivity, particularly in basins that are similar in terms of erosion processes and features. In addition, it remains unclear what minimum resolution of the input layers is needed to produce an accurate IC estimate when data are scarce and processing time must be saved.

The present research has three main objectives:

  i. To assess the IC in small watersheds with similar erosion potential;

  ii. To investigate the strength of the relationship between IC and actual SSY, in order to gain a better understanding of sediment delivery than that offered by conventional models;

  iii. To evaluate comparatively the effect of different input data resolutions on the trend of the calculated IC.

2 Materials and methods

2.1 Study area

The study area includes 11 sub-watersheds of the Shour-Dareh basin in Golestan province, Iran (Fig. 1). The average altitude of this basin is 544 m above mean sea level, and the mean annual precipitation and temperature are 585 mm and 17.6 °C, respectively. Each study sub-watershed has a small earth dam at its outlet, where the transported sediment was deposited (Fig. 1D). Table 1 shows the characteristics of the sub-watersheds. As can be seen, they vary in size (48 to 498 ha) and shape; however, they are nearly identical in terms of lithology, soil, climate, slope, land use, vegetation, and erosion features. Therefore, it can be expected that the erosion rates of these sub-watersheds are close to each other and that variations in their SSY are mostly due to differences in connectivity and sediment delivery to the outlet.

Fig. 1 Location map of Shour-Dareh sub-watersheds (A, B, C) and a view of sedimentation behind a dam (D)

Table 1 Some characteristics of 11 Shour-Dareh sub-watersheds, Golestan, Iran

The surface lithology of the Shour-Dareh basin is mainly Loess. The region is characterized by a Mediterranean climate according to the de Martonne aridity index (Ghorbani et al. 2016). There are three dominant land uses: shrubbery (8%), rangeland (65%), and dry farming (27%). The erosion map, produced from Google Earth satellite imagery and a field survey (Parsamehr et al. 2014), shows that sheet and rill erosion are dominant. All dams were constructed in 2005. They are 8 m high, except for the dam in SW7, which is 9 m high. A topographic survey was carried out inside the reservoirs during the dry season of 2014, 9 years after construction. The sedimentation volume was then calculated by subtracting the resurveyed capacity from the original reservoir capacity. In the next step, the average bulk density of sediment samples taken from profiles in each reservoir was calculated in order to convert sediment volume to mass. The observed annual area-specific SSY ranges from 1.80 to 19.8 t ha−1 year−1. The SSY estimated with the PSIAC scoring model (PSIAC 1968) is presented in the penultimate column of Table 1. The range of the estimated values is much narrower than that of the measured values, and no trend is seen between these two data sets.
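The conversion from surveyed reservoir capacities to an annual area-specific SSY described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name and all numerical inputs are hypothetical, and the sketch assumes that every tonne of delivered sediment is retained behind the dam.

```python
def specific_sediment_yield(original_capacity_m3, resurveyed_capacity_m3,
                            bulk_density_t_m3, area_ha, years):
    """Return the area-specific sediment yield in t ha^-1 year^-1.

    Assumption: all sediment reaching the reservoir is trapped, so the
    loss of storage capacity equals the deposited sediment volume.
    """
    if area_ha <= 0 or years <= 0:
        raise ValueError("area and period must be positive")
    # Sediment volume = original capacity minus resurveyed capacity
    sediment_volume_m3 = original_capacity_m3 - resurveyed_capacity_m3
    # Volume to mass with the mean bulk density of the deposit
    sediment_mass_t = sediment_volume_m3 * bulk_density_t_m3
    # Normalize by drainage area and by the length of the record
    return sediment_mass_t / (area_ha * years)


# Hypothetical values, chosen only to illustrate the arithmetic
ssy = specific_sediment_yield(
    original_capacity_m3=50_000.0,
    resurveyed_capacity_m3=41_000.0,
    bulk_density_t_m3=1.3,
    area_ha=100.0,
    years=9,
)
print(f"SSY = {ssy:.1f} t/ha/year")
```

With these illustrative inputs, 9000 m3 of deposit at 1.3 t m−3 over 100 ha and 9 years gives about 13 t ha−1 year−1, which lies within the range observed in the study area.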

2.2 Estimations of connectivity index

Sediment connectivity is an important part of the current research and is briefly described in this section. The IC is calculated using the following equation, which has an upslope (Dup) and a downslope (Ddn) component (Borselli et al. 2008).

$$\mathrm{IC}=\log_{10}\left(\frac{D_{\mathrm{up}}}{D_{\mathrm{dn}}}\right)$$
(1)

Figure 2 graphically illustrates the factors involved in the two components Dup and Ddn and the details of how they are measured and calculated. For each reference pixel on the map, an upslope contributing basin and a downslope path to the outlet are defined. The Dup factors, including area (m2), weighting factor, and slope (m m−1), are obtained from the former, and the Ddn factors, including slope (m m−1), distance to the outlet (m), and weighting factor, are obtained from the latter. A lower limit of 0.005 m m−1 and an upper limit of 1 m m−1 for the slope gradient were applied in order to avoid unrealistic results of the IC model in Eq. 1 (Cavalli et al. 2014).

Fig. 2 Schematic illustration of the upslope and downslope components of connectivity for a reference pixel in watershed scale (adapted from Borselli et al. 2008)
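As a hedged illustration of Eq. 1 for a single reference pixel, the sketch below uses the component definitions commonly attributed to Borselli et al. (2008): Dup as the product of the mean weighting factor, the mean slope, and the square root of the upslope contributing area, and Ddn as the sum of d/(W·S) over the cells of the downslope path. All function names and input values are hypothetical; the slope limits are those stated in the text.

```python
import math

S_MIN, S_MAX = 0.005, 1.0  # slope limits (m/m) stated in the text


def clamp_slope(s):
    """Restrict a slope gradient to the interval [S_MIN, S_MAX]."""
    return min(max(s, S_MIN), S_MAX)


def upslope_component(weights, slopes, cell_area_m2):
    """D_up = mean(W) * mean(S) * sqrt(A) over the upslope cells."""
    n = len(weights)
    mean_w = sum(weights) / n
    mean_s = sum(clamp_slope(s) for s in slopes) / n
    area_m2 = n * cell_area_m2  # upslope contributing area
    return mean_w * mean_s * math.sqrt(area_m2)


def downslope_component(path):
    """D_dn = sum of d_i / (W_i * S_i) along the flow path to the outlet.

    `path` holds (length_m, weight, slope) per cell; weights are assumed > 0.
    """
    return sum(d / (w * clamp_slope(s)) for d, w, s in path)


def connectivity_index(weights, slopes, cell_area_m2, path):
    """IC = log10(D_up / D_dn), as in Eq. 1."""
    d_up = upslope_component(weights, slopes, cell_area_m2)
    return math.log10(d_up / downslope_component(path))


# Hypothetical pixel: two 30-m upslope cells and a ten-cell path to the outlet
ic = connectivity_index(
    weights=[0.4, 0.4],
    slopes=[0.25, 0.25],
    cell_area_m2=900.0,
    path=[(30.0, 0.4, 0.25)] * 10,
)
print(f"IC = {ic:.3f}")
```

A longer or gentler downslope path increases Ddn and therefore lowers IC, while a larger or steeper contributing area raises it.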

Topographic factors, including the upslope area, slope gradient, and downslope flow path, are extracted from the DEM. For this area, a DEM with a 30-m spatial resolution (900 m2 cell size) was available. To hydrologically correct the raw DEM data, commonly used pre-processing algorithms (e.g., filling sinks) were first applied (Jarihani et al. 2015). The corrected DEM was then used to derive flow directions, flow paths, and contributing areas, as well as to delineate sub-catchments. Topographically derived flow paths are areas where surface water would likely concentrate within the catchment and hence provide important information on sediment connectivity.

The parameter W, which represents the local impedance of the land surface to runoff and sediment transport, is usually obtained from vegetation data. In the present research, following the approach of Borselli et al. (2008), later modified and tested by Cavalli et al. (2013), W was set equal to the vegetation factor (C) of the Universal Soil Loss Equation. The C-factor is calculated from the normalized difference vegetation index (NDVI) as per Eq. 2 (Jarihani et al. 2015), where the NDVI is derived from the visible and near-infrared light reflected by vegetation. The C-factor ranges from 0 to 1, where C = 0 represents a dense vegetation cover that protects the soil surface against raindrop impact and C = 1 denotes bare soil (Wischmeier and Smith 1978; Renard et al. 1997; Borselli et al. 2008).

$$C=\frac{1-\mathrm{NDVI}}{2}$$
(2)
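Eq. 2 can be illustrated for a single pixel as follows. The NDVI is computed here with its standard definition from near-infrared and red reflectance; the function names and reflectance values are hypothetical and only serve to show the mapping from NDVI to the C-factor.

```python
def ndvi(nir, red):
    """Standard NDVI from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen for this sketch: no signal, NDVI = 0
    return (nir - red) / total


def c_factor(ndvi_value):
    """Eq. 2: maps NDVI in [-1, 1] linearly onto C in [1, 0]."""
    return (1.0 - ndvi_value) / 2.0


# Hypothetical reflectances for one pixel
pixel_ndvi = ndvi(nir=0.3, red=0.1)
pixel_c = c_factor(pixel_ndvi)
print(f"NDVI = {pixel_ndvi:.2f}, C = {pixel_c:.2f}")
```

Because the mapping is linear, it can also be inverted: a C-factor of 0.388 corresponds to an NDVI of 1 − 2 × 0.388 = 0.224.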

In this study, Landsat 8 satellite images (spatial resolution, 30 m) were used to map vegetation cover. As vegetation varies over the year, choosing an appropriate acquisition date is an important challenge. Given that most erosive rainfall events in Golestan province, and the consequent floods and erosion, are likely to occur in late summer and early autumn (Arabkhedri et al. 2010), when the vegetation cover percentage is low, a satellite image from the same period was sought. Hence, all the available images from this period were pre-processed and quality checked. Finally, an image acquired on 3 November 2016, which had the minimum cloud cover and the highest accuracy, was selected for the NDVI calculation.

Subsequently, based on the input layers and Eq. 1, an IC map was generated using the Connectivity Toolbox for ArcGIS 10.2.2 (Cavalli et al. 2014). Four IC statistics/metrics were then extracted for each sub-watershed: the average, the median, the area-specific number of connected pixels, and the IC at the outlet. These are referred to as the IC30 statistics/metrics.
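The four statistics/metrics could be extracted from the pixel values of one sub-watershed along the following lines. This is only a sketch under stated assumptions: a pixel is treated as connected when its IC exceeds a threshold of zero (an assumption suggested by the later reference to positive pixels), and the function name and pixel values are hypothetical.

```python
from statistics import mean, median


def ic_metrics(ic_values, outlet_ic, area_ha, threshold=0.0):
    """Summarize the IC pixels of one sub-watershed.

    Assumption: a pixel counts as connected when IC > threshold.
    """
    connected = sum(1 for v in ic_values if v > threshold)
    return {
        "average": mean(ic_values),
        "median": median(ic_values),
        "connected_per_ha": connected / area_ha,
        "outlet": outlet_ic,
    }


# Hypothetical pixel values for a very small basin
metrics = ic_metrics(
    ic_values=[-4.2, -3.9, -3.5, 0.4, 1.1],
    outlet_ic=1.1,
    area_ha=2.0,
)
print(metrics)
```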

At this stage, the relationship between the observed and estimated SSYs and the IC30 statistics/metrics of the 11 sub-watersheds was examined based on their correlation. Assuming that the study catchments are approximately uniform in erosion potential, a high correlation between IC and SSY would indicate that the IC is suitable for improving sediment estimation models. In the next step, the correlation of the factors involved in Eq. 1 (including area, basin slope, main waterway length, and vegetation) with the SSY and with the most appropriate IC30 statistic/metric was determined and interpreted.
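The correlation analysis described above rests on the Pearson correlation coefficient, which can be sketched in plain Python as follows. The helper name and the paired values are hypothetical placeholders, not the data of Table 1.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (average IC, SSY) pairs for a handful of basins
avg_ic = [-4.5, -4.2, -3.9, -3.6, -3.3]
ssy = [2.0, 6.0, 5.0, 12.0, 19.0]
r = pearson_r(avg_ic, ssy)
print(f"r = {r:.2f}")
```

The same function can be applied to each IC metric and to each factor of Eq. 1 in turn, with SSY as the second sample.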

After that, both input layers (i.e., DEM and NDVI) were resampled from their original pixel size (30 m) to coarser grid sizes of 50, 100, and 200 m. Then, the average IC50, IC100, and IC200 were calculated for the 11 sub-watersheds.

Finally, the correlation of SSY with IC50, IC100, and IC200 was calculated in the same way as for IC30, and the trend was examined. Little to no change in the correlation coefficient would indicate that coarser-resolution input data (> 30 m) are still appropriate for producing an IC map that can explain the wide range of SSY at the basin outlets.

3 Results and discussion

3.1 C-factor and slope maps

The USLE C-factor map of the study sub-watersheds was derived from the Landsat 8 satellite image based on the NDVI, and a percentage distribution histogram of this variable was then plotted in 10 classes. According to the histogram, although the C-factor values range from 0.235 to 0.658, about 98% of the pixels belong to four classes that range between 0.350 and 0.450, and the values largely follow a normal distribution. The overall average C-factor for the region is 0.388. Among the sub-watersheds, SW2 had the highest mean (0.402) and SW5 the lowest (0.379), which indicates the similarity of all basins in terms of vegetation cover.

Based on the slope map of sub-watersheds, approximately 63% of the study area falls into classes ranging from 0.12 to 0.36 m m−1. The average slope of the study area is 0.259 m m−1 and the lowest and highest mean slopes were observed in SW2 (0.221 m m−1) and SW1 (0.290 m m−1), respectively, which indicates that the sub-watersheds have similar slopes.

3.2 Sediment connectivity index map

The sediment connectivity index map (30-m resolution) of the studied sub-watersheds, with a magnified view of their outlet areas, is shown in Fig. 3A. The values of this index at the basin outlets and along the main streams are higher than in the upland areas. This implies an increased likelihood of sediment transport in the main channels rather than in the tributaries. Despite the wide range of IC from −6.907 to 2.313 (Fig. 3C), approximately 99% of the pixels fall in four classes that range between −5.3 and −1.7, and ~70% of the pixels fall in the class −4.4 to −3.5 (Fig. 3B).

Fig. 3 IC30 map of 11 sub-watersheds with magnification in their outlet area (A), frequency distribution chart for IC30 (B), and box plot of IC30 by sub-watersheds (C)

3.3 Relationship between sediment yield and sediment connectivity index

Correlation analysis of the four independent statistics/metrics obtained from the IC30 map with the dependent variable (SSY) showed that all of them have a significant relationship. The highest correlation coefficient was obtained for the average IC (0.84), followed by the median IC (0.81), both significant at the 1% level. The correlation coefficients of the area-specific positive pixels (0.69) and of the IC at the outlet (0.56) were still significant at the 5% level. This comparison indicates the superiority of the average IC and median IC for expressing the SSY of the studied sub-watersheds. The median IC is most likely to work better in basins where the values of the IC pixels do not follow a normal distribution. Sougnez et al. (2011) also showed that the area-specific positive pixels metric had a higher correlation with SSY than the IC at the outlet.

Based on the above comparisons, the average ICs of the study catchments were used in the subsequent analyses. The linear correlation coefficients between the average IC30 as the dependent variable and the four factors involved in Eq. 1 (mean C-factor, mean slope, area, and main waterway length) as independent variables in the 11 sub-watersheds were 0.54, −0.38, −0.63 (sig < 0.05), and −0.74 (sig < 0.01), respectively, while the corresponding correlation coefficients of the annual SSY with these factors were 0.25, −0.34, −0.62 (sig < 0.05), and −0.47. Comparison of these correlation coefficients shows that the direction of the correlations is the same for both dependent variables; however, their ranking by strength (from high to low) is different. The correlations of the independent variables (especially area and slope) with the IC are stronger than the corresponding correlations with the SSY. The direction of the correlation of slope with SSY (although not statistically significant) is contrary to what was expected. This can be attributed to the interaction of area and main waterway length with slope (Gregory and Walling 1973).

Figures 4A, B, C, D, and E depict the scatter plots and best-fitted regression lines (only for significant correlations) of the observed SSY vs. average IC30, area, main waterway length, mean slope, and C-factor, respectively. Considering Fig. 4A together with the average values of IC30 (Fig. 3C) and the observed SSY data (Table 1), it can be seen that the lowest SSY (1.8 t ha−1 year−1) corresponds to the lowest IC value (−4.485), in SW1, and the highest SSY (19.8 t ha−1 year−1) corresponds to the highest IC value (−3.278), in SW6.

Fig. 4 Scatter diagram of SSY vs. IC30 (A) and four other independent variables (B, C, D, E) in Shour-Dareh sub-watersheds

The relationship between the annual SSY and the average IC was found to be linear, with R2 = 0.71 (Fig. 4A). In other words, the average connectivity index obtained from the Landsat 8 image with 30-m spatial resolution and the 30-m DEM explains 71% of the SSY variation, which is very satisfactory compared with the R2 of the relationships between the SSY and the four components involved in Eq. 1 (Fig. 4B, C, D, and E; R2 slope = 0.12; R2 area = 0.33; R2 waterway length = 0.50; and R2 C-factor = 0.06). Taking into account the similar erosion potential of the study sub-watersheds, the high coefficient of determination indicates that the IC is better able to explain the large sediment variations among watersheds. In comparison, no relation was found between the PSIAC estimates and IC30. Despite the strong correlation between the observed SSY and IC30 in this study, as Najafi et al. (2021) state, more studies are needed to attain a comprehensive conceptualization of sediment yield using sediment connectivity.
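For a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation coefficient, so the reported R = 0.84 and R2 = 0.71 are mutually consistent (0.84² ≈ 0.71). The sketch below fits a least-squares line and computes R2 from the residuals; the function name and the sample points are hypothetical and do not reproduce the data of Fig. 4A.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical (average IC, SSY) points
slope, intercept, r2 = fit_line([-4.5, -4.0, -3.5, -3.0], [2.0, 8.0, 10.0, 20.0])
print(f"SSY = {intercept:.1f} + {slope:.1f} * IC, R2 = {r2:.2f}")

# Consistency check between the reported R and R2
print(round(0.84 ** 2, 2))
```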

3.4 Evaluating the effect of resolutions

Figure 5 compares the trend of the four ICs, which differ in pixel size (30, 50, 100, and 200 m), for the 11 sub-watersheds. As can be seen, the two small sub-watersheds 6 and 11 are inconsequential at the 200-m grid size. There are two reasons for this: first, the small size of these watersheds (40 and 39 ha, respectively) and, second, their elongated shape, so that at least part of each large 200-m grid cell falls outside the basin and is therefore eliminated. It should be noted that, according to the IC computational algorithm (Cavalli et al. 2014), the border pixels of the basins are not included in the IC calculation. This effect also caused a large difference in the IC variation trends of the two elongated sub-watersheds 2 and 9, which are larger than sub-watersheds 6 and 11.

Fig. 5 The influence of four different grid sizes of the input maps (30, 50, 100, and 200 m) on sediment connectivity indices for 11 Shour-Dareh sub-watersheds

As a whole, excluding sub-watersheds 2, 6, 9, and 11, the IC shows an increasing trend as the pixel size increases, which agrees with the results of Cantreul et al. (2018) and Zanandrea et al. (2020). The reason for this trend, in addition to the removal of peripheral pixels, is most likely the smoothing of topographic shapes (such as rivers, hills, etc.) and the subsequent decrease in slope with increasing pixel size (Zanandrea et al. 2020). In other words, increasing the pixel size from 30 to 200 m reduces the representation of terrain features (topographic roughness, vegetation) and eliminates disconnections (local sinks) along the sediment flow path, so that connectivity increases.
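The smoothing effect of coarser pixels on local slope can be illustrated with a one-dimensional elevation profile. This is a simplified sketch: block averaging by an integer factor is assumed as the resampling method (the resampling method actually used is not specified here), and the profile values are hypothetical.

```python
def block_mean(values, factor):
    """Aggregate consecutive groups of `factor` cells by their mean."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


def mean_abs_slope(elevations, spacing_m):
    """Mean absolute gradient (m/m) between neighbouring cells."""
    rises = [abs(b - a) for a, b in zip(elevations, elevations[1:])]
    return sum(rises) / (spacing_m * len(rises))


# Hypothetical rough profile sampled every 30 m (elevations in m)
profile_30m = [100.0, 106.0, 101.0, 107.0, 102.0, 108.0, 103.0, 109.0]
profile_60m = block_mean(profile_30m, factor=2)

fine = mean_abs_slope(profile_30m, spacing_m=30.0)
coarse = mean_abs_slope(profile_60m, spacing_m=60.0)
print(f"30 m: {fine:.3f} m/m, 60 m: {coarse:.3f} m/m")
```

In this example the local roughness is averaged out at the coarser spacing, so only the gentle overall gradient of the profile remains.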

Comparison of the correlations between the observed SSY and the ICs of the 11 sub-watersheds obtained from different pixel sizes shows that the correlation weakens as the pixel size increases. It can be inferred that larger pixels (100 and 200 m) should not be used, at least in small watersheds.

4 Conclusions

This study investigated the correlation between four statistics/metrics of IC (at a 30-m resolution) and the measured annual SSY of 11 sub-watersheds in the Shour-Dareh basin of Golestan province, Iran. According to the results, the SSY showed strong correlations with most of the IC30 statistics/metrics, particularly the average and median IC30. Among all metrics, the average IC30 showed the strongest relationship (R = 0.84), which was much stronger than the correlation of SSY with the factors involved in the IC equation (area, slope, main waterway length, and vegetation density). Given the relative similarity between these basins (considering the factors affecting erosion), the average IC30 was able to explain 71% of the 11-fold difference in sediment production. The developed regression equation can be used for sediment estimation in small similar basins in the Loess region of Golestan province. Nevertheless, it should be noted that the IC value must be estimated separately for the upstream watershed of each desired outlet. Considering the advantages of the IC over experimental lumped models in accurately identifying and mapping small basins and polygons where sediment delivery and sediment connectivity are high, it is recommended to update the erosion and sedimentation part of the Guidelines of Watershed Management Studies in Iran by adding the IC procedure. In other words, the IC, with its low data requirements and its simple calculation based on GIS and RS capabilities, provides a robust framework for replacing the outdated lumped SDR relationships.

This study also calculated the connectivity indices for coarser pixel sizes (up to 200 m) through resampling and found that a larger pixel size weakened the relationship of the IC with the SSY (specifically in basins of up to several square kilometers). Larger pixels are likely to be better suited to larger basins. Clearly, the larger the pixel size, the lower the computational load, which is necessary to accelerate work in large basins.

One of the limitations of this study was the lack of a high-resolution DEM (resolution finer than 30 m) for the study area, so the results of a study conducted with pixels smaller than 30 m might differ from those reported here. Therefore, further investigation is required to determine the correlation trend of higher-resolution ICs with the observed SSY.