1 Introduction

Soil erosion has become a global concern for sustainable livelihoods and is recognized as a serious problem throughout the world. The erosion process is accelerated by the direct and indirect effects of human activities. In highland soils, erosion seriously reduces soil fertility, aggregate stability, mean weight diameter and hydraulic conductivity (Celik 2005; Chen et al. 2011; Demirci and Karaburun 2012; Paudel et al. 2014). The complex and dynamic process of soil erosion happens in two stages: the first stage involves the detachment of soil particles and the second stage transports the detached particles away, driven either by natural or by anthropogenic factors (Morgan 2005). Generally, cultivated areas exhibit higher erosion (Brown 1984). The erosion process can be triggered by natural agents such as climate change and tectonic activity, by human activities, or by a combination of these (Bocco 1991; Cantón et al. 2011).

Fig. 1 Location of the study area.

Geographical Information System (GIS) is an indispensable tool for addressing various environmental problems, among which the prediction of soil erosion is foremost. Several soil erosion models can be embedded in the GIS platform. GIS is known as a powerful tool for the collection, storage, management and retrieval of a multitude of spatial and non-spatial data in order to derive useful outputs (Srivastava et al. 2012, 2013; Lilhare et al. 2015). Over the years, a number of soil erosion models have been developed and successfully applied, such as the Universal Soil Loss Equation (USLE) (Wischmeier and Smith 1978), the Water Erosion Prediction Project (WEPP) (Flanagan and Nearing 1995), the Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998) and the European Soil Erosion Model (EUROSEM) (Morgan et al. 1998). Among these models, an empirical model such as USLE (Wischmeier and Smith 1978) is one of the most applicable and practicable for identifying yearly soil loss at large scale as well as for evaluating the management practices applied to control soil erosion. Process-based models, on the other hand, can only calculate soil loss when rigorous data and computational requirements are met (Lim et al. 2005). The contribution of vegetation cover to mitigating soil erosion is considerably higher than that of the other factors in the field. In general, soil loss is negatively related to vegetation canopy (Elwell and Stocking 1976; González-Botello and Bullock 2012). Vegetation cover absorbs the kinetic energy of raindrops and thereby reduces soil erosion to some extent. As vegetation keeps the soil surface porous, it also increases infiltration capacity and thereby reduces surface runoff (Baver 1956).

The effect of the vegetation cover and management factor (C) on soil loss is difficult to quantify accurately, particularly for a large study area. Generally, the C factor is estimated from literature and field data by simply assigning a specific C value to a specific vegetation type (cover classification method) (Jürgens and Fander 1993; Folly et al. 1996). However, this technique gives a C factor that is identical over a large area and is unable to represent the distinguishing features of vegetation at regional scale (Wang et al. 2002). The joint sequential co-simulation method has been used to generate a C factor map based on point values combined with Landsat TM images (Gertner et al. 2002). However, fixing appropriate sampling points for the interpolation is quite difficult and costly. The normalized difference vegetation index (NDVI) offers another way to generate a C factor map, through regression analysis. The correlation between NDVI and the C factor has not been satisfactory because NDVI is sensitive to the vitality of vegetation (De Jong 1994; Tweddale et al. 2000). In spite of these issues, the NDVI technique is still well recognized throughout the world (De Jong et al. 1999; Van der Knijff 1999; Hazarika and Honda 2001; Melesse et al. 2001; Lu et al. 2003; Najmoddini 2003; Cartagena 2004; Symeonakis and Drake 2004; Lin et al. 2006). Another technique, Spectral Mixture Analysis (SMA) of Landsat ETM satellite imagery, has been used as an alternative for estimating the C factor (De Asis and Omasa 2007). In SMA, the fractions of both ground cover and bare land are taken into account, which implies a relatively better estimation of soil erosion.

Based on the aforementioned issues, this study has two objectives: (1) to derive a vegetation cover and management factor (C) based on eight experimental plots on the hillslope near the Guthrie Corridor Expressway (GCE) and (2) to estimate the average annual soil loss.

Table 1 Hydrological and physical properties of soils of the study area.

2 Materials and methods

2.1 Study area

This study is carried out on the hillslope near the Guthrie Corridor Expressway (GCE), Kuala Selangor, Malaysia (latitude \(3^{\circ }13'12.40''\)–\(3^{\circ }13'27.30''\)N; longitude \(101^{\circ }30'29.30''\)–\(101^{\circ }30'50.21''\)E). The altitude and slope steepness range from 45 to 75 m and from 50 to 115%, respectively. The average annual precipitation is about 2663.34 mm and the mean maximum and minimum temperatures are 38 and \(26^{\circ }\hbox {C}\), respectively. There are eight square plots, of \(64\, \hbox {m}^{2}\) \((8\times 8\, \hbox {m})\) and \(25\, \hbox {m}^{2}\) \((5\times 5\, \hbox {m})\), situated at two different locations as shown in figure 1. The plots are named as follows: Natural Bare Microbes (NBM), Natural Bare Non Microbes (NBNM), Planted Less Dense Microbes (PLDM), Planted Less Dense Non Microbes (PLDNM), Natural Dense Non Microbes (NDNM) and Natural Less Dense Non Microbes (NLDNM) are situated at location 1, and Natural Dense Microbes (NDM) and Natural Less Dense Microbes (NLDM) are situated at location 2. Table 1 presents detailed information about these plots. The name given to each plot is based on two criteria, i.e., the level of vegetation cover density and whether or not the plot is microbe-treated. A large variety of plant species, viz., weeds, grasses, ferns and brush, with heights ranging from 0.5 to 2.0 m, form the vegetation canopy in the plots. The infiltration capacity ranges from 3.18 to \(7.62\, \hbox {mm h}^{-1}\), which implies that all plots fall under Groups B to A+ of the hydrologic soil group (HSG) classification, i.e., moderately well-draining to well-draining soils. In terms of soil texture, the experimental plots are classified under Groups 2 and 3, whose soil texture varies from fine to coarse granular.

Fig. 2 Flow chart of methodology.

Table 2 Comparison of rainfall erosivity factor (R) value.

2.2 Methods

The RUSLE model, as applied by Kanungo and Sharma (2014), is employed to assess the soil erosion scenario in this study. The RUSLE equation is expressed as:

$$\begin{aligned} A=R\times K\times LS\times C\times P \end{aligned}$$
(1)

where A is the soil loss (\(\hbox {t ha}^{-1}\,\hbox {yr}^{-1}\)); R is the rainfall erosivity factor (\(\hbox {MJ mm ha}^{-1}\, \hbox {h}^{-1}\, \hbox {yr}^{-1}\)); K is the soil erodibility factor (\(\hbox {t h MJ}^{-1}\,\hbox {mm}^{-1}\)); LS is the slope length and steepness factor (dimensionless); C is the vegetation cover and management factor (dimensionless), and P is the support practice factor (dimensionless). The overall methodology of this study is presented schematically in figure 2, and the following subsections describe the preparation of the RUSLE parameters in detail.
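
As a minimal sketch of equation (1), the product of the five factors can be written as a small Python function. The function name `rusle_soil_loss` and the example values for K, LS and C are hypothetical placeholders chosen for illustration; only R (the value adopted in this study) and P = 1 come from the text.

```python
def rusle_soil_loss(r, k, ls, c, p=1.0):
    """Return average annual soil loss A (t ha^-1 yr^-1) from equation (1)."""
    for name, value in (("R", r), ("K", k), ("LS", ls), ("C", c), ("P", p)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return r * k * ls * c * p


# Illustrative inputs only: R is the value adopted in this study; K, LS and C
# are made-up placeholders, not measured plot values. P defaults to 1.
a = rusle_soil_loss(r=2323.25, k=0.05, ls=2.0, c=0.1)
print(f"A = {a:.2f} t/ha/yr")
```

Because the factors are simply multiplied, halving any single factor (for example C, through denser vegetation) halves the predicted soil loss.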

2.2.1 Rainfall erosivity factor (R)

The erosive potential of a given rainstorm at a specific geographic location is represented by the rainfall erosivity factor (R) (Kinnell 2014). R is the product of the event kinetic energy (E) and the maximum 30 min rainfall intensity (\(I_{30}\)) (Wischmeier and Smith 1978). As the temporal resolution of the available precipitation data is not sufficient to calculate \(\hbox {EI}_{30}\) for the study area, R is calculated from the mean annual precipitation (P). Several formulas based on mean annual rainfall have been developed throughout the world for this purpose. This study uses three different empirical equations, developed by Roose (1977), Morgan (2005) and Teh (2011), to determine the R factor. The annual precipitation of the study area from 2000 to 2013 varies from 2237 mm (minimum, recorded in 2002) to 3048 mm (maximum, recorded in 2008). Based on the average annual precipitation, the R factor is calculated with the three equations as shown in table 2. Following Mir et al. (2010), the best estimate is taken as the average of the two R values calculated by the Morgan and Roose equations, which gives \(2323.25\, \hbox {MJ mm ha}^{-1}\hbox {h}^{-1}\,\hbox {yr}^{-1}\). This final value is used as the R factor in this study.

2.2.2 Soil erodibility factor (K)

The soil erodibility factor (K) describes the inherent erodibility of soil. It is a measure of the susceptibility of soil to being detached and transported by precipitation and surface runoff. The K factor can be determined experimentally by integrating several soil characteristics such as texture, structure, organic matter content and permeability. The nomograph can also be used for calculating the K value; however, it significantly overestimates the K factor for the Malaysia soil series. Therefore, the K factor for the study area is determined from observation data and an analytical relationship developed for the Malaysia soil series, as follows:

$$\begin{aligned} K= & {} [1.0\times \left( {10^{-4}} \right) M^{1.14}\left( {12-OM} \right) \nonumber \\&+4.5\left( {s-3} \right) +\,8.0\left( {p-2} \right) ]/100 \end{aligned}$$
(2)

where K is the soil erodibility (\(\hbox {t h MJ}^{-1}\,\hbox {mm}^{-1}\)), OM is the percentage of organic matter, s is the soil structure code, p is the permeability class and M is the particle size parameter, calculated as follows:

$$\begin{aligned} {M}= & {} (\% \hbox {silt} + \% \hbox {very fine sand})\nonumber \\&\times (100 - \%\hbox {clay}). \end{aligned}$$
(3)
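
Equations (2) and (3) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the example inputs are invented to show the call pattern, not measured properties of the study plots.

```python
def particle_size_parameter(silt_pct, very_fine_sand_pct, clay_pct):
    """Equation (3): M = (%silt + %very fine sand) * (100 - %clay)."""
    return (silt_pct + very_fine_sand_pct) * (100.0 - clay_pct)


def soil_erodibility(m, om_pct, structure_code, permeability_class):
    """Equation (2): K from M, organic matter (%), structure code s and
    permeability class p."""
    return (
        1.0e-4 * m ** 1.14 * (12.0 - om_pct)
        + 4.5 * (structure_code - 3)
        + 8.0 * (permeability_class - 2)
    ) / 100.0


# Invented example inputs (not plot data): 30% silt, 10% very fine sand,
# 20% clay, 2% organic matter, structure code 3, permeability class 2.
m = particle_size_parameter(30.0, 10.0, 20.0)
k = soil_erodibility(m, om_pct=2.0, structure_code=3, permeability_class=2)
print(f"M = {m:.0f}, K = {k:.4f}")
```

With s = 3 and p = 2 the structure and permeability terms vanish, so K then depends only on texture and organic matter.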

2.2.3 Slope length and steepness factor (LS)

The combined effects of slope length (L) and slope steepness (S) on soil erosion can be expressed as a single index, the slope length and steepness factor (LS) (Wischmeier and Smith 1978). Numerically, the LS factor is the ratio of soil loss at the site to that from a corresponding plot with 22.1 m length and 9% slope.

$$\begin{aligned} LS=\left( {X/{22.1}} \right) ^{m}\left( {0.065+0.045S+0.0065S^{2}} \right) \!. \end{aligned}$$
(4)

As shown by equation (4), the LS factor depends on two parameters: X is the slope length (m) and S is the slope steepness in percent. These parameters are derived from a Digital Elevation Model (DEM). The X value is computed by multiplying the flow accumulation by the cell value derived from the DEM, after performing the FILL and flow direction processes in GIS.

$$\begin{aligned} X=\text {Flow accumulation}\times \text {cell value}. \end{aligned}$$
(5)

By substituting X value, LS equation can be expressed as

$$\begin{aligned} LS= & {} \left( \text {Flow accumulation}\times \frac{ \text {cell value}}{22.1}\right) ^{m}\nonumber \\&\times (0.065+0.045S+0.0065S^{2}). \end{aligned}$$
(6)
Table 3 m-value for slope length and steepness factor (Wischmeier and Smith 1978).
Table 4 Ground and classified images of plots.

The value of m varies from 0.2 to 0.5 depending on the slope as shown in table 3.
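
Equations (4) and (5) can be sketched in Python as below. The function names are hypothetical. The slope thresholds in `slope_exponent` are an assumption based on commonly quoted Wischmeier and Smith (1978) classes and should be checked against table 3; an explicit `m` can be passed instead.

```python
def slope_exponent(slope_pct):
    """Exponent m by slope class (ASSUMED thresholds; verify with table 3)."""
    if slope_pct >= 5.0:
        return 0.5
    if slope_pct >= 3.0:
        return 0.4
    if slope_pct >= 1.0:
        return 0.3
    return 0.2


def slope_length(flow_accumulation, cell_value_m):
    """Equation (5): X = flow accumulation * cell value."""
    return flow_accumulation * cell_value_m


def ls_factor(x_m, slope_pct, m=None):
    """Equation (4): LS = (X / 22.1)^m * (0.065 + 0.045 S + 0.0065 S^2)."""
    if m is None:
        m = slope_exponent(slope_pct)
    return (x_m / 22.1) ** m * (
        0.065 + 0.045 * slope_pct + 0.0065 * slope_pct ** 2
    )


# Sanity check on the reference plot (22.1 m long, 9% slope): LS is close to 1.
print(round(ls_factor(22.1, 9.0), 4))
```

The reference-plot check is a useful guard against unit mistakes, for example passing the slope as a fraction or in degrees instead of percent.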

2.2.4 Support practice factor (P)

The support practice factor reflects the impact of support practices on the average annual erosion rate. It is defined as the ratio of soil loss with a specific support practice to that with straight-row farming up and down the slope. This factor accounts for control practices that reduce the potential soil loss, with values ranging from 0 (good conservation practice) to 1 (poor conservation practice). In this work, no support practice is assumed in the study area, and the P factor is therefore set to 1.

2.2.5 Vegetation cover and management factor (C)

The C factor reflects the effect of cropping and management practices on soil erosion rates in agricultural land. The vegetation cover and management factor (C) is calculated as the ratio of soil loss from land with a given vegetation type to soil loss from land in bare condition. The capability of a vegetation cover to alleviate erosion depends on its height, continuity, density and root growth. The vegetation cover helps to dissipate the kinetic energy of falling raindrops and thereby protects the topsoil from being eroded.

3 Results and discussions

3.1 Vegetation cover and management factor derivation (C)

As the derivation of the C factor is the main focus of this study, a detailed procedure is presented here. Images of the vegetation cover of all plots are taken with a ground-based digital camera and processed using an image classification tool in ArcGIS 10.1. The images are classified by supervised classification with the maximum likelihood algorithm, which has been extensively used throughout the world for classifying satellite images (Vorovencii 2005). In this method, pixels with similar spectral values are assigned to the same class; the algorithm assumes that the prior probabilities are equal for all classes and that the data in each input band are normally distributed. The method is, however, time consuming (Vorovencii and Muntean 2012). The vegetation ground cover of each plot is determined as the ratio of the number of pixels classified as vegetation to the total number of pixels in the plot.
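
The pixel-ratio step can be illustrated with a short Python sketch. It assumes the classified image has already been exported as a 2-D grid of class labels; the function name and the labels `"fern"` and `"bare"` are hypothetical, and the sketch does not reproduce the ArcGIS classification itself.

```python
from collections import Counter


def vegetation_ground_cover(classified, vegetation_labels):
    """Return vegetation ground cover (%) for one classified plot image.

    classified: 2-D list of class labels, one per pixel.
    vegetation_labels: set of labels that count as vegetation.
    """
    counts = Counter(label for row in classified for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("classified image contains no pixels")
    vegetated = sum(counts[label] for label in vegetation_labels)
    return 100.0 * vegetated / total


# Toy 3 x 4 "image" with hypothetical class names (8 of 12 pixels vegetated).
plot = [
    ["fern", "fern", "bare", "fern"],
    ["fern", "bare", "bare", "fern"],
    ["fern", "fern", "fern", "bare"],
]
print(f"VGC = {vegetation_ground_cover(plot, {'fern'}):.1f}%")
```

Passing the vegetation labels as a set allows several vegetation classes (for example ferns and shrubs) to be counted together.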

In Malaysia, the C factor is retrieved by considering three land use types, i.e., replicated forest and undisturbed land, agricultural and urbanized area, and best management practices (BMPs) at construction sites (Department of Irrigation and Drainage, Malaysia 2008). All plots are considered under the third category. As the guideline for retrieving the C factor for the studied plots is unclear, an alternative method is suggested in which the calculation of C is based on similar land use types used by previous researchers. This study therefore utilizes several established formulas for computing the C value. The first is the Wischmeier and Smith (1978) model, which requires only the percentage of vegetation ground cover and the type of vegetation as input parameters. The vegetation types are: (a) grasses or weeds with no appreciable canopy; (b) tall weeds or short brush of 0.5 m fall height with surface cover of decayed litter at least 50 mm deep or surface cover of undecayed residue; (c) appreciable brush of 2 m height with surface cover of decayed litter at least 50 mm deep or surface cover of undecayed residue, and (d) trees but no appreciable brush of 4 m height with surface cover of decayed litter at least 50 mm deep or surface cover of undecayed residue. Second, the C factor is determined using several other models proposed by researchers (Bubenzer and Ka 1980; Israelsen et al. 1980; Troeh and Donahue 1999; ECTC 2003; Kuenstler 2009; Layfield 2009). In these methods, a C factor is assigned to a certain type of vegetation with a specific vegetation density. All the C values obtained by the first and second methods are then linked to the vegetation ground cover.

Table 5 Vegetation cover and management factor (C) derivation method.
Table 6 Vegetation cover and management factor (C) derivation by Bubenzer and Ka (1980); Israelsen et al. (1980); Troeh and Donahue (1999); ECTC (2003); Kuenstler (2009); Layfield (2009).
Table 7 The average of C factor.
Fig. 3 The computed C factor values.

Table 8 Relationship between VGC and C factor.
Fig. 4 (a, b and c) Soil loss per year and RUSLE parameters for different plots.

Each plot now has a set of two C values, and the final value is calculated by averaging them. The relationship between vegetation ground cover (VGC) and the average C factor is then determined. After the RUSLE parameters R, K, LS, P and C are prepared as raster layers, the average annual erosion is obtained by integrating the RUSLE model in a GIS platform. The soil erosion risk map is then generated using the raster calculator in the spatial analyst tool.
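
The averaging and overlay steps can be sketched without GIS software. The sketch below is a simplified stand-in for the raster calculator, assuming each layer is a small 2-D grid of equal shape; the function names and all grid values are hypothetical.

```python
from math import prod


def average_c(c_first_method, c_second_method):
    """Final C value of a plot: mean of the two method-specific C values."""
    return (c_first_method + c_second_method) / 2.0


def overlay(*layers):
    """Multiply equally shaped 2-D grids cell by cell (A = R*K*LS*C*P)."""
    if not layers:
        raise ValueError("at least one layer is required")
    rows = len(layers[0])
    cols = len(layers[0][0])
    if any(len(g) != rows or any(len(r) != cols for r in g) for g in layers):
        raise ValueError("all layers must share the same grid shape")
    return [
        [prod(grid[i][j] for grid in layers) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical 1 x 2 layers; R is uniform, the other layers vary per cell.
r = [[2323.25, 2323.25]]
k = [[0.05, 0.04]]
ls = [[2.0, 3.0]]
c = [[average_c(0.15, 0.10), average_c(0.012, 0.02)]]
p = [[1.0, 1.0]]
print(overlay(r, k, ls, c, p))
```

In a real workflow the same cell-wise product would be applied to georeferenced rasters that share extent and cell size.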

3.2 Result of the C factor

The percentage of vegetation density is computed for each plot, and examples of images taken for three selected plots are shown in table 4. Overall, the actual vegetation cover of the plots varies from 38.96 to 99%, with an average of 70%. A relatively high density of vegetation is found at the NDM, NDNM and NLDNM plots. As the vegetation ground cover and the C factor are closely related, the C factor can be calculated from the ground cover. The next step is therefore the computation of the C factor using the methods discussed in the preceding section. Tables 5 and 6 present the C factor computed with the different models proposed by several researchers. The C factor derived by the Wischmeier and Smith (1978) method, shown in table 5, varies from 0.012 to 0.150 across the plots, with the lowest and highest values at the NLDNM and NBM plots, respectively. Meanwhile, the C factor computed using the other researchers' models (Bubenzer and Ka 1980; Israelsen et al. 1980; Troeh and Donahue 1999; ECTC 2003; Kuenstler 2009; Layfield 2009) ranges from 0.02 for the plots with dense and less dense vegetation cover to 0.1 for the NBM plot. The types of vegetation cover are also provided in both tables. Table 7 summarizes the computed C values, with the average C value given in the last column. If the C factor values retrieved for a particular location of identical land use type vary within a reasonable range, then the average of those closely similar values can be taken as the C factor. In line with the Department of Irrigation and Drainage, Malaysia (2008), the calculated average C factor is taken as the representative value for those plots. Figure 3 accompanies table 7 and represents the computed C values graphically. Finally, the newly derived C value is used to determine a relationship between the C factor and vegetation cover.
Table 8 shows the equations that relate the C factor to vegetation ground cover (VGC) for all the aforementioned methods, with encouraging \(R^{2}\) values for all plots. It should be noted that the ground cover of all plots is mostly fern species, except for the NDNM and PLDNM plots, which are covered by shrubs. The vegetation species mentioned here serve as a reference only; the role of vegetation species in calculating the C factor is not considered.

Table 9 Potential soil loss of all plots.
Fig. 5 Comparison of the predicted erosion.

Fig. 6 Correlation between the newly derived C factor and the C factor from the guideline.

Table 10 A consequence of incremental C factor on yearly basis soil loss.

3.3 Soil erosion

Soil erosion is estimated with the RUSLE model in the GIS platform, which requires a map layer for each RUSLE parameter before the overlay process can be carried out in a raster analysis environment. The map layers of all RUSLE parameters and the soil erosion prediction map are generated in GIS using the newly derived C value, as illustrated in figure 4(a–c). The C factor produced by this study is compared with that of the erosion guideline in Malaysia, i.e., DID (2008). According to the guideline, the C factor for the study area is about 0.8, which is larger than the values obtained in this study. Table 9 shows the potential erosion generated with the C values from both methods. The erosion values differ, and they are much higher when based on the C value from the guideline. The minimum and maximum erosion values based on the C value computed using the proposed method are 0.410 and \(3.925\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\), respectively. Meanwhile, the minimum and maximum erosion values obtained with the C value of 0.8 are 9.367 and \(34.496\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\), respectively. Accompanying table 9, figures 5 and 6 illustrate the comparison between the two methods. The correlation is relatively high, implying good agreement and suggesting that the proposed method is acceptable. In addition, table 10 shows the effect on yearly soil loss of incrementing the C value relative to the C value from the proposed method, which is derived from the actual vegetation density. Finally, the erosion classification reported by DOE (2003) is used as a reference (table 11). It defines five erosion classes, with the lowest being less than \(10\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\) and the highest more than \(150\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\).
Based on this table, the predicted erosion from the proposed method is classified as very low for all plots, i.e., \(<10\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\), whereas the erosion computed with the C value from the guideline places all plots within the low erosion class, i.e., between 10 and \(50\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\). Based on site observation, low erosion is likely to occur for three reasons: (i) the slope length of the plots is too short to generate much overland flow, so overland flow contributes little or nothing to the total sediment yield. In general, overland flow plays a major role in erosion because of the high kinetic energy it develops while moving downslope over a long slope length; (ii) the naturally developed dense vegetation cover (a lower vegetation cover and management factor) alleviates soil erosion; and (iii) the soil grains of the study plots are medium to coarse granular, leading to a high infiltration rate.
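
The classification step can be sketched as a simple threshold lookup. The text gives only the lowest boundary (10), the low class (10 to 50) and the highest boundary (150); the intermediate boundary at 100 and the class names "moderate", "high" and "very high" are assumptions made here for illustration and should be checked against table 11.

```python
def erosion_class(soil_loss):
    """Map soil loss (t ha^-1 yr^-1) to an erosion class.

    The 10, 50 and 150 boundaries follow the text; the 100 boundary and the
    names of the three upper classes are ASSUMED (verify with table 11).
    """
    if soil_loss < 0:
        raise ValueError("soil loss must be non-negative")
    if soil_loss < 10:
        return "very low"
    if soil_loss < 50:
        return "low"
    if soil_loss < 100:
        return "moderate"  # assumed class name and upper bound
    if soil_loss <= 150:
        return "high"  # assumed class name
    return "very high"  # assumed class name


# Maximum predicted erosion reported for each C source in this study.
for label, value in (("proposed C", 3.925), ("guideline C", 34.496)):
    print(f"{label}: {value} -> {erosion_class(value)}")
```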

Table 11 Soil loss tolerance rate (erosion risk map of Malaysia).

4 Conclusions

This study has proposed an alternative method of retrieving the C factor of the RUSLE model, which appears more reliable because it is derived from the vegetation cover actually present. The density of vegetation can be associated with the C factor, as several mathematical models have been established by researchers. The results indicate that the newly derived C value is more reliable for erosion prediction. A comparison was made with the soil erosion guidelines in Malaysia published by DOE and DID, and the erosion rate based on the guideline was found to be considerably higher. The C factor stated by the guideline for the same study plots was high (about 0.8), which leads to a higher erosion rate. Three points can be concluded: (1) the erosion predicted by the proposed method is reasonably reliable and is classified as very low erosion, i.e., below \(10\, \hbox {t ha}^{-1}\,\hbox {yr}^{-1}\); (2) the proposed method is feasible, as it is derived from the actual density of vegetation cover; and (3) the proposed method is feasible not only at plot scale but can also be applied to a larger area, provided that images can be captured. Advanced remote sensing technology such as unmanned aerial vehicles (UAVs) is suggested so that images of the whole area can be captured. The proposed method is able to overcome the problems that arise from simply assigning a C value to a specific vegetation type based on literature and field data, which yields a value that is identical over a large area and is unable to represent the distinguishing features of vegetation.