Introduction

Soil erosion is a natural occurrence, but it can also be exacerbated by human activities, particularly in the context of agriculture. To mitigate its effects, it is essential to conduct thorough monitoring and assessment to design effective preventative measures (Shojaei et al., 2020). According to Mondal et al. (2015), factors like climate change and land use/land cover (LULC) can influence erosion losses indirectly. Acquiring current and comprehensive data on LULC patterns and understanding their influence on natural disasters is crucial for the accurate monitoring of shifts and the prediction of potential changes in land resources over time. Di Gregorio and Jansen (2000) detected changes in LULC coverage worldwide. It is essential to note that LULC is not constant; rather, it evolves over time in response to the changing relationships between landowners and immediate causes (Lambin et al., 2003). The increase in anthropogenic activities due to modernization has led to changes in urban land use and shifting patterns in land cover (Mouat et al., 1993). Research conducted over the past several decades shows that indiscriminate use of land for developmental activities, especially in villages and rural areas which tend to lack proper planning, guidelines and regulations for land resources, has led to exploitation, mismanagement and degradation of natural assets (Jusoff & Hassan, 1998). Many researchers have discussed the application of imagery of earth collected by satellites for mapping and tracking LULC characteristics (Yu et al., 2011). Traditional techniques for tracking fluctuating LULC and anticipating cover shifts include remote sensing (RS) and GIS (Arsanjani & Vaz, 2015). Recently, the machine learning algorithm such as ANN, Support Vector Machine (SVM), Navie Bayies (NB), and RTC has been widely utilized for conceptual information and has been shown to be a reliable approach for classifying LULC. This strategy has been incorporated in a wide range of fields which includes LULC assessment, predicting meteorology, preparing geological maps, environmental tracking, etc. (Arabi Aliabad et al., 2019; Wang et al., 2004).

Statistical methods, CA models, evolutionary algorithms and neural networks have been used by researchers as tools for land tracking and forecasting of LULC change (Ahire et al., 2012; Akdeniz et al., 2023; Karimi et al., 2018; Matta et al., 2020; Mozaffaree Pour et al., 2022; Shafie et al., 2023). Nowadays, spatiotemporal land cover prediction models like CA–Markov, CA–ANN, CA–Markov–ANN have found widespread application (Aksoy & Kaptan, 2021; Feng & Liu, 2016; Floreano & de Moraes, 2021; Lu et al., 2019; Motlagh et al., 2020; Roose & Hietala, 2018; Shahbazian et al., 2019; Zadbagher et al., 2018). From various literature study, ANN models found to be effective when combined with other methods for projecting LULC changes in a specific region (Ahmad et al., 2023; Jayabaskaran & Das, 2023; Pandey & Kumari, 2023; Ragini et al., 2023; Roy & Chintalacheruvu, 2023). By incorporating CA-ANN models, LULC predictions can be made using an open source GIS platform such as QGIS (Quantum Geographical Information System) along with plugin named MOLUSCE (Modules of Land use and land change evaluation) (Abbas et al., 2021; Bhandari et al., 2021). This approach has the benefit of eliminating redundant input components during data training. ANN, Weight of Evidence (WoE), and Multi Criteria Assessment (MCA) are a few additional techniques interconnected to CA method (Kamaraj & Rangarajan, 2022). Projected LULC maps can be widely applied in devising solutions to urban sprawl, planning, identification of urban heat islands, transportation applications and urban morphology studies (Ahmad et al., 2017; Gobakis et al., 2011; Saha et al., 2022; Tariq et al., 2023). They can also be utilized in hydrological studies to determine ground water vulnerability, identify ground water potential zones, flood estimation, soil erosion and sediment yield, soil moisture content, etc. (Avand & Moradi, 2021; Mawasha & Britz, 2020; Tehrany et al., 2013).

On the other hand, there are several challenges in the accurate assessment of erosion due to combination of several factors including variations in the LULC, environment, anthropogenic activities, and terrain. The loss of nutrient-rich soil in catchments due to water erosion is considered a significant natural disaster, especially in areas that are prone to excessive rainfall across the world. Generally, assessing soil erosion can be done through physical, empirical or numerical and conceptual approach. In physical approach mostly, assessing soil erosion can be done in laboratory scale with controlled condition through rainfall simulators (Shojaei et al., 2020). Some limitations in physical approach, it requires large datasets for simulation. In empirical approach, it reduces the necessity of huge dataset and assessed soil erosion with limited datasets (Merritt et al., 2003). Universal Soil Loss Equation (USLE) is one of numerical approach that is generally adopted for evaluating and forecasting sheet surface erosion. It was later revised by Renard (1997) and named RUSLE to address the shortcomings of the USLE model. In RUSLE model, it majorly includes five factors such as erosivity (R factor), erodibility (K factor), topography (LS factor), land management (C factor) and conservation practice (P factor) can be determined with limited dataset. According to Nasiri et al. (2017), this RS and GIS platform enables a quicker and more cost-effective assessment of erosion and sediment yield in the study area compared to other erosion models such as Erosion Potential Model (EPM), Water Erosion Prediction Project (WEPP), and the Food and Agriculture Organization (FAO). In land management and conservation practice factors, determined through the LULC within the study area. Weifeng and Bingfang (2008) developed a remodified model using a complicated dataset and subsequently calculated the yield of soil as per the hydrological parameter entity. Furthermore, the basin is divided into SWs level for enhancing the comprehension of the features and patterns of watershed affected by precipitation induced flooding and erosion in upcoming decades, can assist in devising more effective management strategies for the area (Eslami et al., 2017).

From the extensive literature survey, many researches have been done on prediction of LULC using CA–ANN model (Alam et al., 2023; Değermenci, 2023; Fetene et al., 2023; Halder et al., 2023; Kumar & Agrawal, 2023; Rahman et al., 2017). However, the effectiveness of the models in SE and SY studies has not been thoroughly investigated in an ungauged basin with limited dataset. Also, many researchers explored on the impact of climate changes and integrated LULC and climate changes in hydrological studies (Ahmad et al., 2023; Behera et al., 2020; Chakrabortty et al., 2020; Mondal et al., 2015; Nasidi et al., 2021; Rajbanshi & Bhattacharya, 2021; Senanayake & Pradhan, 2022; Singh & Saravanan, 2020) whereas the studies on impact of LULC changes alone are limited (Mirakhorlo & Rahimzadegan, 2021; Rizeei et al., 2016). Therefore, the research study is carried out in Ponnaniyar river basin, an ungauged basin and one of tributaries in the Cauvery basin, India. The main purpose of this study is to assess erosion rate and yield for future in an ungauged basin by utilizing the projected LULC map of the study area. Additionally, Landsat 8 satellite dataset was exclusively used for the classification and prediction of LULC to eliminate the variation between the resolution, bands and its wavelength across different satellites datasets. The main objectives of the current study are: (i) To classify and project the LULC map of the study area using RTC and CA-ANN; (ii) To assess and predict the soil erosion and sediment yield using projected LULC map; (iii) To prioritize and integrate the sub-watersheds for different years using grade average method. A reliable forecast of potential soil erosion and sediment yield will be of significant help to policymakers working towards preventing land degradation, promoting soil conservation, achieving sustainable environmental planning and development.

Materials and methods

Study area

The Cauvery delta forms the northern and eastern boundaries of the Ponnaniyar river basin, an ungauged basin in Tiruchirappalli District (a smart city), Tamil Nadu, India while the Perumal malai, Kallupatti, Pillayarnatham and Karuppur hills form the western boundary. The South Vellar river basin forms the southern boundary of study area. The river basin lies in the extent between longitudes 78°00′00″ and 78°50′00 ″ East and latitudes 10°10′00″ and 10°50′00″ East (Fig. 1). This river originates from the isolated forest stream in Perumal malai hills of Kadavur taluk in Karur district, it flows 52 km to the east via Manapparai taluk in Tiruchirappalli district and eventually combines with Koraiyar river and flows into the Cauvery river. The first 16 km is referred to as Ponnaniyar river basin, the second 16 km is termed as the Mamundiyar river and the last 20 km is called the Ariyar river. The dam for this river basin is situated on the side of the Ponnaniyar River, within the rural area of Mugavnur in the Manapparai taluk. Between the hills of Perumal malai and Semmalai, the Ponnaniyar dam was situated. Ponnaniyar river does not have a constant flow of water (it is a non-perennial river). The morphology of the region consists of both terra rossa and regur soil or black soil. Rice, groundnuts, and vegetables are cultivated in the top portion of basin; rice, pulses and coconut are cultivated in the centre while at the bottom of the basin, only rice is cultivated.

Fig. 1
figure 1

Map of Ponnaniyar river basin (Source: https://www.iamwarm.gov.in/)

Data acquisition

The Ponnaniyar river basin tile is analyzed using Landsat 8 (OLI/TIRS) spacecraft data acquired from US Geological Survey (USGS) Earth Explorer for the period between 2014 and 2022 (Table 1). After correcting the irradiance and albedo of the dataset, it is projected to UTM WGS 1984 zone 44N. ArcGIS 10.4.1 software is used to sketch the catchment boundary of the river basin, which is used for extensive investigation. The variable factors such as slope, elevation, aspect, and drainage density are used to predict the LULC map of the years 2022, 2026 and 2030. These factors extracted by Cartosat DEM with resolution 30 m × 30 m are acquired from Bhuvan National Remote Sensing Centre (NRSC), India. The annual precipitation data in the form of gridded dataset is downloaded from Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) data hub from the year 1981 to 2022 with a resolution of 5 km. Soil map of the study area is obtained from FAO data hub in the scale of 1:50,000 along with percentage of type of soil content and is used to classify the soil texture for analysis of erosion and yield. RTC technique is adopted to spatially distinguish between the LULC categories. The methodology for prediction of SE and SY of Ponnaniyar river basin is illustrated in the form of a flow chart, as shown in Fig. 2.

Table 1 Details of LULC dataset
Fig. 2
figure 2

Methodology for assessing SE and SY

Mapping of LULC

The increased usage of RTC is being observed in recent times, since this method is more accurate and does not have any computational limitation in terms of training samples unlike other classifiers. The fundamental idea behind classifier ensembles is that several classifiers may outperform a single classifier. Random Trees, developed by Breiman (2001), is an optimistic classifier with many uses in remote sensing (Rodriguez-Galiano et al., 2012). There are two stages involved in the function of random trees. In the first stage, the training data for each tree in the group is independent and similar in size, since each tree is generated by randomly picking the data. In the second stage, the dividing restriction for each node of tree is derived by comparison of predictor variables. As a result, the decision tree is constructed using a narrow set of features to reach its maximum depth. According to studies, the effectiveness of tree-based approaches depends more on the option of clipping strategies than on the option of feature selection processes. So, another advantage is that the RTC propagation is not restricted in any way. In the present study, RTC is executed using GIS software developed with a sample size of 1000 per category and tree size of 50, for the Ponnaniyar river basin.

Assessment of accuracy of LULC classification

Accuracy of maps is essential for obtaining precise data. Erroneous categorization of LULC can be detected when compared with satellite imagery, it will lead to failure in meeting the accuracy requirement (Rwanga & Ndambuki, 2017). The effectiveness of the dataset is assessed by validating the classified imagery using error matrices and then incorporating classification and accuracy assessment. Evaluation points and categorized dataset are used to produce the classification error matrix. To quantify and compare the reliability of imagery classifications, the kappa coefficient is recommended. Each categorization is examined using the confusion matrix to determine the OA, K, UA and PA for the study area dataset by following Eqs. (1), (2), (3) and (4).

$$Overall\ Accuracy\ (OA)=\frac{Total\ Number\ of\ Corrected\ Classified\ Pixels\ (Diagonal)\ }{Total\ Number\ of\ References}\ast 100$$
(1)
$$Kappa\ Coefficient\ (K)=\frac{\left( TS\ast TCS\right)-\sum Column\ Total\ast Row\ Total}{TS^2-\sum Column\ Total\ast Row\ Total}$$
(2)
$$User\ Accuracy\ (UA)=\frac{Number\ of\ correctly\ classified\ Pixels\ in\ each\ category}{Total\ Number\ of\ Classified\ Pixels\ in\ that\ category\ \left( Row\ Total\right)}\ast 100$$
(3)
$$Producer\ Accuracy\ (PA)=\frac{Number\ of\ correctly\ classified\ pixels\ in\ each\ category}{Total\ Number\ of\ Classified\ Pixels\ in\ that\ category\ \left( Column\ Total\right)}\ast 100$$
(4)

Cellular automata — artificial neural network

To analyze LULC changes in the selected study region, the CA model is integrated with ANN and used to obtain recent LULC images in contrast to a large amount of historical data. For prediction of SE and SY in the future, projected LULC maps are used. Using the normal procedure, it is quite challenging to track LULC changes and quantify them. ANN uses a model inspired by neurons in the human brain, which respond to input based on previous training (Kafy et al., 2022). The programmed neurons aid in assessing the analogies or connections between information and trends for specific users under certain circumstances. Markovian-based CA can be integrated with neural network to improve the CA-ANN model’s prediction accuracy as well as efficiency. QGIS, an open source software with MOLUSCE is incorporated for analysis and simulation in the current study. The MOLUSCE plugin necessitates gridded inputs for training and calibrating a CA-ANN model. These inputs should encompass raster data, specifically the LULC maps for 2014 and 2018, as well as data on elevation, slope, drainage density, aspect, and distances from roads and streams. On completion of CA simulation, validation is done to evaluate the reliability of the models. To ensure accuracy, MOLUSCE controls information entry throughout the duration of the previous year and verifies results via a comparison between classified and predicted maps. This analysis provides validity to claims of pixels recognition and variations (Kamaraj & Rangarajan, 2022). Based on these simulations, a consistent index ranging from zero to one is created to characterize the landscape. Kappa analysis (K) is also conducted which confirms the reliability of the model.

Soil erosion

In this analysis, RUSLE method is used according to the predefined set of regulations. For modelling, 30 m resolution gridded dataset, i.e. raster data is created from all factors. Appropriate precipitation datasets are used to determine the erosivity factor (R) and the soil texture map reflecting the regional soil characteristics is used to establish the soil erodibility parameter (K). A DEM is used to generate the LS factor, also referred to as topographic factor. The LULC spatial map derived from Landsat 8 satellite imagery is used to calculate the C and P factor. The yearly soil erosion is calculated by multiplying all the RUSLE parameters, as given in Eq. (5).

$$SE=R\ast K\ast LS\ast C\ast P$$
(5)

where, SE indicates the yearly soil erosion in tons per hectare per year.

Rainfall erosivity factor

The erosivity of precipitation on soil surface is the measure used to characterize the degree to which a given amount of rain might cause soil loss. The severity and volume of precipitation have a direct impact on the enhancement of the erosivity factor. For the study, the precipitation dataset is acquired from the CHIRPS data hub. The precipitation dataset is analyzed from the year 1981 to 2022. The erosivity factor is calculated using 42 years (1981–2022) annual precipitation data of the study area. A seamless surface precipitation map is generated by the point data using the Inverse Distance Weightage (IDW) technique an interpolation method (Rizeei et al., 2016). From Singh et al. (1992), erosivity factor under Indian condition is generated by using the following Eq. (6).

$$R=79+0.363\ast {R}_N$$
(6)

where, RN is the long-term average yearly precipitation in millimeter (mm).

Soil erodibility factor

Long-term soil degradation due to localized precipitation or surface runoff is referred to as erodibility. In this research, FAO’s (Food and Agriculture Organization) soil series map of Ponnaniyar river basin is used. The soil map is fundamental for calculating the value of the K factor to quantify soil erodibility. The type of soil texture is also used to determine the value of the K factor. The spatial analyst tool in GIS software is used to transform the vector soil map into raster representation of the K value. In the result, the soil layer is reassigned to more suitable categories. From Wischmeier & Smith (1978), the following Eqs. (7) and (8) are used to obtain the K value of the RUSLE model.

$$K=\frac{1}{759.4}\ \left\{2.1\ast {10}^{-4}\ \left(12- OM\right)\ast {M}^{1.14}+3.25\ \left(s-2\right)+2.5\ \left(p-3\right)\right\}$$
(7)

where, M represents the soil particle size in mm.

$$M=\left(\% of\ silt+\% of\ very\ fine\ sand\right)\ast 100-\% of\ clay$$
(8)

where, OM is the organic matter of soil, s and p are the structure code and permeability code of the soil, respectively.

Topography factor

Topographic maps can be prepared from geospatial data, these maps contain accurate details of land surfaces which are important for building projects, prevention of landslides, urban planning, resource management, and other similar activities. GIS aids in carrying out a wide variety of ecological simulations (Dunn & Hickey, 1998). The inclusion of both depositional areas and convergent flow pathways in RUSLE allows for the use of total uphill distance to rectify any inaccuracies in the calculation of slope distance at each pixel. Moreover, this topography factor, i.e. LS factor is used in RUSLE model to quantify sediment flow rate (Moore & Wilson, 1992). Based on Moore & Burch, 1986, the LS factor is calculated by using Eq. (9) given below.

$$LS={\left(\frac{Slope\ Length}{12.5}\right)}^{0.4}\ast {\left(\frac{0.0174\ast Sin\uptheta}{0.0896}\right)}^{1.4}\ast 1.4$$
(9)

where, LS denotes Length and steepness factor.

Land management factor

Soil loss is calculated by analyzing the temporal changes of LULC. The land management factor is used to regulate development of land resources for various purposes and to minimize degradation of land. Based on William and Smith, the C factor value of the LULC of the Ponnaniyar river basin is designated based on the geographical conditions. When there is no cultivation of crops, the bare land is considered as fallow land, the C value is then taken to be value one (Table 2). Conversely, higher coverage crops are assigned a lower C value. The C value is assigned for each LULC category. By using spatial analysis in ArcGIS software, the C maps of the years 2014, 2018, 2022 (actual and predicted), 2026 and 2030 are developed.

Table 2 C factor values

Conservation practice factor

To mitigate the effects of soil erosion when considering tillage on both uphill and downhill slopes, a particular support practice factor is specified as a land loss ratio. Table 3 presents the conservation practice value for LULC spatial maps of the years 2014, 2018, 2022 (actual and projected), 2026 and 2030. These P factor maps are made with the help of spatial modelling for the selected perio

d.

Table 3 P factor values (Source: Wischmeier & Smith, 1978)

Sediment yield

According to Zarris et al. (2011), precise estimation of sediment yield at different temporal and geographical scales is an essential requirement for tackling land degradation, implementing irrigation facilities, and carrying out hydropower generation projects. However, along the erosion and sedimentation route, only a fraction of the eroded granules ultimately make it to the basin outflow, which could lead to channels or tanks. Hence, calculating the SDR is a crucial initial step in estimating SY. Several studies considered by developing and using a wide variety of SDR models from the basic, i.e., empirical to the most complicated, i.e., physically based. According to these analyses, the quantity of SY delivered into the outlet of the basin is mostly determined by catchment parameters such as surface quality, type of soil, basin gradient, aspect, morphology, basin extent and gradient of channel. According to Vanoni (1975), the SDR can be calculated using the equation given below. The resultant SDR value is numeric, falling in the range between 0 and 1. Most of the eroded particles are carried by water to regions downstream where it is not deposited in substantial amounts. The value of SDR increases when the dimension of catchment decreases and vice versa. By using Eqs. (10) and (11), the SDR value and the SY rate within the study area are determined.

$$\textrm{SDR}=0.42\ast {\textrm{A}}^{-0.125}$$
(10)

where, A denotes area of basin in sq. km, and SDR denotes sediment delivery ratio.

$$\textrm{SY}=\textrm{SDR}\ast \textrm{SE}$$
(11)

where, SY denotes sediment yield in tons per hectare per year.

Validation of SE map

While developing a model to make predictions, the validation process is crucial since it establishes confidence in the veracity and reliability of the model used for the research. Validation usually involves a comparison between the predicted results and an undetermined outcome as the area might be impacted by future soil erosion. From the literature study, 50 degradation points, i.e., True Positive (TP) and 50 quasi points, i.e., True Negative (TN) are generated in ArcMap platform based on ground observation and Google Earth Pro software (Bag et al., 2022). The spatial data modeler, ArcSDM, an add-on toolbox of ArcMap, is used to evaluate the erosion map based on classified and predicted LULC maps of the year 2022. The ROC curve illustrates the accuracy rates against several different thresholds rates. Area Under Curve (AUC) serves as a worldwide accuracy indicator for an approach and it is independent of any specific distinguishing criteria (Pascale et al., 2013). According to Zweig and Campbell (1993), if the AUC ranges between 0.5 and 1, it indicates an acceptable match, whereas an AUC lower than 0.5 represents a rare opportunity fit. Using ROC/AUC, the accuracy rate of the generated soil erosion map is evaluated, based on present and projected pattern of LULC.

Prioritization and integration of sub-watersheds

Soil erosion risk maps provide a clear indication of certain sub-watersheds that require urgent soil and water conservation measures, particularly those falling within the high and very high severity classes. These measures are essential to ensure the sustainable utilization of land resources. Consequently, the process of prioritizing these sub-watersheds entails ranking them based on the severity of soil erosion issues, with the goal of determining the most critical areas that should be targeted for conservation techniques. This prioritization is essential because it may not always be technically or financially feasible to protect every landscape affected by soil erosion. In such cases, identifying priority areas for intervention becomes imperative (Shekar & Mathew, 2022; Tilahun et al., 2023). Additionally, the grade average technique is employed to consolidate the rankings obtained through RUSLE model from years 2014 to 2030. The final rating for each sub-watershed was determined by calculating the arithmetic mean of the ranks obtained through these various approaches. Consequently, a sub-watershed is given top priority if its average ranking is lower (Şahin, 2021).

Results

Result of LULC classification

The LULC is primarily classified into five categories, namely water bodies, settlement areas, forest cover, barren lands, and agricultural fields. Under the class of water bodies, streams, ponds, lakes and open water bodies are included. The agricultural land class comprises of crops, plantations, and fallow land. The settlement class consists of residential areas, commercial areas, power plants, industrial zones, roadways, and bridges. Sand accumulation along the riverbanks, open land, dry lakes, or ponds without water all fall under the barren class. The forest cover category includes coniferous, deciduous, and reserve forests (Abijith & Saravanan, 2022; Sathiyamurthi et al., 2023). Table 4 illustrates the area and the percentage of each class for past, present and projected LULC. According to Table 4, the water bodies classification highest cover found in the projected year 2026 with 0.64% of 11.54 sq. km while the lowest found in the year 2014 with 0.41% of 7.45 sq. km. In agriculture classification, the highest cover found in the year 2014 with 65.99% of 1190.15 sq. km whereas lowest found in the year 2018 with 52.46% of 946.18 sq. km. In settlement classification, the maximum cover found in the year 2030 (projected) with 9.98% of 180.15 sq. km while the minimum cover found in the year 2014 with 4.83% of 87.15 sq. km. In forest classification, the highest cover found in the year 2022 (actual) with 8.79% of 158.49 sq. km while the lowest cover found in the year 2014 with 7.23% of 130.36 sq. km. Finally, the barren land classification maximum cover found in the year 2018 with 33.89% of 611.26 sq. km whereas the minimum cover found in the year 2026 with 17.47% of 315.01 sq.km. The classified and projected LULC map within the study area is shown in the Fig. 3.

Table 4 Area and its percentage of each categories in 2014–2030
Fig. 3
figure 3

LULC map of Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 classified, (d) 2022 projected, (e) 2026, (f) 2030

Result of accuracy assessment

The estimates of PA indicate the correctness of a particular class, while the estimates of UA indicate the user’s confidence in the classification. The UA and PA for all 3 years (2014, 2018, and 2022) are shown in Table 5. UA plays a crucial role in determining the categories. The accuracy of both UA and PA is 100% for the forest cover class in 2014, while in 2022, UA is 90%, and PA is 100% right. The forest class demonstrates a higher degree of precision compared to other classes. UA and PA show varying levels of accuracy in the categories of agricultural land and barren land. The OA and K determine the accuracy level of the classified LULC map. According to Monserud and Leemans (1992), the kappa coefficient categorizes the agreement into five distinct levels: unacceptable (less than 0.4), acceptable (0.4–0.55), good (0.55–0.70), very good (0.70–0.85), and excellent (more than 0.85). From Table 6, the kappa coefficient of the LULC map for the years 2014, 2018, and 2022 is found to be 0.70, 0.86, and 0.83, respectively, indicating good, excellent, and very good kappa agreements, respectively. Additionally, the OA for the years 2014, 2018, and 2022 is found to be at a good accuracy level with percentages of 77%, 89%, and 86%, respectively.

Table 5 User accuracy and producer accuracy in 2014–2022
Table 6 Overall Accuracy (OA) and Kappa Coefficient (K) in 2014–2022

Result of projected LULC

Integrating Cellular automata with ANN produces excellent results when simulating LULC changes. The CA-ANN model can be used to predict future LULC changes, projecting the change in the LULC for the required period and defining the land shifts. In this analysis, transitions that have a substantial effect on LULC shifts are considered, including factors such as drainage density, elevation, slope, and aspect, which constrain the transition driving factor. By using these driving factors along with prior (2014) and subsequent (2018) land use/land cover imagery, a LULC map of 2022 is predicted (Fig. 4). The ANN used to train and validate the driving factors to project the LULC with momentum rate of 0.08 and 1000 iteration as shown in Fig. 5. To assess the accuracy of the QGIS MOLUSCE plugin’s projected spatial map against the actual LULC map of 2022, a validation function is used (Fig. 6). Pearson correlation is a statistical method that quantifies the degree of linear correlation between two driving factors. According to Table 7, distance from settlement and DEM has a higher correlation of 0.819 among all driving factors while the lower correlation found between slope and distance from water bodies of 0.412. The transition probability matrix is included in the LULC change simulation to generate the predicted map for 2022 (Table 8). The reliability of the predicted LULC map for 2022 is established by comparing it to the actual LULC map of 2022 using the validation toolbox in the MOLUSCE plugin (Table 9). The findings are presented in Table 10, it is noted that agricultural lands and barren land cover have higher prediction accuracies of 98.71% and 96.01%, respectively. Settlement and forest have slightly high accuracy with 86.71% and 87.99%. However, water bodies has a lower accuracy of 80.10% compared to other classes. Despite the lower accuracy for water bodies, the overall kappa found to be 0.74 of the predicted 2022 map satisfies the kappa agreement level, and the overall correctness with 85.35%. Table 11 illustrates the LULC change detection for every 4 years and it is observed that the settlement within the study area gradually increases whereas the water bodies and agriculture cover gradually decrease.

Fig. 4
figure 4

Driving Factors for projection of LULC map (a) DEM, (b) Slope, (c) Distance from water bodies, (d) Distance from road, (e) Distance from Settlement

Fig. 5
figure 5

ANN learning curve of Training and Validation

Fig. 6
figure 6

Comparison graph between classified and projected LULC map in the year 2022

Table 7 Pearson Correlation between driving factors
Table 8 LULC transition matrix between 2014 and 2018
Table 9 Kappa statistics of Predicted LULC in 2022
Table 10 Statistics of change in area and its accuracy of each classes in 2022
Table 11 LULC change detection for every four years between 2014 and 2030

Result of R factor

Scientists at the USGS and Climate Hazards Center developed CHIRPS, a precipitation dataset spanning over 42 years. It utilizes the in-house climatology, CHPclim, and onsite station data to generate gridded rainfall time series for trend analysis and drought monitoring. It encompasses data from the year 1981 to the present and covers a quasi-global area from 50°S to 50°N along all longitudes (Funk et al., 2015; Reddy & Saravanan, 2023). The upgraded version, CHIRPS 2.0 data, is accessible for use with a resolution of 5 km and temporal options of daily, 5-day, monthly, and yearly intervals. For the period from 1981 to 2022, CHIRPS V2.0 data is used at yearly time intervals. The annual average precipitation in the river basin lies between 883.74 and 1323.86 mm (Fig. 7a). The highest precipitation is found at the bottom of the study area, while the northwest portion of the study area receives the lowest precipitation. The annual mean precipitation in the 42-year period from year 1981 to 2022 for the Ponnaniyar river basin is found to be 1047.87 mm. The minimum erosivity of the river basin is found to be 399.80 MJ mm per hectare per year, while the maximum erosivity is seen to be 559.56 MJ mm per hectare per year (Fig. 7b). The annual mean erosivity of the study area is noted to be 459.38 MJ mm per hectare per year.

Fig. 7
figure 7

Erosivity factor of study area (a) Average Annual Rainfall, (b) R Factor

Result of K factor

In this study area, the major soil types are sandy clay loam, accounting for 63.49% of the area, sandy loam with 3.21%, loamy sand with 12%, and clay with 21.30% of the total area, respectively (Fig. 8). From Fig. 8, it is perceived that the soil erodibility in the study area ranges from 0.052881 to 0.094380 tons.ha.h/ha/MJ/mm. Table 12 shows that the highest soil erodibility is found in loamy sand with 0.094380 tons.ha.h/ha/MJ/mm, while the lowest is found in clay soil with 0.052881 tons.ha.h/ha/MJ/mm. The low percentage of clay and silt content in loamy sand soil contributes to its less plasticity, making it more susceptible to erosion even with little rainfall. This soil type is predominantly found in the northeastern part of the Ponnaniyar river basin. Changes in the physical properties of soil such as texture, composition, porosity, density, etc. occur over various decades. The soil erodibility map of the Ponnaniyar river basin is shown in Fig. 8.

Fig. 8
figure 8

Erodibility factor of Ponnaniyar river basin

Table 12 Soil erodibility (Source: FAO)

Result of LS factor

The elevation of the Ponnaniyar river basin ranges from -42 to 678 m, while the gradient varies from 0 to 81.04. According to Mondal et al. (2015), the slope of the study area is categorized into five levels, namely slight (less than 2), low (2–5), moderate (5–10), steep (10–20), and very steep (greater than 20). The erosion analysis is carried out for these slope categories (Fig. 9). Based on Cartosat – 1 elevation data dated 20.08.2011, the topography factor of the Ponnaniyar river basin is determined and is found to range from 0 to 47.73. The average LS and standard deviation of the river basin are found to be 0.281 and 1.377, respectively. The calculated topography factor is further classified into five groups: below 5, 5–10, 10–15, 15–20, and above 20. Comparatively, low elevation in the study area indicates that most of the plains and valleys have extremely low LS factors, as shown in Fig. 9.

Fig. 9
figure 9

Topography factor of Ponnaniyar river basin

Result of C factor

The values under this criterion span from 0 to 1 in the Ponnaniyar river basin, where the value ‘0’ represents the water bodies cover, and the value ‘1’ represents the barren land class. In this research, a machine learning-based Random Trees Classifier is utilized to classify the LULC in the years 2014 and 2018, according to the primary classification from an earlier study conducted by Abijith et al. (2020) and Sathiyamurthi et al. (2023), and projected to the years 2026 and 2030, respectively. The LULC in the year 2022 is classified and validated with the projected 2022 LULC map. The classified and projected LULC map is then applied to generate the C factor map, as illustrated in Fig. 10. The classes, such as dense forest cover, built-up areas (settlement), cultivable land (agricultural fields), water bodies, and barren land/open land, are found to have respective designated values of 0.05, 0.003, 0.6, 0, and 1.

Fig. 10
figure 10

C Factor map of Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 classified, (d) 2022 predicted, (e) 2026, (f) 2030

Result of P factor

Many soil and water conservation techniques, including check dams, nala bunds, etc., already exist in the Ponnaniyar river basin; however, they have not been maintained in a proper manner. Furthermore, no protected zones are displayed on the map along the stream network. In this research, the P value from the study by Wischmeier and Smith (1978) is used, as stated in Table 4, due to the lack of a map of existing water harvesting structures. According to the P factor scale, the value ranges from 0.1 to 1, indicating different types of land use, with the highest value 1 assigned to both cultivable land (i.e., agricultural land with a gradient greater than 100%) and non-cultivable land (settlements, water bodies, forests, and barren land). This support practice factor is significantly influenced by the physical form of the study area and land use. Based on the combination of gradient and land use maps, the actual P factor is generated for the years 2014, 2018, 2022 (actual and predicted), 2026, and 2030 (Fig. 11).

Fig. 11
figure 11

P Factor map of Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 classified, (d) 2022 projected, (e) 2026, (f) 2030

Result of soil erosion

The RUSLE model is used to estimate annual soil erosion of the Ponnaniyar river basin, which ranges between 0–1684.22 tons/hectare/year in 2014, 0–1752.20 tons/hectare/year in 2018, 0–1703.69 tons/hectare/year in 2022 (Actual), 0–1747.88 tons/hectare/year in 2022 (Projected), 0–1756.17 tons/hectare/year in 2026 and 0–1763.20 tons/hectare/year in 2030 (Table 13). The mean annual soil loss for the entire catchment is found to be 27.24, 45.64, 30.10, 3309, 24.79 and 31.87 tons/hectare/year in the years 2014, 2018, 2022 (actual), 2022 (projected), 2026, and 2030 (Table 13). Soil erosion is classified into several categories: very low (less than 1 ton/ha/year), low (1–5 tons/ha/year), moderate (5–10 tons/ha/year), high (10–20 tons/ha/year), very high (20–50 tons/ha/year), and extreme (greater than 50 tons/ha/year) (Dabral et al., 2008). In the Ponnaniyar river basin, soil erosion is mostly found to be very low category, i.e., SE less than 1 ton/ha/yr in the years 2014 (75.77%), 2018 (72.45%), and 2022 (76.80 and 73.81) with respect to the projected soil erosion of 2026 (79.02%) and 2030 (77.29%) shown in Fig. 12 and Table 14.

Table 13 Basic Statistics of SE in 2014–2030
Fig. 12
figure 12

SE map of Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 actual, (d) 2022 projected, (e) 2026, (f) 2030

Table 14 Area and its percentage of SE in 2014–2030

Result of sediment yield

The RUSLE–SDR model is used to obtain the estimated yearly SY of the Ponnaniyar river basin, which ranges between 0–277.92 tons/hectare/year in 2014, 0–289.11 tons/hectare/year in 2018, 0–285.72 tons/hectare/year in 2022 (Actual), 0–287.66 tons/hectare/year in 2022 (Predicted), 0–280.57 tons/hectare/year in 2026, and 0–286.97 tons/hectare/year in 2030 (Table 15). The mean annual sediment yield of the entire catchment is found to be 4.49, 7.53, 4.97, 5.45, 4.09, 5.25 tons/hectare/year in the years 2014, 2018, 2022 (Actual), 2022 (Predicted), 2026, and 2030, respectively shown in Table 15. The sediment yield (SY) potential is divided into five categories, namely slight (less than 1 t/ha/yr), moderate (1 to 7 tons/ha/yr), strong (7 to 19 tons/ha/yr), severe (19 to 40 tons/ha/yr), and very severe (greater than 40 tons/ha/yr) (Kolli et al., 2021). In the Ponnaniyar river basin, there is a lesser sediment yield, i.e., SY less than 1 ton/ha/yr in the years 2014 (90.35%), 2018 (85.28%), 2022 (89.82% and 87.53%), 2026 (91.80%), and 2030 (89.69%), as shown in Fig. 13 and Table 16.

Table 15 Basic Statistics of SY in 2014–2030
Fig. 13
figure 13

SY map of Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 actual, (d) 2022 projected, (e) 2026, (f) 2030

Table 16 Area and its percentage of SY in 2014–2030

Result of ROC curve

The ROC is utilized as an illustration of the model’s performance, as it provides an assessment of the model’s overall reliability. The AUC is used as an indicator for the reliability of the model’s outcomes (Abijith et al., 2020). According to Mandrekar (2010), Shekar and Mathew (2023a, b), the accuracy level of AUC is categorized into five categories: excellent (0.9–1.0), very good (0.8–0.9), good (0.7–0.8), acceptable (0.6–0.7), and unacceptable (0.5–0.6). The generated soil erosion map, based on both the classified and predicted LULC maps, provides AUC values of 0.697 and 0.686, respectively, falling under the acceptable level. This finding reveals that both the classified and predicted LULC maps are useful for developing the erosion map and identifying the risk for the future scenario. Additionally, the predicted erosion map is nearly the same as the actual erosion map of the study area. Fig. 14 represents the ROC plot of the soil erosion map based on the classified and predicted LULC maps Fig. 15.

Fig. 14
figure 14

Validation of SE map based on classified and projected LULC in 2022

Fig. 15
figure 15

Priority of SWs in Ponnaniyar river basin (a) 2014, (b) 2018, (c) 2022 actual, (d) 2022 projected, (e) 2026, (f) 2030

Result of prioritization and integration of sub-watersheds

The priority of SWs from 2014 to 2030 within the study area made according to mean annual soil erosion is shown in Table 17 and Fig. 16. These results serve as valuable tools for the development of a watershed-level land use plan aimed at effectively managing the entire watershed. The grade mean value of different year based on mean soil loss is shown in Table 18. The grade mean value of each sub-watersheds within the study area was categorized into four priority levels: slight, medium, high, and extreme. Among these, SW 11 found to be the highest priority, classified as extreme level with an average grade value of 1.50 followed by SW 9, SW 12, SW 1 with values of 1.83, 3.33,4.67 and it covers 27.27% within study area. On the other hand, SW 10 falls in high priority level with a grade mean value of 9.28 followed by SW 2 and SW 7 with value of 5.67 and 6.67 and it covers 32.05% of study area. SW 6, SW 13 and SW 8 falls in medium priority level with grade mean values of 8.00, 8.17 and 10.33 which covers 9.58% of study area while the SW 3, SW 4 and SW 5 falls in Low priority level with grade mean value of 12.33, 10.83, 12.5 and its cover 31.10% of study area. A detailed breakdown of sub-watershed priorities can be found in Table 19, and a visual representation of these priorities is provided in Fig. 16.

Table 17 Priority of SWs based on mean soil loss (SE) in 2014 – 2030
Fig. 16
figure 16

Final Priority and its level of SWs

Table 18 Grade Average values for each SWs
Table 19 Priority Level of SWs

Discussion

Landsat 8 (OLI/TIRS) satellite dataset was exclusively selected for LULC classification and prediction within the Ponnaniyar river basin, due to variation between the bands, its wavelength and resolution in Landsat 5 (TM) and Landsat 7 (ETM+) dataset. Furthermore, Landsat 8 allows for LULC classification up to level 3, utilizing different band combinations within the dataset (Sun & Schulz, 2015). LULC classification was performed by a recently developed RTC machine learning classifier and the accuracy was assessed by a statistical method of OA, K, UA and PA. The average accuracy of LULC classification from 2014 to 2022 was found to be 84% of OA and 0.80 of K. On comparing with previous studies in machine learning–based LULC classification (Aliabad et al., 2023; Arabi Aliabad et al., 2019; Forozan et al., 2020; Kulithalai Shiyam Sundar & Deka, 2022; Talukdar et al., 2020; Theres & Selvakumar, 2022), it is evident that this study achieved results with similar values of OA and K agreement level and it accomplished this with a minimal number of training samples. Furthermore, it is observed that from 2014 to 2030 (projected), there is a slight increase from 7.45 sq. km to 11.20 sq. km in the classification of water bodies, while forest cover increases from 130.35 to 146.38 sq. km from 2014 to the projected 2030. There is a drastic rise in the settlement classification, which goes from 87.14 to 180.05 sq. km. There is also a gradual increase in the extent of barren land, from 388.52 to 413.61 sq. km. Consequently, there is a decrease in agricultural cover from 1190.14 to 1052.41 sq. km from 2014 to the projected 2030. When comparing the classified and predicted 2022 LULC maps, it is observed that almost all classes have similar percentages. The classified barren land cover of 2022 is marginally lower than the projected LULC for 2022 (Fig. 17). The findings indicate that agricultural cover and barren lands are inversely proportional to each other. With an increase in barren land, there is a decrease in agricultural fields. This trend is attributed to economic development, urban expansion, and climate change, which are leading to the conversion of agricultural lands into barren land and settlement areas.

Fig. 17
figure 17

Area coverage of LULC between 2014 and 2030

Previous studies in the prediction of LULC with CA-ANN and CA-Markov, showed that this study’s findings reveal an improvement in prediction accuracy of OA and K with 85% and 0.74, respectively, compared to Kamaraj and Rangarajan (2022) with 76.28% of OA and 0.69 of K, Rizeei et al. (2016) with 69.83% of OA and 0.66 of K, Paul and Bhattacharji (2023) with 84.71% of OA and 0.78 of K, Sajan et al. (2022) with 78.22% of OA and 0.93 of K, Akdeniz et al. (2023) with 80% of OA and 0.75 of K, Shahbazian et al. (2019) with 84.25% of OA, Zadbagher et al. (2018) with 77% of OA, Joorabian Shooshtari and Aazami (2023) with 72.9%. Furthermore, this study improved the accuracy in water bodies and barren land classification with 80.10% and 96.01%, respectively, compared to Kulithalai Shiyam Sundar & Deka, 2022 (40% of Barren Land) and Weslati et al., 2023 (18.05% of Water Bodies). In addition, it is observed that the driving factors play a significant role in prediction of LULC. This study employed the same set of driving factors (distance to settlement, distance to road, DEM, slope, aspect) as outlined in Kamaraj and Rangarajan (2022), with the exception of aspect factor is replaced with distance to water bodies. According to Table 7, distance to water bodies shows a positive Pearson correlation among these driving factors. This study also experimented with aspect factor which resulted in negative Pearson correlation among driving factors.

The main purpose of this study is to assess the erosion and yield for the future in an ungauged basin by utilizing the projected Land Use/Land Cover (LULC) map of the study area. Following the approach of Rizeei et al. (2016), the R factor was determined based on 42 years of average annual rainfall and applied in the prediction of SE and SY for the years 2022, 2026, and 2030 using the RUSLE model. According to Mondal et al. (2015), the LS factor and K factor are changeable, but their shifts take decades to be considered. The highly accurately projected LULC maps (2022, 2026, and 2030) were applied in the C and P factors of the RUSLE model to determine soil erosion and sediment yield for the future. The results reveal that there is an increase in mean soil erosion from 27.24 to 31.87 tons/ha/yr from 2014 to 2030. Additionally, the mean sediment yield increased from 3.95 to 4.55 tons/ha/yr during the same period. This increase is attributed to the expansion of barren land and settlement, which consequently increases the rate of erosion and yield. The generated erosion map was validated using the ROC curve to determine the accuracy of the erosion prediction. The erosion map based on the projected LULC for 2022 showed good agreement with an ROC curve value of 0.686. Furthermore, the estimated mean sediment yield based on classified and projected LULC was found to be almost similar, with values of 4.97 and 5.45 tons/ha/yr, respectively. From Table 11, the determined standard deviation of the erosion map based on projected LULC for 2022 was 4.59, only slightly deviating from the erosion map based on classified LULC with a value of 4.19.

This study represents the first-ever endeavor to assess soil erosion and sediment yield within the Ponnaniyar river basin. Due to variations in hydrological parameters compared to a nearby hydrometric station, the regionalization approach is unable to be applied effectively in the current research area. A prior study in the sub-tropical zone of Tamil Nadu (Shanmuka Nadi watershed) reported an annual soil loss estimate ranging from 0 to 3345 tons per hectare per year, a range that aligns closely with the estimates from our current study (Sujatha & Sridhar, 2018) followed by Veppanapalli watershed with mean soil loss of 25 tons/ha/yr (Ismail & Ravichandran, 2008), Manimuthandhi watershed with mean soil loss of 87.7 tons/ha/yr (Sathiyamurthi et al., 2023). Nevertheless, the outcomes of this study are in line with erosion rates observed in other sub-tropical environments across India, using the RUSLE model. For instance, in Meghalaya, Jena et al. (2018) recorded a mean loss of 36 tons/ha/yr. In Himachal Pradesh, Kumar et al. (2014) reported an average loss of 25.63 tons/ha/yr. In Uttarakhand, Kumar and Hole (2021) found a mean soil loss of 27.45 tons/ha/yr. In Karnataka, Chandramohan and Durbude (2002) reported mean soil loss of 25.98 tons/ha/yr. In Maharashtra, Sinha and Joshi (2012) found mean soil loss of 42.7 tons/ha/yr. In Kerala, Aswathi et al. (2022) reported an average loss of 31.52 tons/ha/yr. In Madhya Pradesh, Patil et al., 2017 reported mean soil loss of 9.84 tons/ha/yr. Similarly, in Jharkhand, Gayen et al., 2020mean soil loss of 47.44 tons/ha/yr.

For effective basin management, 13 SWs were delineated using terrain analysis in the GIS software. Soil erosion for each SW was determined and prioritized based on mean soil loss from 2014 to 2030. According to Table 13, SW 11 and SW 9 were found to be highly affected among the 13 SWs from 2014 to 2030. The severely affected SWs were identified using the grade mean method and found to be SW 11, SW 9, SW 12, and SW 1. These SWs are located near the downstream of the basin and are strongly influenced by surface runoff, resulting in high rates of erosion and yield. Additionally, these severely affected SWs have a higher percentage of settlement and barren land regions. In these SWs, the presence of loamy sand and sandy clay loam strongly influences the rate of erosion and yield in the past, present, and future. Based on grade average values, 31.10%, 9.58%, 32.05% and 27.27% of Ponnaniyar river basin were categorized into low, medium, high and extreme. The aforementioned severely affected SWs require immediate implementation of soil and water conservation measures to effectively reduce the rate of erosion and yield within the study area.

Conclusion

The results of the study reveal that the Ponnaniyar river basin experiences severe erosion and potential yield loss in the future determined by using CA-ANN, RUSLE-SDR, RS and GIS techniques. In this study, a machine learning–based RTC was employed to classify and improve the LULC classification. The MOLUSCE plugin used in this research enables the prediction of LULC in the years 2022, 2026, and 2030 using the CA-ANN method with classified LULC maps from the years 2014 and 2018, along with five variable factors, including elevation, slope, distance to settlements, distance to water, and distance to roads. The accuracy of the projected map was assessed against the classified LULC map for the year 2022. The accuracy level of the projected map was found to be 85.35% for OA and 0.74 for K. The projected LULC maps for 2022, 2026, and 2030 were incorporated into an empirical equation, i.e., the RUSLE and RUSLE-based SDR model, to assess erosion and yield in the Ponnaniyar river basin. The mean soil erosion in the Ponnaniyar river basin, determined between 2014 and 2030, increases from 23.99 to 27.63 tons/ha/yr. On the other hand, the mean sediment yield in the study area estimated between 2014 and 2030 increases from 4.49 to 5.25 tons/ha/yr. The ROC curve, a statistical method, was used to determine the accuracy rate of the erosion map based on classified and projected LULC maps in 2022, and it was found to have a satisfactory agreement with values of 0.697 and 0.686, respectively. For effective basin management, the Ponnaniyar river basin was delineated into 13 SWs using terrain analysis in the GIS platform. These SWs were prioritized based on mean soil loss between 2014 and 2030. Subsequently, the prioritized SWs were integrated using the grade average method, resulting in values ranging from 1.50 to 12.5, respectively. The severely affected SW within the study area was identified as SW 11 with a grade average value of 1.50, while the slightly affected SW was found to be SW 5 with a value of 12.5, respectively. Based on these grade average values, 31.10%, 9.58%, 32.05%, and 27.27% of the Ponnaniyar river basin were categorized into low, medium, high, and extreme levels of impact. From these findings, SW 11, SW 9, SW 12, and SW 1 are extremely affected by soil erosion, and immediate implementation of water harvesting structures is required for soil conservation. This information might be useful for decision-makers and policymakers in land management.