1 Introduction

Food security and agricultural sustainability are fundamental to feeding a global population that continues to grow. Sustainable Development Goal 2 (SDG-2) sets out the interconnected objectives of ending hunger and achieving food security with improved nutrition (Jonathan, 2016). Unsustainable farming practices and indiscriminate use of soil resources are the major factors leading to soil compaction (Pagliai, 2004), soil erosion (Montgomery, 2007), and loss of soil fertility, which seriously deplete soil quality and make soils more susceptible to degradation. In urban environments, accumulation of non-biodegradable waste (Alabi, 2019), excessive use of fertilizers (Gupta, 2019) and pesticides (Omran & Negm, 2020), increasing tourist load (Basak et al., 2021), and waterlogging risk may lead to multi-dimensional environmental challenges. The Indian Himalayan Region (IHR) also faces such challenges from growing urban areas and the intensifying impact of climate change, which create new constraints and opportunities for achieving agricultural sustainability. The catchment areas of the IHR sub-basins are more vulnerable and carry a unique sedimentary history and nutrient distribution owing to their long-term evolutionary history and the prevailing environment (Kumar et al., 2021). The valley plains in these basins are generally considered more productive, with considerable potential for improving farmers' incomes, but they also face multi-dimensional challenges as arable land shrinks to make way for expanding urban areas. The decline in agroecosystem productivity poses a significant hurdle to achieving agricultural sustainability and food security (Nunes et al., 2020).

A sustainable food and agriculture system requires a responsible nutrient management plan; however, many areas lack sufficient information on the available soil resources, particularly those covered with river basin, hills, and valleys. Furthermore, the drainage basin with a complex geological environment can strongly modulate the nutrient composition under varying bedrock geochemistry (Barre et al., 2017). Environmental variables such as climate, geology, topography, and anthropogenic inputs such as land use and fertilizers are considered to influence the site-specific nutrient requirements. A better understanding of soil properties is required to fill the existing knowledge gaps by delineating soil properties in varied landscapes and environment. The mapping of soil properties and their spatial distribution is a prerequisite for sustainable management of soil resources and precise nutrient management. The delineation of soil fertility and their spatial distribution is also important for river basins to better understand sedimentary environment.

Many researchers have widely used remote sensing and Geographic Information System (GIS) techniques to understand the multi‑influencing factors for sustainable agricultural management and development (Ramzan et al., 2017; Roy et al., 2021a, b, 2022; Vasu et al., 2017). The geostatistical methods are extensively utilized to optimize soil sampling procedures and efficiently manage soil resources (Bhunia et al., 2018; Mousavifard et al., 2013). Many researchers also integrated spatial modeling in a varied physiographical regime to understand spatial autocorrelation (Davatgar et al., 2012; Sharma & Sood, 2020). So, assessing the soil nutrient status and their spatial variation seems worthwhile to understand the influence of intrinsic (within the soil) and extrinsic (outside soil) factors. The recent spatial technology allows for handling huge spatial data for effective soil resources management. Information on the spatial distribution of soil properties is essential for developing a site-specific management plan matching the location-specific requirements (Mehra & Singh, 2016). Many previous studies extensively developed the area having homogenous soil quality with management zones (MZs) for sustainable resource management (Davatgar et al., 2012; Nawar et al., 2017; Ali et al., 2022; Shashikumar et al., 2022). In soil quality assessment, such multi-variable interaction evaluates homogeneity with MZs for maintaining long-term productivity in agroecosystem.

This paper hypothesizes that characterizing the spatial variability of soil properties and delineating management zones are crucial for prioritizing sustainable development in river basins. Against this background, soil sampling was carried out on arable land of the Balh Valley and its catchment area, the Suketi basin, in the lower Himachal Himalaya, India. The objectives of the study were (a) to generate soil fertility maps representing the status of soil properties and nutrients and (b) to delineate optimized soil MZs using principal component analysis (PCA) and fuzzy clustering for precision crop management under the intensive farming systems of the valley sub-regions. The results are expected to contribute to site-specific nutrient management of the valley and its catchment area. Furthermore, the outcomes could help protect the Himalayan ecosystem from the long-term environmental hazards of indiscriminate fertilizer use.

2 Study site description

The study area is part of the Mandi district and falls under the mid-hills sub-humid and high-hills temperate wet agroclimatic zones. As per Köppen's climate classification, the study region has a humid subtropical climate (Peel et al., 2007). Most rainfall occurs from June to September, with an annual average of 1568.5 mm (DOA, 2009). The region is dominated by brown hill and sub-mountain soils, which are slightly acidic and have sandy loam to clay loam texture. Taxonomically, the valley area has the Entisol and Inceptisol soil orders, while the hilly regions are occupied by the Alfisol, Inceptisol and Mollisol soil orders (ATMA, 2005; DOA, 2009). The sampling sites covered the Balh Valley and its catchment area (Suketi basin), situated in the lower Himachal Himalaya (Fig. 1). The Suketi river merges with the Beas river and finally becomes part of the Indus river system. The surface elevation of the Balh Valley ranges from 790 to 2900 m above sea level (Pophare & Balpande, 2014) (Fig. 2a). The flat area of the valley is known for vegetable growing with extensive use of chemical fertilizers and pesticides. Furthermore, the area faces soil deterioration from extensive agriculture and pollution from growing urbanization and increasing vehicular emissions. The moderately hilly terrain is used mainly for subsistence agriculture owing to small landholdings, and the farmers depend mainly on rain-fed agriculture. Most cultivable land on the extreme fringes is not fully utilized due to lack of irrigation. In addition, heavy monsoon rainfall erodes the productive topsoil, and anthropogenic pressure from growing urbanization makes the area highly vulnerable.

Fig. 1 Location map of the study area and sampling sites (geographical coordinates; WGS 1984) with respect to Himachal Pradesh, India

Fig. 2 (a) Digital elevation map (source: USGS), (b) lithological map, (c) stream network with sub-basin boundaries (modified after Pophare & Balpande, 2014), (d) slope and (e) Topographic Roughness Index (TRI) of the Balh Valley and its catchment area

2.1 Geological setup and physiography

Geologically, the study region is part of the long-term evolutionary history of the Himalaya. Its geodynamics and evolution are controlled by major thrust systems, namely the Himalayan Frontal Thrust (HFT), the Main Boundary Thrust (MBT), the Main Central Thrust (MCT), and the South Tibetan Detachment System (STDS), which divide the area into different tectonostratigraphic domains (Singh & Patel, 2017). Regionally, the major thrusts bifurcate into associated thrust systems. The study area is juxtaposed along two thrusts, namely the MCT and the Mandi Thrust (MT) (Srikantia, 1987; Vaidyanadhan & Ramakrishnan, 2008). The area under these litho-units is characterized by a geological succession from different epochs with diverse sedimentary environments and soil geochemistry. The study area has Lesser Himalayan Tectogens characterized by sediments from the Precambrian to the Quaternary period (CGWB report; https://cgwb.gov.in). The lithology exposed along the Suketi basin reveals that the rocks in the eastern part are dominated by granite, gneisses, quartzites and phyllites, whereas in the western part the major rock types are sandstone, schists, and limestone (Pophare et al., 2018). The various litho-tectonic units, comprising igneous and metamorphic rocks, are presented in Fig. 2. The geological setup and the surrounding parent rocks are the major factors influencing the intrinsic soil properties through long-term sedimentation. The inter-montane flat area between the two major towns, Mandi to the north and Sundernagar to the south, is popularly named the Balh Valley.

This valley covers 79 km² and is surrounded by moderately hilly terrain covering a catchment area of 343 km², giving a total geographical area of 422 km² under the Suketi basin (Pophare et al., 2014). The Balh Valley is underlain by recent Quaternary sediments comprising alluvium, boulders, cobbles, pebbles, gravel, and sand, while the hilly terrain is underlain by phyllite, sandstone, and granites. Physiographically, the Suketi basin is highly dissected by a number of small rivulets (khads), namely Chail khad, Kansa khad, Gangli khad, Dadour khad, and Ratti khad, which originate from a number of perennial to seasonal drainages and finally merge into the Suketi river. The Suketi basin comprises five major sub-basins, namely the Ratti khad sub-basin (RKSB), Dadour khad sub-basin (DKSB), Gangli khad sub-basin (GKSB), Kansa khad sub-basin (KKSB), and Suketi trunk stream sub-basin (STSSB), which contain sediments of varied thickness and possibly varied nutrient composition (Fig. 2c). The morphometric analysis of the Suketi basin reveals that the drainage network is of 7th order and is fed by rain and snow water (Pophare et al., 2014). The slope and the Topographic Roughness Index (TRI) were also plotted to relate them to surface flow discharge and runoff velocity (Fig. 2d, e). The surface runoff from small rivulets can substantially affect soil nutrient dynamics in a fluvial hydro-sedimentological environment.

3 Materials and methods

3.1 Soil sampling and analysis

Soil samples were collected from 468 sites: 207 in the Balh Valley and 261 in its catchment area. The geographical coordinates of each site were recorded with a handheld global positioning system (GPS) receiver for generating surface maps of the measured soil properties. Surface (0–15 cm) soil samples were collected randomly from agricultural fields with a steel auger or steel spatula in October and November, after the crop harvest and before fertilizer was applied for the next crop. As mentioned above, the study area is characterized by physiographical constraints such as hills with steep slopes, sub-basins and valley sub-regions, so grid sampling was not appropriate and random sampling of the accessible arable land was carried out. At each site, a composite sample was prepared by mixing soil from five locations, one at the center and four 3–4 m away from the center in the four cardinal directions (Barre et al., 2017). The soil collected from the different locations was placed in a clean plastic tray, and all visible living organisms and pebbles were removed before the sample was put in a plastic bag for storage. After returning to the laboratory, the collected soil samples were preprocessed. The samples were air-dried to remove moisture, and larger aggregates were broken with a wooden pestle and passed through a 0.2 cm sieve to separate the coarse fractions. The processed soil samples were stored in labeled plastic bottles for further analysis. The analysis followed standard methods of soil analysis; the soil physico-chemical properties and the methods and instruments used are given in Table 1.

Table 1 Standard methods and the instrument used for the soil analysis

3.2 Conventional statistics and geostatistical analysis

The data generated through the chemical analysis of the soil samples were analyzed for descriptive statistics, namely minimum (Min), maximum (Max), mean, coefficient of variation (CV), skewness, and kurtosis, using Statistical Package for the Social Sciences (SPSS) software. Descriptive statistics are conventionally used to identify the variation in soil variables and to highlight the relationships between them. The correlation matrix was calculated to understand the association of soil variables. The Kolmogorov–Smirnov (K–S) test and quantile-quantile (Q-Q) plots were used to check the normality of the dataset. Normalization of skewed data is recommended to obtain reliable results in spatial analysis (Armstrong, 1998; Varouchakis, 2021). The data were considered normal when the K–S test gave P > 0.05. The log-transformation and the Box-Cox transformation are the two methods extensively used by researchers for achieving normality in a dataset (Box & Cox, 1964; Gallardo & Parama, 2007). Although not all the datasets achieved normality, the log-transformation was applied before the geostatistical analysis.
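The effect of the log-transformation on a positively skewed variable can be illustrated with a minimal Python sketch. It assumes the moment coefficient of skewness and uses hypothetical concentrations rather than data from this study; the function names `skewness` and `log_transform` are illustrative only, and the K–S test itself (run in SPSS in this work) is not reproduced.

```python
import math

def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    """Natural-log transform; only valid for strictly positive data."""
    return [math.log(v) for v in values]

# Hypothetical, positively skewed concentrations (mg/kg); NOT data from the study.
raw = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 5.0, 8.0, 20.0]
transformed = log_transform(raw)

# The transformed values should show a skewness closer to zero than the raw ones.
print(round(skewness(raw), 2), round(skewness(transformed), 2))
```

In practice the transformed variable would be re-tested for normality, and kriged predictions would be back-transformed to the original units.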

Geostatistical methods are extensively used for the analysis, processing, and representation of spatial data, and they make it possible to distinguish stable (conservative, such as morphological, physical, and mineralogical soil features) and dynamic (soil regimes, such as soil water, temperature, and gas) soil variables for sustainable decision making (Goovaerts, 1999; Meshalkina et al., 2007; Mulla, 2012). Interpolation procedures estimate the values at unsampled locations, and the associated uncertainty depends on the site conditions, the model, and the parameters used in the analysis (Fatemeh & Gregoire, 2021). The empirical variogram estimator was applied to the spatial data after screening for features that violate the assumptions of geostatistics (Goovaerts, 1997; Webster & Oliver, 2014). The major steps in data preparation are trend analysis, detection of errors due to outliers, removal of non-stationary spatial data with moving-window statistics, and interactive analysis (Tesfahunegn et al., 2011). Interactive analysis of semivariogram clouds allows the detection of anisotropies and of the spatial dependency of the measured soil variables. Different theoretical models were tested to find the one that best fits the experimental semivariogram. The models were tested in semivariogram modeling and the kriging interpolation technique (Webster & Oliver, 2001), with the experimental semivariogram given by Eq. 1.

$$\gamma \left( h \right) = \frac{1}{2m\left( h \right)}\mathop \sum \limits_{i = 1}^{m\left( h \right)} \left[ {Z\left( {X_{i} + h} \right) - Z\left( {X_{i} } \right)} \right]^{2}$$
(1)
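As an illustration of Eq. 1, the following minimal Python sketch computes an omnidirectional empirical semivariogram by grouping all point pairs into lag classes. The function name `empirical_semivariogram`, the nearest-multiple binning rule and the `(x, y, z)` tuple layout are assumptions made for the example; they are not taken from the software used in this study.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, lag, n_lags):
    """Estimate gamma(h) of Eq. 1 for h = lag, 2*lag, ..., n_lags*lag.

    points: list of (x, y, z) tuples; each pair is assigned to the nearest
    multiple of `lag` (assumed binning rule).
    """
    sums = defaultdict(float)   # sum of squared differences per lag class
    counts = defaultdict(int)   # m(h): number of pairs per lag class
    for a in range(len(points)):
        xa, ya, za = points[a]
        for b in range(a + 1, len(points)):
            xb, yb, zb = points[b]
            d = math.hypot(xa - xb, ya - yb)
            k = int(d / lag + 0.5)          # index of the nearest lag class
            if 1 <= k <= n_lags:
                sums[k] += (za - zb) ** 2
                counts[k] += 1
    return {k * lag: sums[k] / (2 * counts[k]) for k in sorted(counts)}

# Hypothetical transect with a linear trend in z (not data from the study).
transect = [(float(i), 0.0, float(i + 1)) for i in range(5)]
print(empirical_semivariogram(transect, lag=1.0, n_lags=2))
```

Lag classes that contain no pairs are simply omitted from the result, which mirrors the usual advice to interpret only lags supported by enough pairs.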

The ordinary kriging interpolation method was found appropriate for unbiased prediction of the values at unsampled locations with minimum error variance (Lin & Chang, 2000; Tesfahunegn et al., 2011; Tamburi et al., 2020). In the experimental variogram (Eq. 1), γ(h) is the semivariance, that is, half the average squared difference between the values z(xi + h) and z(xi) (Lark, 2000; Robinson & Metternicht, 2006), h is the separation distance between two sample locations, referred to as the lag, and m(h) is the number of sample pairs separated by h. The nugget (C0), partial sill (C), sill (C0 + C), range (in m), and nugget/sill ratio derived from semivariogram analysis help to check the spatial dependency and describe the spatial correlation (Cambardella et al., 1994; Vasu et al., 2017). The performance and accuracy of the spatial interpolation were tested using the cross-validation approach (Robinson & Metternicht, 2006; Schepers et al., 2004) with the mean error (ME) (Eq. 2) and root mean square error (RMSE) (Eq. 3), where z(xi) is the measured value and ẑ(xi) the predicted value at location xi, and N is the number of sampling sites.

$${\text{ME}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left\{ {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right\}$$
(2)
$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left\{ {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right\}^{2} }$$
(3)
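The cross-validation statistics of Eqs. 2 and 3 can be sketched in Python as a leave-one-out loop. The predictor used here is a nearest-neighbour stand-in chosen only to keep the example short; it is not kriging, and the names `leave_one_out` and `nearest_neighbour` are hypothetical.

```python
import math

def mean_error(observed, predicted):
    """ME (Eq. 2): average of observed minus predicted; near 0 when unbiased."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """RMSE (Eq. 3): square root of the mean squared prediction error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def nearest_neighbour(train, x, y):
    """Stand-in predictor (NOT kriging): value of the closest training point."""
    return min(train, key=lambda p: math.hypot(p[0] - x, p[1] - y))[2]

def leave_one_out(points, predict=nearest_neighbour):
    """Remove each (x, y, z) point in turn, predict it from the rest,
    and return (ME, RMSE) over all points."""
    observed, predicted = [], []
    for i, (x, y, z) in enumerate(points):
        train = points[:i] + points[i + 1:]
        observed.append(z)
        predicted.append(predict(train, x, y))
    return mean_error(observed, predicted), rmse(observed, predicted)

# Hypothetical sites (x, y, value); not data from the study.
sites = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (3.0, 0.0, 4.0)]
print(leave_one_out(sites))
```

Any interpolator with the same `(train, x, y)` signature, including a kriging predictor, could be passed as `predict` to compare candidate models on the same footing.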

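The following semivariogram models, namely Gaussian, exponential, linear and spherical (Eqs. 4–7) (Asghari et al., 2017), were tested against the experimental results. In these models, h is the lag distance, C0 the nugget, C the partial sill, and A0 the range parameter.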

Gaussian model:

$$\gamma \left( h \right) = C_{0} + C\left[ {1 - \exp \left( {\frac{{ - h^{2} }}{{A_{0}^{2} }}} \right)} \right]$$
(4)

Exponential model:

$$\gamma \left( h \right) = C_{0} + C\left[ {1 - \exp \left( {\frac{ - h}{{A_{0} }}} \right)} \right]$$
(5)

Linear model:

$$\gamma \left( h \right) = \begin{cases} C_{0} + h\left( {\frac{C}{{A_{0} }}} \right), & {\text{if}}\;h \le A_{0} \\ C_{0} + C, & {\text{if}}\;h > A_{0} \end{cases}$$
(6)

Spherical model:

$$\gamma \left( h \right) = \begin{cases} C_{0} + C\left[ {1.5\frac{h}{{A_{0} }} - 0.5\left( {\frac{h}{{A_{0} }}} \right)^{3} } \right], & {\text{if}}\;h \le A_{0} \\ C_{0} + C, & {\text{if}}\;h > A_{0} \end{cases}$$
(7)
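The four theoretical models can be written as short Python functions for comparison against an experimental semivariogram. This is a sketch under stated assumptions: the function names are illustrative, the argument order `(h, c0, c, a0)` is arbitrary, and the linear model is written with slope C/A0 so that it reaches the sill C0 + C exactly at h = A0, which is an assumption about the intended form of Eq. 6.

```python
import math

def gaussian(h, c0, c, a0):
    """Gaussian model (Eq. 4): approaches the sill C0 + C asymptotically."""
    return c0 + c * (1.0 - math.exp(-(h ** 2) / (a0 ** 2)))

def exponential(h, c0, c, a0):
    """Exponential model (Eq. 5): approaches the sill asymptotically."""
    return c0 + c * (1.0 - math.exp(-h / a0))

def linear_to_sill(h, c0, c, a0):
    """Linear model with a sill (Eq. 6), assumed slope C / A0."""
    return c0 + c * h / a0 if h <= a0 else c0 + c

def spherical(h, c0, c, a0):
    """Spherical model (Eq. 7): reaches the sill exactly at h = A0."""
    if h > a0:
        return c0 + c
    r = h / a0
    return c0 + c * (1.5 * r - 0.5 * r ** 3)

# Hypothetical parameters: nugget 0.1, partial sill 0.9, range parameter 100 m.
for model in (gaussian, exponential, linear_to_sill, spherical):
    print(model.__name__, [round(model(h, 0.1, 0.9, 100.0), 3) for h in (0, 50, 100, 200)])
```

Fitting then amounts to choosing C0, C and A0 so that the selected function follows the experimental semivariogram as closely as possible, for example by weighted least squares.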

The best-fitted semivariogram model was selected with a cross-validation approach and applied to interpolate the spatial data using a geostatistical analysis tool. Finally, the Universal Transverse Mercator (UTM) map projection and the World Geodetic System (WGS) 1984 datum were used to generate spatial maps in a GIS platform.
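How a fitted semivariogram enters the interpolation can be illustrated with a minimal ordinary kriging sketch in plain Python. It is not the geostatistical tool used in this study: it assumes an isotropic exponential model with γ(0) = 0, uses all points as the search neighbourhood, and solves the kriging system with a small Gaussian elimination routine. All function names and default parameters are hypothetical.

```python
import math

def exponential_gamma(h, nugget=0.0, partial_sill=1.0, a0=1.0):
    """Exponential semivariogram (Eq. 5), with gamma(0) defined as 0."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / a0))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, x0, y0, gamma=exponential_gamma):
    """Predict z at (x0, y0) from (x, y, z) points; returns (estimate, weights).

    The last row and column of the system hold the Lagrange multiplier that
    forces the weights to sum to 1 (the unbiasedness condition).
    """
    n = len(points)
    a = [[gamma(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in points] + [1.0]
         for xi, yi, _ in points]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.hypot(xi - x0, yi - y0)) for xi, yi, _ in points] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * z for w, (_, _, z) in zip(weights, points))
    return estimate, weights

# Hypothetical sampling sites (x, y, value); not data from the study.
sites = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)]
print(ordinary_kriging(sites, 0.5, 0.5))
```

With γ(0) = 0 the predictor honours the data at sampled locations, and the weights always sum to one, which is what makes the estimate unbiased under a constant but unknown mean.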

3.3 Principal component analysis and fuzzy c‑means clustering algorithm

Principal component analysis (PCA) is widely used as a dimension-reduction technique for deriving a smaller set of components from multivariate data (Ivosev et al., 2008; Bro, 2014). PCA was used to summarize the variables measured at the sampling points into a few principal components (PCs). Only the PCs with an eigenvalue ≥ 1.0 were considered to explain meaningful variance in the dataset, so these dominant PCs were used in fuzzy c-means clustering to generate the management zones (MZs). The analysis was carried out in FuzME software using the Euclidean distance metric, a fuzzy exponent of 1.30, a maximum of 300 iterations, and a stopping criterion of 0.0001. To derive the optimum number of clusters, two cluster validity functions were used, the fuzzy performance index (FPI) and the normalized classification entropy (NCE, written as MPE in Eq. 9), which describe the degree of fuzziness and the amount of disorganization, respectively (Boydell & Bratney, 1999; Metwally et al., 2019), as given below (Eqs. 8, 9):

$${\text{FPI}} = 1 - \frac{c}{c - 1}\left[ {1 - \frac{{\mathop \sum \nolimits_{i = 1}^{c} \mathop \sum \nolimits_{k = 1}^{n} \left( {\mu_{ik} } \right)^{2} }}{n}} \right]$$
(8)
$${\text{MPE}} = - \frac{1}{n\log c}\mathop \sum \limits_{k = 1}^{n} \sum\limits_{i = 1}^{c} {\mu_{ik}^{m} } \log \left( {\mu_{ik}^{m} } \right)$$
(9)

where c is the number of clusters, n is the number of observations, μik is the fuzzy membership, m is the weighting exponent, and log is the natural logarithm. The optimum number of clusters selected in the MZs class was identified with a minimum value of the MPE and FPI. Furthermore, the one-way ANOVA test was performed to determine heterogenic variations (p ≤ 0.05) to identify which group means are significantly different from each other.
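The clustering and validity indices can be sketched in plain Python. This is not the FuzME implementation: the update rules are the standard fuzzy c-means ones, the initial centroids are supplied by the caller (an assumption made to keep the example deterministic), and the defaults mirror the settings stated above. Note that Eq. 8 as printed evaluates to 1 for a perfectly crisp partition, so the sketch uses the form FPI = 1 − (cF − 1)/(c − 1), with F the mean of the summed squared memberships, which gives 0 for crisp classes and agrees with selecting the minimum FPI; this choice is an assumption. The entropy follows Eq. 9 with the exponent m as printed.

```python
import math

def fuzzy_c_means(data, centers, m=1.30, max_iter=300, tol=1e-4, eps=1e-12):
    """Return (memberships, centers); memberships[k][i] is the membership of
    observation k in cluster i. `centers` are caller-supplied starting centroids."""
    centers = [list(v) for v in centers]
    c, dims = len(centers), len(data[0])
    previous = None
    for _ in range(max_iter):
        u = []
        for x in data:
            d = [math.dist(x, v) for v in centers]
            if min(d) < eps:  # observation coincides with a centroid
                hits = [1.0 if di < eps else 0.0 for di in d]
                u.append([h / sum(hits) for h in hits])
            else:
                inv = [di ** (-2.0 / (m - 1.0)) for di in d]
                u.append([w / sum(inv) for w in inv])
        for i in range(c):  # centroids are membership-weighted means
            w = [row[i] ** m for row in u]
            centers[i] = [sum(wk * x[j] for wk, x in zip(w, data)) / sum(w)
                          for j in range(dims)]
        if previous is not None and max(
                abs(a - b) for r0, r1 in zip(previous, u) for a, b in zip(r0, r1)) < tol:
            break
        previous = u
    return u, centers

def fpi(u):
    """Fuzzy performance index: 0 for crisp classes, 1 for maximal fuzziness."""
    n, c = len(u), len(u[0])
    f = sum(v ** 2 for row in u for v in row) / n
    return 1.0 - (c * f - 1.0) / (c - 1.0)

def mpe(u, m=1.30):
    """Entropy index following Eq. 9; terms with zero membership contribute 0."""
    n, c = len(u), len(u[0])
    total = 0.0
    for row in u:
        for v in row:
            vm = v ** m
            if vm > 0.0:
                total += vm * math.log(vm)
    return -total / (n * math.log(c))

# Hypothetical one-dimensional PC scores forming two clear groups.
scores = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
memberships, centroids = fuzzy_c_means(scores, [[0.0], [5.0]])
print(round(fpi(memberships), 4), round(mpe(memberships), 4), centroids)
```

Running the clustering for a range of cluster numbers and plotting both indices against c is the usual way to locate the number of zones at which they are jointly smallest.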

4 Results and discussion

4.1 Descriptive statistics and correlation among soil properties

The descriptive statistics of the soil properties for the catchment area, namely pH, electrical conductivity (EC), organic carbon (OC), nitrogen (N), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), and DTPA-extractable zinc (Zn), iron (Fe), manganese (Mn) and copper (Cu), are presented in Table 2. The pH ranged from acidic to alkaline. The mean values of EC and OC were 0.29 dS/m and 12.02 g/kg, respectively. EC is an important soil property used in many geophysical soil prospecting surveys, such as electrical resistivity imaging surveys to delineate pollutant sources, and in soil fertility studies its variation is correlated with the availability of other nutrients in the soil (Sulaiman & Ahmed, 2001). In agriculture, the optimal EC is also crop-specific, and the observed range of 0.08 to 0.70 dS/m was within the safe limit (< 0.80 dS/m) for the supply of nutrient solution in the root zone of the crop. A low EC may cause severe problems for plant growth and yield (Ding et al., 2018). Values toward the higher side can be related to nearby sources of soil pollution, leachate, etc.; in the study area, the extreme high values at some locations can be related to point sources along National Highway 21 (NH-21) and the growing impact of urbanization. Overall, the available P and K in the study area were found in the medium range, considering values between 4.9–11 and 52–125 mg/kg, respectively. All the soil properties are normally distributed and show non-significant skewness except Mn, which is positively skewed with a value of 1.44. The coefficient of variation calculated for the soil properties ranged between 7.61 and 55.23%.
A low coefficient of variation (CV) was observed for pH because pH is expressed on a logarithmic scale of proton (hydrogen ion) concentration, as also discussed by many researchers (Kumar et al., 2021; Mousavifard et al., 2013). The CV was classified using the criteria set by Wilding and Drees (1983) as low (CV < 15%), medium (CV = 15–35%) and high (CV > 35%). Among the major nutrients N, P, and K, N was in the medium range with a higher CV than the other major nutrients, indicating varied inputs of nitrogenous fertilizers in the study area. Ca and Mg ranged from 264 to 627.4 and from 109.8 to 555.4 mg/kg, respectively, while among the DTPA-extractable micronutrients, Fe, Mn, and Cu showed higher variation, reflecting the inherent soil properties and soil management practices. The values of skewness and kurtosis deviated considerably from their standard values of 0 and 3. The geological modulation, land use pattern, slope, and anthropogenic development in the valley sub-regions strongly influence the soil properties. The variation in the soil chemical properties was attributed to management practices, the parent material of the surrounding catchment, and irrigation water quality (Khan et al., 2021).
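The CV calculation and the three variability classes can be sketched in Python as follows. The thresholds are those quoted above (low below 15%, medium 15–35%, high above 35%); the function names and the sample values are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def cv_class(cv):
    """Variability class for a CV given in percent."""
    if cv < 15.0:
        return "low"
    if cv <= 35.0:
        return "medium"
    return "high"

# Hypothetical pH readings; not data from the study.
ph = [6.5, 6.8, 7.0, 7.2, 6.9]
cv = coefficient_of_variation(ph)
print(round(cv, 2), cv_class(cv))
```

Applied to the reported extremes, a CV of 7.61% falls in the low class and a CV of 55.23% in the high class.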

Table 2 Descriptive statistics of soil parameters along Balh Valley and its catchment area (Suketi basin) Himachal Himalaya, India

Pearson's correlation coefficients were calculated, and significant positive correlations were found between OC and N (r = 0.57, p < 0.05), pH and EC (r = 0.41, p < 0.05), and Ca and Mg (r = 0.23, p < 0.05) (Table 3). The correlation between OC and N was stronger than those among the other soil properties. Bhunia et al. (2018) also reported that the natural circulation of organic matter and the activity of soil microorganisms determine the nitrogen concentration in soil. The average concentrations of the DTPA-extractable micronutrients Zn, Fe, Mn and Cu were in the sufficient range considering the critical limits of 0.6, 4.5, 1 and 0.2 mg/kg, respectively. The exception was Zn, which showed lower values toward the northern fringes of the catchment area. Many researchers have also pointed out that organic matter build-up influences micronutrient dynamics and distribution in the soil (Dhaliwal et al., 2019; Ojha et al., 2018). Moreover, intrinsic factors operate mainly within a homogeneous landscape, whereas extrinsic factors, which cover anthropogenic inputs, also influence the soil characteristics.

Table 3 Pearson’s correlation analysis of the calculated soil properties and heavy metals in the study area

4.2 Geostatistical analysis

Spatial correlation was identified with geostatistical analysis, which involves calculating semivariograms and their respective best-fitted models. The soil properties that were skewed from the normal distribution were log-transformed prior to geostatistical analysis. The results of the geostatistical analysis are presented in Table 4 and Figs. 3 and 4. The model that best fitted each soil property was applied to that parameter, and the accuracy was estimated through various error estimates by interpolating the values at unsampled locations. The soil pH, EC, P, K, Ca, Mg, Zn, Fe, and Mn were modeled with the Gaussian model, soil OC and available N with the exponential model, and Zn and Cu with the spherical model. The best-fit semivariogram models are shown in Fig. 4. The quantile-quantile (Q-Q) plot shows the distribution of the actual and predicted values. The Gaussian, spherical, and exponential models were also identified as the best-fit models for delineating the spatial structure and autocorrelation of soil properties in similar physiographical regimes (Shashikumar et al., 2022). The fit of a model depends on many factors, such as the local site conditions and anthropogenic interventions. The nugget value represents micro-variability and measurement error variance; in the present study, it varied from 0.01 for EC to 9216.1 for Mg. These values are approximately similar to or even lower than the range observed in Indian soils (Behera et al., 2016). The guidelines suggested by Cambardella et al. (1994) were used to check the spatial dependency by classifying the nugget/sill ratio, C0/(C0 + C), of the soil attributes into three classes: weak (> 75%), moderate (25–75%) and strong (< 25%). In the present area, the soil variables fall into the weak to moderate spatial dependency classes.
Many researchers have also pointed out the inherent limitations of these criteria, as the range effect also needs to be considered in the analysis (Davatgar et al., 2012; Weindorf & Zhu, 2010). A higher nugget/sill ratio represents the effect of stochastic factors, such as fertilization, cropping system and human intervention, while a lower ratio suggests the influence of structural factors, such as climate, parent rock, topography and vegetation association, as well as anthropogenic activities (Bhunia et al., 2018; Verma et al., 2021). The results imply that the weak to moderate spatial dependencies might be contributed by extrinsic factors covering human-induced activities. The semivariogram parameters and their spatial autocorrelation are presented in Table 4. The range reflects the influence of landscape and the distance over which the soil variables are correlated; in the present study it varied from 191 to 9504 m. A large range indicates that the sample locations are spatially autocorrelated over a larger distance than for soil variables with a smaller range. The lower range values of OC (347 m), P, K, Ca, Zn and Cu suggest that these attributes are spatially related over shorter distances than those with higher range values. The higher ranges were observed for Mg (8.9 km), pH (9.50 km), Mn (8.699 km) and N (5.30 km), representing the wider spatial influence of rainfall and anthropogenic pressure on soil resources. The spatial modeling further reveals that a sampling distance greater than the range will affect the spatial correlation; Kerry and Oliver (2004) also pointed out that the sampling distance should be half of the semivariogram range.
The contribution of physiographic variation to moderate and weak spatial dependency has also been noted by many researchers (Davatgar et al., 2012; Laekemariam et al., 2018). This also seems true for the study area, where varied agricultural practices, fertilizer inputs, and urban pressure contribute to the varied spatial distribution of the soil properties. This spatial information can be used by farmers, policy makers and other stakeholders to modify future sampling procedures and to frame an appropriate decision support system for sustainable intensification of agriculture in growing urban centers.
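The nugget/sill classification of Cambardella et al. (1994) used above can be expressed as a short Python helper. The function name `spatial_dependence` and the example parameter values are hypothetical; the class limits are those quoted in the text.

```python
def spatial_dependence(nugget, partial_sill):
    """Return (ratio in percent, class) for the nugget/sill ratio C0 / (C0 + C)."""
    ratio = 100.0 * nugget / (nugget + partial_sill)
    if ratio < 25.0:
        label = "strong"
    elif ratio <= 75.0:
        label = "moderate"
    else:
        label = "weak"
    return ratio, label

# Hypothetical semivariogram parameters; not values from Table 4.
for c0, c in [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]:
    print(c0, c, spatial_dependence(c0, c))
```

Because the ratio ignores the range, it is best read together with the range values when comparing soil variables.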

Table 4 Variogram models of the calculated soil properties in study area
Fig. 3 Quantile-quantile (Q-Q) plot for the calculated soil properties of the study area

Fig. 4 Best fitted semivariogram model for the calculated soil properties of the study area

4.3 Spatial distribution of soil properties

The cross-validation approach was used to select the parameters for ordinary kriging interpolation to generate the surface maps. Figures 5 and 6 represent the spatial distribution of the soil properties, namely soil pH, EC, OC, the major nutrients (available N, P and K), and the DTPA-extractable micronutrients (Zn, Fe, Mn and Cu). The surface map for soil pH shows that an extensive area lies in the neutral range (6.51–7.0), with higher values toward the area under GKSB (Figs. 2b and 5). This variation in soil pH might be due to irrigation water quality and current agricultural practices. pH in the acidic range has also been reported in many parts of the Himachal Himalaya, India (Kumar et al., 2021). The spatial distribution map of soil EC shows high values toward the southern part of the Balh Valley. At a few isolated locations, OC and available N were in the low range considering the critical limits of OC < 5 g/kg and N = 50–100 mg/kg. The OC, N and K distribution was higher toward the northern extreme of the catchment area. Soil OC and N show a similar trend, indicating the use of organic matter/farmyard manure (FYM), which contributes to the dynamic balance of soil organic carbon and nitrogen in the study area. Furthermore, the spatial distribution maps of soil N and K also showed a similar trend covering the locations bordering the Balh Valley. The concentration of available P is sufficient in the center of the catchment, while the availability of K is low, indicating heavy use without subsequent replenishment and a current need for efficient K management. The soil pH, EC, and OC were highest toward the southern part of the study area. Among the major nutrients, the concentration of available N was highest toward the southern part, while available P showed high concentrations in the valley sub-regions with a few exceptions in the central part.
Some sites in the northern and central parts of the GKSB catchment area show higher concentrations of soil OC and available N, indicating their frequent addition and their supply through the fluvial environment from the surrounding area. The concentration of available K was generally low in the central part of the Balh Valley, reflecting high uptake of available K by the vegetable crops without sufficient replenishment, which requires efficient management by the various stakeholders. The kriging interpolation maps of many soil variables showed a patchy distribution in the study area, which may be attributed to variation in landscape, topography, hydrology and the geochemistry of the parent rock (Kubler et al., 2021). The soil classes shown in red represent areas deficient in Ca and Mg, considering the critical limits of 300 mg/kg and 180 mg/kg, respectively. Ca shows its lowest concentration toward the central part of the Balh Valley, while Mg is dominant at isolated locations. Furthermore, DTPA-extractable Zn, Mn, and Cu also show higher concentrations at some sites, such as the area underlain by the Shali Formation covering the GKSB (Figs. 2b and 6). Zn shows a high concentration toward the southern parts, while the other micronutrients vary at isolated locations with a few exceptions toward the extreme fringes.

Fig. 5 Spatial distribution maps of the soil properties, namely soil pH, EC (electrical conductivity), OC (organic carbon), N (nitrogen), P (phosphorus) and K (potassium), for the Balh Valley and its catchment area

Fig. 6 Spatial distribution maps of the secondary nutrients (Ca: calcium; Mg: magnesium) and micronutrients (Zn: zinc; Fe: iron; Mn: manganese; Cu: copper) for the Balh Valley and its catchment area

4.4 Principal component analysis and soil fertility management zones

PCA was performed to group the variables with similar traits. Of the principal components (PCs) extracted, only those with an eigenvalue > 1 were retained for the final analysis; together they account for a cumulative variance of 55.77% (Table 5). A PC with an eigenvalue > 1 explains more variance than any single standardized variable. This criterion led to the selection of five PCs, taken as contributing most to quality improvement as per the action plan of the study region. PC1 contributed 15.91% of the variability and was dominated by N, OC, Mg and Fe, while PC2 contributed 12.37% and was dominated by EC and pH. Five PCs were also reported in a PCA used to determine the relative magnitude of anthropogenic and natural sources for a watershed area in northeast Iran (Khaledian et al., 2016). The PCA shows that available N, OC, Mg and Fe are the deciding factors that need prioritization for effective nutrient management and sustainable development in the Balh Valley and its catchment area. In the biplot, OC and N are positively correlated with altitude (Fig. 7). Soil properties that behave similarly in the biplot can be grouped to support prioritization and decision making for precise nutrient management and long-term agricultural sustainability.
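The eigenvalue > 1 retention rule can be sketched as follows. This is a minimal illustration on synthetic data, assuming PCA on the correlation matrix; the function name `pca_kaiser` and the data are hypothetical, and the software and any rotation used in the study are not specified here.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on the correlation matrix; flag PCs with eigenvalue > 1."""
    R = np.corrcoef(X, rowvar=False)             # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(R)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()          # share of total variance per PC
    keep = eigvals > 1.0                         # eigenvalue > 1 criterion
    # Loadings: correlations between the variables and the retained PCs.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, explained, keep, loadings

# Synthetic data: two latent factors plus one unrelated variable (illustration only).
rng = np.random.default_rng(1)
latent = rng.normal(size=(80, 2))
X = np.column_stack([
    latent[:, 0] + 0.3 * rng.normal(size=80),
    latent[:, 0] + 0.3 * rng.normal(size=80),
    latent[:, 1] + 0.3 * rng.normal(size=80),
    latent[:, 1] + 0.3 * rng.normal(size=80),
    rng.normal(size=80),
])

eigvals, explained, keep, loadings = pca_kaiser(X)
print("retained PCs:", int(keep.sum()))
print("cumulative variance of retained PCs:", round(float(explained[keep].sum()), 4))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue of 1 corresponds to the variance of one standardized variable, which is the rationale for the cut-off.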

Table 5 Principal component analysis of the soil variables in the study area

Fig. 7 Principal component analysis (PCA) biplot (PC1 and PC2) between altitude and the soil properties

As discussed earlier, fuzzy performance indices such as FPI and MPE aid in optimizing the number of MZs. A similar concept was used to obtain the optimum number of MZs by plotting NCE and FPI against the number of zones (Fig. 8). The number of zones at which NCE and FPI reach their minimum in the figure was used to delineate the management zones/classes. On this basis, two management zones were selected, and the resulting surface map is given in Fig. 9. The delineation of management zones at the catchment scale thus yields a low number of homogeneous zones. Two MZs have also been reported by other researchers in the delineation of soil properties and yield zones (Damian et al., 2017; Tagarakis et al., 2013). The two MZs indicate low heterogeneity at the catchment scale, attributed to similar conditions in the fluvial environment. MZs are useful for providing site-specific inputs in precision agriculture. The mean values for the two zones are compared in Table 6. The absence of a statistical difference in the mean values could be linked to the low variation of the analyzed soil properties in the study area. However, nutrients whose concentration is below the limiting value for a specific crop still need to be considered, as they can influence long-term agricultural productivity.
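The selection of the number of zones from FPI and NCE can be sketched as below. This is a minimal illustration, not the study's implementation: the fuzzy c-means routine, the fuzziness exponent `m = 1.5`, the natural logarithm in NCE and the synthetic data are all assumptions, and the index formulas follow one commonly used formulation (FPI = 1 - (cF - 1)/(c - 1) with F the partition coefficient; NCE = H/(1 - c/n) with H the classification entropy).

```python
import numpy as np

def fuzzy_cmeans(X, c, m=1.5, n_iter=200, tol=1e-6, seed=0):
    """Minimal fuzzy c-means; returns memberships U (n x c) and centers."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))       # random rows summing to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # standard membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

def fpi(U):
    """Fuzziness performance index: 0 for a crisp partition, 1 for a uniform one."""
    n, c = U.shape
    F = np.sum(U ** 2) / n                            # partition coefficient
    return 1.0 - (c * F - 1.0) / (c - 1.0)

def nce(U):
    """Normalized classification entropy (natural log assumed)."""
    n, c = U.shape
    H = -np.sum(U * np.log(np.clip(U, 1e-12, 1.0))) / n
    return H / (1.0 - c / n)

# Synthetic data with two groups in three variables (illustration only).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.5, size=(40, 3)),
               rng.normal(3.0, 0.5, size=(40, 3))])

# Tabulate both indices for 2 to 5 zones; the minimum suggests the zone count.
for c in range(2, 6):
    U, _ = fuzzy_cmeans(X, c)
    print(c, round(fpi(U), 3), round(nce(U), 3))
```

In an application to soil data, `X` would typically hold the retained PC scores or standardized soil variables at each sampling location, and the zone count where both indices are lowest would be mapped.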

Fig. 8 Optimum number of management zones with respect to the minimal fuzzy performance index (FPI) and normalized classification entropy (NCE)

Fig. 9 Management zone map for the Balh Valley and its catchment area, Himachal Himalaya, India

Table 6 Comparison of mean values in the two management zones delineated by the fuzzy c-means algorithm for the study area

5 Conclusions

The anthropogenic and geological modulation of soil properties is a subject of continuing interest for productive areas in valley sub-regions. Information on the spatial variability of soil properties is essential for conserving soil resources and reducing the excessive use of fertilizers, and thereby limiting adverse effects on the environment. Geostatistical modeling was used for semivariogram analysis and to generate spatial variability maps, the dimensionality was reduced with PCA, and the MZ map was prepared using fuzzy c-means clustering. The low concentration of the major nutrients, namely N, P and K, and of DTPA-extractable Zn was the major constraint on crop growth and production. OC and the primary nutrients N and P, but not K, are higher in the piedmont alluvial plains than in the catchment area, while some micronutrients show area-specific dominance according to the underlying geology, drainage pattern and anthropogenic intervention. The spatial correlation of the analyzed data ranges from weak to moderate, which indicates the influence of both intrinsic factors (e.g., soil parent material and texture) and extrinsic factors (e.g., soil management and anthropogenic inputs). The variation of the fertility parameters at the catchment scale indicates the need for location-specific management of soil resources. This study delineates two homogeneous areas as management zones for adopting site-specific fertilizer recommendations. The study further suggests more planned sampling under different land uses to capture the influence of parent rock and soil management. The outcomes of the study can be used as an introductory guide by the various stakeholders contributing to environmentally sound management of soil resources.