Introduction

Anthropogenic activities and growing industrialization are major sources of environmental pollution (Ali et al. 2013). Environmental pollution degrades the quality of the hydrosphere, lithosphere, pedosphere, biosphere and atmosphere (Luo et al. 2011). Soil is a basic environmental component and an important ecological pathway for the transfer of minerals and metals into the food chain. Soil contamination refers to the introduction of toxic elements and chemicals into the soil from natural sources or human activities, which may cause adverse effects on the environment and human health. The volume of wastewater and untreated industrial effluent is increasing with growing population and industrialization in developing countries. Industrial processes consume a range of chemicals and large amounts of water, which are discharged as untreated effluents containing toxic substances such as sulfides, chromium, lead, zinc and cadmium (Gu et al. 2012). Improper waste management and the lack of standards for wastewater and industrial effluent discharge are degrading the environment and polluting irrigated soil (Hargreaves et al. 2008). Irrigation with industrial effluent may increase the heavy metal concentration in soil, and it also adversely affects soil salinity, electrical conductivity, total dissolved solids (TDS), chloride content, biological oxygen demand, pH, alkalinity and phosphate (Soodan et al. 2014).

Several researchers have reported that untreated industrial effluent is used to irrigate cereals and vegetables because local farmers are unaware of the risks (Lourenço et al. 2010). The regular and extensive use of industrial waste for irrigation may add heavy metals to soil (Islam et al. 2017). A range of factors, such as soluble substances, soil pH, soil type, fertilizer use, cultivated plant species and the composition of industrial effluent, govern the accumulation of heavy metals at elevated levels in groundwater, plants and soil (Zhu et al. 2011). The food chain is considered an important pathway by which heavy metals enter the human body (Soodan et al. 2014). Plant species and edible cereals vary in their capacity to store heavy metals (Xiao et al. 2017). In this way, heavy metals also accumulate in different organs of the human body, causing serious health issues. Heavy metals are non-biodegradable and have long biological half-lives (Chabukdhara et al. 2016). Heavy metals and their compounds are soluble in water, which may readily raise contamination to unsafe levels. Li et al. (2013a) investigated soil pollution and reported that the chemical speciation of toxic heavy metals and the contamination levels of Zn, Cu, Pb and Cr have seriously affected the environment and polluted soil and sediments in the vicinity of an industrial area over the past decades.

Concern about the environmental risk and pollution caused by heavy metal accumulation in soil has grown over time. One study found that metal contamination was higher in areas adjacent to a poorly managed e-waste processing unit and that the highest concentrations of Zn, Pb, Cu and Cd in soil occurred at the burning sites. The uncontrolled operations of the e-waste processing unit caused serious toxicity in the neighboring soil and cultivated vegetables. Some previous studies investigated environmental pollution with a focus on heavy metal concentrations and their spatial variability in soil (Akortia et al. 2017). The continuous intake of heavy metals such as Cd, Pb and Cr, even at low doses, causes bioaccumulation and significant adverse effects on human health. The assessment of soil properties and broad spatial analysis of heavy metals are crucial for producing contamination-free vegetables and cereals and for safe ecological pathways. In arable lands, it is therefore important to identify the sources of heavy metals in order to control and prevent contamination in different agro-ecosystems.

Classical statistical approaches have also been used in previous studies to analyze soil heavy metal contamination (Nawaz et al. 2006). Multivariate analysis and exploratory statistics are valuable methods for identifying common patterns in the distribution of data; they reduce the initial dimensions of datasets and make them easier to interpret (Kelepertzis 2014). Ha et al. (2014) also applied statistical methods and multivariate analysis to agricultural soils, sediments, dust and urban soils. Principal component analysis (PCA) and related multivariate methods are effective techniques for transforming an original set of variables into a reduced subset that still represents the information in the original dataset systematically (Lv et al. 2015). PCA helps identify contamination sources and presents the results in a logical, easy-to-interpret manner through visual assessment of loading plots and data scores. However, contamination sources and spatial distributions may differ geographically because of differences in development history and local environments.

Spatial variability of heavy metals is therefore assessed using advanced geographical information system (GIS) analysis, which provides evidence of contamination sources and of the extent of non-point pollution. A wide range of geostatistical tools was used in previous spatial studies. However, these methods were less effective at quantifying spatial dependency, that is, at estimating whether the spatial correlation was positive or negative. Furthermore, common geostatistics cannot identify outliers in the spatial data of different heavy metals. Outliers are values that differ markedly from those at nearby locations, and they may induce an uneven semivariogram pattern. Lu et al. (2012) applied general geostatistical and multivariate methods to assess the sources and concentrations of heavy metal pollution. Integrated heavy metal contamination and spatial distribution have been widely evaluated using geostatistical methods in soil contamination studies, but these assessments are usually less comprehensive. Conversely, an efficient geostatistical method, global Moran’s I, can be used to estimate the strength of spatial correlation and to determine whether a spatial variable is positively or negatively autocorrelated. Local Moran’s I can be used to identify the spatial outliers and spatial clusters in the distribution of heavy metals. Moreover, the significance of the spatial autocorrelation can be tested. Geostatistical methods for spatial autocorrelation were established many years ago and have been used by many researchers for spatial analysis, and global Moran’s I has also been used to analyze the spatial variability of heavy metals. Huo et al. (2011) established a new understanding of the structure of spatial dependence by analyzing the multiscale spatial variability of heavy metals in soils using Moran’s I. Spatial autocorrelation statistics and geostatistics were used to identify the spatial pattern of heavy metals in the agricultural soils of Beijing, and the autocorrelation was assessed using the Moran’s I statistic (Huo et al. 2011). Multilevel spatial autocorrelation has not previously been tested for heavy metal concentrations at three depths of the soil profile, and it is a novel approach in soil contamination studies.

In areas with many industries, the spatial diffusion of heavy metals and the resulting soil contamination stem from traditional methods of soil partition and decades of poor management. Single factor-based index methods and general spatial distribution analysis are less efficient at revealing the heavy metal contamination in soil caused by anthropogenic, industrial and mining activities (Santos-Francés et al. 2017a). In previous research, the built-in pollution index (Takaki et al. 2017) has been the most commonly used method for soil heavy metal analysis (Santos-Francés et al. 2017b). Therefore, an advanced and comprehensive method was required to analyze the diverse factors (Zang et al. 2017) for soil bioaccumulation at different subsurface depths in different landforms and to obtain accurate estimates of the contamination load.

In this study, a spatial quantification method was adopted for soil heavy metal contamination induced by industrial effluent and wastewater, combining a geostatistical approach with two pollution indices, the geoaccumulation index (Igeo) and the Nemerow index. Igeo was assessed from the measured heavy metal concentrations and their background values to categorize the area into different contamination classes. This index considers the factors for soil pollution which are sensitive to environmental contamination. The Nemerow index was also assessed to integrate the mean index of heavy metal content in the soil (Malkoc and Yazici 2017). Moreover, the index-based heavy metal contamination was plotted using geostatistical mapping. Previously, most researchers adopted kriging interpolation to monitor the spatial variability of soil heavy metals. In this research, however, an emerging mapping technique based on Bayesian interpolation was employed to visualize the spatial extent of index-based heavy metal contamination in the study area, which, in turn, highlights spatially the influence of human activities on soil heavy metal contents.

In comparison, most studies of heavy metal pollution either use only the surface layer of soil or rely on classical statistical techniques to analyze the data. On this basis, the present research is expected to add new knowledge to the literature on soil heavy metal contamination and on pollution studies in general. The main objective of this research was to identify the sources and linkages of heavy metals in soil and to explore their spatial variability in Sheikhupura district (Punjab), Pakistan, which is near the provincial capital and is an important hub of industrial activity. The soil samples were taken in the pre-monsoon season during 2015–2016 for the assessment of chromium (Cr), cadmium (Cd) and lead (Pb) and other physiochemical properties of soil. Furthermore, this study also analyzed the spatial patterns of heavy metals in the soil at the local scale using Moran’s I for different land-use practices in order to categorize the possible risk areas for heavy metal pollution and their possible causes. The soil contamination indices were also evaluated to quantify the areas of heavy metal accumulation. This advanced and comprehensive approach to analyzing soil contamination in a complex area of multiple land uses develops a clear understanding of the spatial variability of soil contamination and provides an applicable and valid evaluation method for soil management at varying spatial scales in typical industrial areas.

Materials and methods

The study area

Soil heavy metal pollution was analyzed in Sheikhupura (31°20′ N–33°05′ N and 73°37′E–74°41′E), Pakistan, an area subject to a large number of industrial activities and effluent discharges. The industrial zone of Sheikhupura District lies around the major sewage and industrial drains in Muridke and Ferozpura (Fig. 1). The district experiences extreme climatic variation between winter and summer. The weather from November to March is cold, with fog from mid-December to February and intermittent light precipitation. The temperature increases in April; May, June and July are very hot, and the monsoon lasts from mid-June to August. The average rainfall of the district is 630 mm. The soil of Sheikhupura has a mixed loam texture, with the Bar in the north-western area, a level prairie with thickly dotted soil known as Missie. The low land along the river Ravi has light loam. The central portion, the Deg Valley, has stiff soil, which is either Rohi or Kallarathi depending on the salt content (Punjab 2016).

Fig. 1 Soil sampling locations and major industries in the study area. Source: Shaheen and Iqbal (2018)

The important characteristics of the study area and its pollution hubs that were included in the analysis of spatial variability are as follows:

  1. District Sheikhupura is a very important industrial area in Punjab, Pakistan. The Sheikhupura Industrial Estate was established in 1969. Punjab Small Industries Corporation (PSIC) is a body corporate enacted in 1973. According to the Directorate of Industries, Punjab, Pakistan (2016), a total of 748 industries are installed in this area. These industries contribute a large amount of waste and effluent containing hazardous and toxic elements. Continuous irrigation with industrial effluent and sewage sludge may increase the vulnerability of soil to contamination.

  2. The study area is the basin for various drains and Nallahs (open cut method) carrying effluent and sewage sludge, according to the Atlas of the Punjab Irrigation Department (2008). There are 220 small and large industries which discharge unfit effluent. A total of 403.95 cusec of effluent is discharged in this area, of which 313.45 cusec is wastewater generated by the industrial units and 90.5 cusec is sewage discharged by the sewerage system of the district municipality into different distributaries and drains (IPD 2008).

  3. No industrial unit or municipality has any arrangement to treat wastewater and effluent before its disposal in drains. Some of the drains are cross-district and cross-boundary in the area. This effluent is used for irrigation to cultivate vegetables in the surveyed area. The study area lies to the north-west of the river Ravi, and the remaining effluent drains may discharge their untreated wastewater into the river, causing aquatic contamination.

This area is considered good for rice, vegetables and other agricultural crops, and its population is also growing because of its proximity to the provincial headquarters, Lahore. To date, no comprehensive study has been carried out in this important area to link the different contamination media at a spatial scale and to identify the extent of contamination. In this area, soil partition management is recommended based upon the monitoring of heavy metal contamination in soil.

Sampling and laboratory analysis of soil samples

The soil samples were collected according to the following criteria to capture the spatial variability of the study area, and standard laboratory procedures were adopted to establish the chemical properties of soil.

  • To obtain a representative sample, diverse sites were included in soil sampling: irrigated areas, alluvium, sites near the industrial platform, effluent-irrigated areas and urban soil (Li et al. 2013c). For this purpose, soil samples were collected using a random sampling technique.

  • The handheld Garmin global positioning system (GPS) was used to record the spatial coordinates of the sampling locations.

  • The soil samples were taken by using tube auger from 60 sampling sites at three depth profiles of soil (0–15 cm, 15–30 cm and 60–90 cm) each, and total 180 (60 × 3) soil samples were collected (Fig. 1).

  • Three replicates of soil were taken for each soil sample that was composited as a single sample, and a total 1 kg of mixed soil per sample was collected.

  • All collected soil samples were kept in labeled zipped plastic bags and transported to the laboratory for further analysis.

  • The soil samples were kept indoor to dry and, later, crushed and sieved through a fine sieve of 2 mm (Lu et al. 2010). The standard procedures (Klute and Dinauer 1986; Margesin and Schinner 2005) were used to determine the chemical properties of soil samples including EC, pH, P, K+ and saturation. The Tyurin’s method was employed to determine the soil organic matter (OM) (Ali et al. 2015; Carter 1993).

  • The acid digestion method for soils and sediments, following USEPA methods 3051A and 3050B, was used to determine the total concentrations of heavy metals in the soil samples (Edgell 1989; Gowd et al. 2010).

  • Further, a graphite furnace atomic absorption spectrometer (GFAAS) was used to measure the concentrations of the selected heavy metals (Cr, Cd and Pb) at their respective wavelengths (Kimbrough and Wakakuwa 1989; Tiwari et al. 2015).

The metal concentrations were expressed in units of mg kg−1 on dry soil basis. A geodatabase was developed for the concentration of all physiochemical properties and heavy metals of soil along with their geographical coordinates in ArcGIS environment. Furthermore, other exploratory analyses and geographical analysis techniques using spatial modeling were employed to model the soil contamination.

Statistical analysis

Minitab 17 statistical software, GeoDa (version 1.12.159) and ArcGIS professional were used to perform the geostatistical analyses and spatial modeling of the soil data in order to determine the sources and context of heavy metal pollution in soil. A graphical statistical summary was produced to examine the skewness, standard deviation and mean of the data. The heavy metal sources in the different soil layers were assessed using multivariate statistical analysis and soil contamination indices. Moran’s I autocorrelation was computed to examine the associations between the measured heavy metals and the spatial characteristics of the area.
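The summary statistics named above can be illustrated with a minimal sketch. It assumes the moment-based skewness coefficient; the estimator used by Minitab may differ slightly, and the function names and the symmetry limit of 1 are illustrative choices rather than part of the study's workflow.

```python
import math

def summary_stats(values):
    """Return (mean, sample standard deviation, moment-based skewness)."""
    n = len(values)
    mean = sum(values) / n
    devs = [v - mean for v in values]
    m2 = sum(d ** 2 for d in devs) / n  # second central moment
    m3 = sum(d ** 3 for d in devs) / n  # third central moment
    sd = math.sqrt(sum(d ** 2 for d in devs) / (n - 1))
    skew = m3 / m2 ** 1.5
    return mean, sd, skew

def roughly_symmetric(skew, limit=1.0):
    """Assumed rule of thumb: skewness strictly inside (-limit, limit)."""
    return -limit < skew < limit

# Hypothetical concentrations (mg/kg), for illustration only.
mean, sd, skew = summary_stats([4.0, 5.0, 5.5, 6.0, 9.0])
print(round(mean, 2), round(sd, 2), round(skew, 2), roughly_symmetric(skew))
```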

Multivariate statistical analysis

Multivariate analysis offers two efficient tools, principal component analysis (PCA) and cluster analysis (CA). PCA and CA were used to identify source factors and to classify the interrelationships between the measured soil heavy metals (Cd, Cr and Pb). PCA was computed with the varimax rotation method, and CA was performed using the average linkage method based on correlation coefficients. The Euclidean distance was employed for measuring the distance between clusters of similar metal contents. The correlation matrix (CM) was computed to support the results obtained by multivariate analysis. Combining multivariate analysis with geostatistical techniques may provide comprehensive outcomes for heavy metals in soil.
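The correlation matrix underlying these multivariate methods can be sketched as follows. This is a plain Pearson correlation under the assumption that each variable is a list of equal length; the function names and the sample values are hypothetical, and the sketch does not reproduce the varimax-rotated PCA or the average linkage clustering themselves.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations, keyed by variable name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical measurements at five sites, for illustration only.
soil = {
    "Cr": [22.0, 25.5, 18.0, 30.1, 27.4],
    "Pb": [12.0, 14.2, 9.5, 15.8, 13.9],
    "Cd": [5.1, 4.8, 5.6, 4.9, 5.0],
}
cm = correlation_matrix(soil)
print(round(cm["Cr"]["Pb"], 2))
```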

Global spatial autocorrelation for soil heavy metals

Spatial autocorrelation analyzes a single dataset at a spatial scale. The correlation of a sampled variable with itself across the study area is assessed from its attribute values and the locations at which they were observed. Moran’s I is an autocorrelation statistic which measures this spatial relationship. Global Moran’s I analyzes the spatial patterns and identifies the spatial dependency of the observed variable. It was estimated separately for each heavy metal x from the soil sample observations at locations i and j, Eq. (1).

$$I = \frac{n}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij}} \cdot \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij} (x_{i} - \bar{x})(x_{j} - \bar{x})}{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}} \quad i \ne j$$
(1)

where I is Moran’s I, xi and xj are the observations at locations i and j, n is the number of observations in the whole region, wij, an element of the spatial weights matrix w, is the spatial weight between locations i and j, and \(\bar{x}\) is the mean of x.

Weight matrices were calculated for 4, 5, 6, 7 and 8 neighbors to depict the level of association between an element and its neighboring elements. The contiguity-based weights were assigned on distance or relations. In the first iteration of Moran’s I autocorrelation only the four neighbors sharing a common boundary were taken into account, and finally, the eight surrounding neighboring cells were included, comprising both shared corners and shared boundaries. Alternatively, for the distance-based method, a threshold distance was assigned and all surrounding locations within that distance were included in the weight matrix. The values of Moran’s I may vary between −1 and 1. Strong positive values of Moran’s I indicate clustering of similar observations among neighboring locations, while negative values indicate that low and high values of the variable are mixed. Moran’s I values near zero indicate no spatial autocorrelation, that is, a random distribution.
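Equation (1) can be illustrated with a short sketch. It assumes binary k-nearest-neighbor weights built from point coordinates, which is one simple way to obtain a fixed number of neighbors per site; the contiguity or threshold-distance weights produced in GeoDa may differ, and the function names, coordinates and values are hypothetical.

```python
import math

def knn_weights(coords, k):
    """Binary weights: w[i][j] = 1 if j is among the k nearest points to i."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance from point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            w[i][j] = 1.0
    return w

def global_morans_i(values, w):
    """Global Moran's I as in Eq. (1), summing over pairs with i != j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(w[i][j] for i, j in pairs)                    # sum of all weights
    num = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Four hypothetical sites on a line; similar values sit next to each other.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = knn_weights(coords, k=1)
print(global_morans_i([1.0, 1.0, 5.0, 5.0], w))  # positive: clustered values
print(global_morans_i([1.0, 5.0, 1.0, 5.0], w))  # negative: alternating values
```

In this toy layout the clustered series gives a positive statistic and the alternating series a negative one, matching the interpretation given above.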

Local indicator of spatial autocorrelation (LISA)

The local spatial autocorrelation statistic (local Moran’s I) analyzes the pattern of the data at a small scale: it evaluates each single point with reference to its neighbors and identifies outliers and clusters. The heavy metal pollution content was identified at the local scale across the diverse land-use locations. LISA was computed according to Eq. (2).

$$I_{i} = \frac{n (x_{i} - \bar{x}) \sum_{j = 1}^{n} w_{ij} (x_{j} - \bar{x})}{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}}$$
(2)

The values in Eq. (2) relate a single point to its local neighbors, while the notation is the same as in Eq. (1). In this geostatistical analysis, LISA was analyzed with a map-based spatial clustering method. Of the four categories of local spatial autocorrelation, two indicate clusters (high–high and low–low) and two indicate outliers (high–low and low–high). LISA indicated the spatial variability and clustering based upon statistically significant values, and it identified the locations which were spatially interesting and those which exhibited spatial heterogeneity.
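A minimal sketch of Eq. (2) with the four-category labeling is given below. It assumes an unstandardized binary weights matrix supplied by the caller and omits the permutation-based significance test that LISA software normally applies; the function name, labels and data are illustrative.

```python
def local_morans_i(values, w):
    """Local Moran's I (Eq. 2) and a quadrant label for every location."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    results = []
    for i in range(n):
        # Weighted sum of neighbor deviations (the spatial lag of location i).
        lag = sum(w[i][j] * dev[j] for j in range(n) if j != i)
        i_local = n * dev[i] * lag / denom
        if dev[i] > 0 and lag > 0:
            label = "high-high"   # cluster of high values
        elif dev[i] < 0 and lag < 0:
            label = "low-low"     # cluster of low values
        elif dev[i] > 0 and lag < 0:
            label = "high-low"    # high outlier among low neighbors
        elif dev[i] < 0 and lag > 0:
            label = "low-high"    # low outlier among high neighbors
        else:
            label = "undefined"   # deviation or lag is exactly zero
        results.append((i_local, label))
    return results

# Four hypothetical sites in a chain; site 2 is a hot spot among low values.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for site, (stat, label) in enumerate(local_morans_i([1.0, 1.0, 9.0, 1.0], chain)):
    print(site, round(stat, 3), label)
```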

Geoaccumulation index (Igeo)

The spatial variation in the contamination data of soil heavy metals makes it necessary to classify the extent of pollution on a computed scale for each observed location. Different soil contamination and enrichment indices indicate the level of heavy metal accumulation. The geoaccumulation index determines soil heavy metal contamination by comparing the current concentration in soil or sediment with the pre-industrial concentration (uncontaminated background sediments). Comprehensive analyses of soil heavy metal contamination have been carried out using the geoaccumulation index. Igeo was originally introduced to evaluate metal contamination in sediments and surface soil by a single factor (Muller 1969). It has been used for the qualitative assessment of heavy metal contamination in soil (Eq. 3) (Marrugo-Negrete et al. 2017).

$$I_{\text{geo}} = \log_{2} \frac{{C_{n} }}{{1.5(B_{n} )}}$$
(3)

where Cn is the current concentration of the selected heavy metal in soil and Bn is the background value, i.e., the natural concentration in sediment (Rahman et al. 2012).
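Equation (3) translates directly into code. The sketch below is illustrative only: the function name is an assumption, and the concentration and background value are made-up numbers chosen so that the arithmetic is easy to follow.

```python
import math

def igeo(c_n, b_n):
    """Geoaccumulation index, Eq. (3): log2(Cn / (1.5 * Bn))."""
    return math.log2(c_n / (1.5 * b_n))

# Hypothetical case: measured 30 mg/kg against a background of 10 mg/kg.
# 30 / (1.5 * 10) = 2, and log2(2) = 1.
print(igeo(30.0, 10.0))
```

A measured concentration equal to 1.5 times the background gives an index of zero, so negative values indicate concentrations below that adjusted background.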

Igeo is a useful index to reduce the factor of human interference when assessing soil contamination in industrial-irrigated areas. The index was calculated from the soil data at three depths, but the index values of heavy metals differed very little among these depths. It was therefore suggested that an accumulated index could be established for the mean contamination of heavy metals over the multiple soil depths. The traditional single factor index (Muller 1969) was adapted, and a comprehensive index was computed from the average concentration of heavy metals. This index represents the overall average metal accumulation in the three subsoil horizons, and MIgeo was calculated by Eq. (4).

$${\text{MI}}_{\text{geo}} = \log_{2} \frac{{(\bar{C}_{n} )}}{{1.5(B_{n} )}}$$
(4)

where \(\bar{C}_{n}\) is the mean concentration of a given metal over the three depths at a given location in Sheikhupura soils. In this study, the effect of parent rock on heavy metal concentration in soil was reduced by the background values (Bn). The Bn values in this index were taken from the standard metal values of WHO (Mekki and Sayadi 2017) and the average element concentrations for the soil types of Pakistan. The constant value of 1.5 compensates for anthropogenic impacts and natural fluctuations of a given metal in soil (Muller 1969). The proposed six classes for MIgeo are presented in Table 5, which shows the range of the contamination index with reference to heavy metals in soil.
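As a minimal sketch of Eq. (4), the concentrations at the three depths are averaged first and the logarithm is taken afterwards. The function name and the numbers are hypothetical; the background value must come from the sources described above.

```python
import math

def mi_geo(depth_concentrations, b_n):
    """Mean geoaccumulation index, Eq. (4), for one metal at one location.

    depth_concentrations: concentrations of the metal at each sampled depth.
    b_n: background value for the same metal.
    """
    mean_c = sum(depth_concentrations) / len(depth_concentrations)
    return math.log2(mean_c / (1.5 * b_n))

# Hypothetical site: 40, 30 and 20 mg/kg at the three depths, background 10.
# Mean = 30, so the index is log2(30 / 15) = 1.
print(mi_geo([40.0, 30.0, 20.0], 10.0))
```

Averaging the concentrations before applying the logarithm is not the same as averaging three per-depth index values, because the logarithm is nonlinear; the sketch follows the form written in Eq. (4).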

Improved Nemerow Index (INI)

MIgeo evaluated the prominent anthropogenic effects and parent rock effects on soil heavy metal contamination at three depths, and this index is appropriate for measuring the degree of heavy metal toxicity in soil in areas where mining and industries are concentrated. However, MIgeo evaluates contamination for a single heavy metal at a time over the three depths. A comprehensive pollution status covering all selected heavy metals was required to assess the soil pollution risk in the study area. The improved Nemerow index was therefore computed in this study by adopting MIgeo as the single factor index.

The improved Nemerow index (INI) was calculated by Eq. (5):

$${\text{INI}} = \sqrt {\frac{{\left( {{\text{MI}}_{\text{geomax}} } \right)^{2} + \left( {{\text{MI}}_{\text{geoave}} } \right)^{2} }}{2}}$$
(5)

where INI is the index for comprehensive contamination of a soil sample, MIgeoave is the arithmetic mean value of MIgeo, and MIgeomax is the maximum MIgeo value of the sample. The proposed classifications were adjusted for INI (Liang et al. 2016) based upon results to be consistent with MIgeo. The classification scheme for this index is shown in Table 7.
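Equation (5) can be sketched as follows, taking the MIgeo values of all metals measured in one sample as input. The function name, the dictionary layout and the index values are illustrative assumptions.

```python
import math

def improved_nemerow_index(mi_geo_values):
    """Improved Nemerow index, Eq. (5), from the MIgeo values of one sample."""
    values = list(mi_geo_values)
    mi_max = max(values)                 # MIgeomax
    mi_ave = sum(values) / len(values)   # MIgeoave (arithmetic mean)
    return math.sqrt((mi_max ** 2 + mi_ave ** 2) / 2)

# Hypothetical MIgeo values for one sample, for illustration only.
sample = {"Cr": 3.0, "Cd": 1.0, "Pb": 2.0}
# max = 3, mean = 2, so INI = sqrt((9 + 4) / 2) = sqrt(6.5)
print(round(improved_nemerow_index(sample.values()), 3))
```

Because the maximum enters the formula alongside the mean, a single strongly enriched metal raises the index even when the other metals are near background.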

Geostatistical analysis

The geostatistical analysis applied spatial autocorrelation and interpolation techniques to the results of the contamination indices (MIgeo and INI). Spatial interpolation was applied to predict the surface of each heavy metal index (Cr, Cd, Pb) in soil separately in order to explore the distribution pattern. Many studies have determined the spatial extent of heavy metal concentrations in soil by different interpolation methods (De la Torre et al. 2012). The regression kriging method was used by Martínez-Murillo et al. (2017), who applied cross-validation to assess the prediction accuracy for topsoil. Iqbal et al. (2005) also applied kriging interpolation to prepare distribution maps of heavy metals; these studies (Iqbal et al. 2005; Tziachris et al. 2017) used traditional kriging interpolation methods for element concentrations.

However, the spatial distribution pattern of the average geoaccumulation index (MIgeo) for each metal (Cr, Cd, Pb) and of the improved Nemerow index (INI) was mapped by the empirical Bayesian kriging (EBK) method, which is new in its application to index-based soil contamination studies. EBK differs from classical kriging in that it accounts for the error introduced by estimating the variogram model. In EBK, many variogram models were estimated and analyzed, and new values were simulated at each given location. Then, new variograms were estimated from the simulated data, and a weight was assigned to each variogram based on Bayes’ rule, which gives the likelihood of the observed data being generated from that variogram model. Repeating this process generated a spectrum of variograms, and the semivariogram used to predict the surface of soil contamination by Bayesian kriging was computed empirically from this spectrum (Krivoruchko 2012; Samsonova et al. 2017). Exponential semivariogram models were considered the best-fitting models for the spatial autocorrelation of the sample distribution.

Results and discussion

Principal component analysis and factor analysis of soil parameters

The factor analysis (FA) for heavy metals (Cr, Cd and Pb) and physiochemical properties of soil was extracted using principal component analysis. FA is a method to investigate the interrelationships between examined parameters (Chandrasekaran et al. 2015). This technique was very useful for reducing the multidimensionality of the datasets to a concise, interpretable set of dimensions. The statistical factors identified for the soil datasets captured most of the variance and explained the distribution of all elements in the effluent-irrigated soil. The analysis was performed for the three depths of soil data: three factors were extracted for topsoil and deeper soil, while four factors were derived for subsoil properties based on eigenvalues > 1. The subsoil had negative loadings on three factors, which indicated greater variability in the subsoil metals and other soil properties (Table 1). In F1 of topsoil, all three heavy metals (Cr, Cd and Pb) loaded positively (0.65, 0.59 and 0.36) at 27.2% cumulative variance, which suggested a common anthropogenic source, while pH and saturation loaded negatively on F1. EC, pH and Pb had strong positive loadings on F2 (cumulative variance 48.3%), which indicated their similar dimensionality. All three heavy metals loaded negatively on F3 (cumulative variance 62.2%), and all other parameters loaded positively, which indicated the pollution source of topsoil by heavy metal contamination. In the subsoil profile, all soil properties except saturation loaded positively on F1 (25.7%), and the subsoil loaded on four factors (cumulative variance 71.6%) because the downward percolation of elements changed the dimensionality and affected the pollution source. The deeper soil showed 66.5% cumulative variance on three factors, with heavy metals loading positively on F1 and negatively on F3. The soil properties (OM, P, K, Sat) loaded positively on F2 at a cumulative variance of 50.61%, while heavy metals loaded positively on F1 in all three soil horizons and were considered the major source of contamination.

Table 1 Principal component analysis of soil properties

The distribution pattern of soil properties showed three clusters, which revealed their relevance and interrelationships (Fig. 2). The first cluster comprised the pH and EC of soil, which are important physiochemical properties that reflect the salinity or alkalinity of the soil induced by various urban and industrial activities (Chandrasekaran and Ravisankar 2015). The second cluster comprised the heavy metals (Cr, Cd and Pb), which tend to be major carcinogenic elements in soil and originate mainly from industrial effluents. The third cluster comprised the interconnected group of soil nutrients (OM, K, P) and saturation; these originated from natural sources in the loamy soil, and their availability was less affected by the industrial effluents (Qishlaqi and Moore 2007).

Fig. 2 The biplot of soil properties in the PCA indicating the dimensionality

Furthermore, a dendrogram analysis based on the correlation coefficient method was performed on all soil properties to explore the variables and their linkages. The similarity level at which all soil variables were linked was 47.91% (Fig. 3). Pb and EC fell in the same group, with a similarity near 70%. OM and P showed the maximum similarity, 88%, which indicated the same group and a natural soil source. Cr and Pb showed a similarity of 68%, indicating the same contamination group. However, Cd joined at a similarity of 52%, which reflected its variation at the spatial scale in the soil samples (Tejeda-Benitez et al. 2016). K and saturation had a similarity of 65%, showing the same group. These linkages showed that the nutrients in the soil came from natural sources and showed no significant variability due to pollutant accumulation.

Fig. 3 The dendrogram analysis on similarity of variables

Statistical analysis of heavy metals concentration in soil

The concentrations of the selected heavy metals are reported in Fig. 4a–c for each soil depth. Skewness was assessed against the range −1 to 1, and all three heavy metals showed an approximately symmetrical, normal distribution. Cr and Cd in the deeper soil showed some skewness, at 0.70 and 0.89 (Fig. 4c), and Cd and Pb showed smaller skewness values of 0.60 and 0.72 in the subsoil (Fig. 4b); because these values lie within −1 to 1, the distributions were still considered normal. The total concentration of each metal varied among soil samples across the soil profile and at the spatial scale, depending on whether samples were collected from effluent-irrigated or non-effluent areas (Soodan et al. 2014). The order of metal enrichment was Cr > Pb > Cd, and concentrations decreased successively from topsoil to subsoil for all three metals, Pb, Cr and Cd (mean ranges 14.1–5.8 ppm, 26–10.4 ppm and 6–4 µg ml−2, respectively). The variability of metal bioavailability and accumulation was analyzed in the dendrogram through the mutual metal-to-metal correlations (Hasan et al. 2017). Heavy metals may percolate and leach downward into subsoil horizons, subject to variation in hydrogen ions, water, pH and EC (Leung et al. 2017).
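The skewness screening described above can be sketched in a few lines. The example uses the moment coefficient of skewness; statistical packages often apply a small-sample correction, so the exact estimator behind Fig. 4 is an assumption. The function names and the sample values are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def within_skew_limits(values, limit=1.0):
    """True when skewness lies strictly between -limit and +limit."""
    return abs(skewness(values)) < limit


# Hypothetical concentrations (ppm), for illustration only.
symmetric_sample = [10.0, 12.0, 14.0, 16.0, 18.0]
skewed_sample = [1.0, 1.0, 1.0, 10.0]
print(within_skew_limits(symmetric_sample), within_skew_limits(skewed_sample))
```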

Fig. 4 Summary statistics by using Anderson–Darling normality test on heavy metals (Cr, Cd, Pb) for three layers of soil, a topsoil, b subsoil and c deeper soil

The Environmental Protection Agency (EPA) permissible limits for Cd are 0.43–70 ppm. The annual rate of Cd accumulation in soil may be a cause for concern at different locations. Cadmium intake causes a range of chronic health problems in humans. Cadmium belongs to a small group of highly toxic metals for which the FAO/WHO has set a provisional limit for daily intake by humans (Chunhabundit 2016). European Union legislation also defines the maximum permissible concentration (MPC) of Cd in a range of edibles and foodstuffs (Alves 2017). Cr is a highly carcinogenic trace metal, and its behavior varies widely with its different valence states (Su et al. 2016). The Cr concentration showed an increasing trend from the upper soil to the subsoil, and this metal has high mobility in the food chain and human organs (Friehs et al. 2016). A range of Pb observations were within the given standard control limits (Fig. 4). Lead is considered a poisonous metal that poses a public health hazard, especially for young children, and is one of the global environmental problems (Arnemo et al. 2016); many countries prohibit Pb additives in paint and petrol. In soils, Pb is less labile than Cd, and even at high concentrations its uptake by plants is usually slight (Kede et al. 2014). Therefore, it was necessary to identify the spatial patterns of heavy metal distribution in the study area.

Spatial autocorrelation of heavy metals in soil

Soil samples were stored and arranged in a geodatabase for geospatial modeling and analysis. The spatial autocorrelation analysis of soil heavy metals was performed using GeoDa software including global and local Moran’s I.

Weight matrix influence on spatial dependency (global Moran’s I)

A neighborhood structure was imposed on the samples using eight classes of spatial weight matrices, with varied neighbor counts and distance bands, to assess the spatial dependency of the three heavy metals. The global Moran’s I values for each metal at the given distances are shown in Table 2. Moran’s I was evaluated at the 95% confidence level (p < 0.05). All eight spatial weights yielded significant, positive spatial correlations for the heavy metals (Cr, Cd and Pb), except that Cr had low Moran’s I values (closer to 0) with the distance band weight. Cr also showed weak but significant spatial correlation with the four-neighbor spatial weight. Cd and Pb likewise had low Moran’s I with the distance band weight. However, significant spatial correlation was found for all metals in the connectivity histogram and with the other types of weights. This indicated that the choice of eight or four neighbors did not affect the direct relations between neighboring locations. Increasing the number (k) of nearest neighbors usually decreased the spatial autocorrelation coefficients, following the rule that attribute similarity declines with distance (Table 2). Relatively higher Moran’s I values were found for all metals with the 4 km distance band in the spatial weight matrix. Moreover, the distance-based weight matrix is the rational choice for irregularly spaced samples (Huo et al. 2011). Therefore, the distance-based weight matrix was used for the subsequent spatial autocorrelation analysis of the heavy metals (Cr, Cd and Pb), which showed positive and significant spatial autocorrelation.
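The global statistic discussed above can be illustrated with a minimal sketch of Moran’s I under a binary distance-band weight matrix. This is not the GeoDa implementation; the function names, the coordinates and the concentrations are hypothetical, and the significance test (permutation inference) is omitted.

```python
import math


def distance_band_weights(coords, threshold):
    """Binary weights: 1 if two distinct points lie within the threshold distance."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical transect: six sites 1 km apart, with a 1 km distance band.
sites = [(float(k), 0.0) for k in range(6)]
w = distance_band_weights(sites, threshold=1.0)
print(morans_i([1, 1, 1, 5, 5, 5], w))  # clustered values give a positive I
print(morans_i([1, 5, 1, 5, 1, 5], w))  # alternating values give a negative I
```

Widening the threshold adds more distant, less similar neighbors to each site, which is the mechanism by which the coefficient tends to fall as the neighborhood grows.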

Table 2 Moran’s I spatial autocorrelation for heavy metals in soil by using weight influence

The spatial weight was selected empirically, and within a given distance limit the same weight was assigned to all points, so the choice of weights had a certain impact on the spatial autocorrelation results for heavy metals (Li et al. 2013b). The spatial weights had a reasonable influence on the spatial autocorrelation of the selected heavy metals when they were designed on the basis of distance decay.

The Moran’s I scatter plots (Fig. 5) showed the spatial interrelationship for all samples of Cr, Cd and Pb, where the horizontal axis shows the standardized heavy metal concentration and the vertical axis shows the standardized, spatially lagged concentration of the neighboring samples. A significant portion of the samples of these three metals clustered along the regression line in the upper right and lower left quadrants, indicating positive spatial autocorrelation (Huo et al. 2012), and showed the overall spatial pattern. A smaller part of the samples fell in the lower right and upper left quadrants, representing negative spatial autocorrelation, and was considered negligible. The scatter plot tended to become more dispersed as the coefficients of spatial autocorrelation decreased, and some samples far from the Moran’s I regression line strongly influenced the global spatial autocorrelation (Golden et al. 2015), mainly for Pb and Cr, representing some local non-stationarity in the sample points. Therefore, the variability in the spatial patterns of all samples should be considered.

Fig. 5 The Moran’s I pattern at the weight influence of four neighbors for the distribution of Cr, Cd and Pb in soil

LISA for local spatial variability and clustering

Although the three heavy metals (Cr, Cd and Pb) showed significant positive global spatial autocorrelation, LISA identifies the local patterns that indicate the spatial variability or clustering of samples (Li et al. 2014). There was no significant spatial pattern for more than half of the samples of these metals (Table 3). Significant spatial clusters were identified for 25% of the Cr samples, about 58.3% of the Cd samples and 33.3% of the Pb samples, showing the largest spatial pattern of heavy metals. Among the soil heavy metals, the high–high patterns numbered more than half of the low–low patterns, while the low–low pattern of the three heavy metals dominated the overall spatial pattern of the study area. Almost 5–9% of the heavy metal samples were significant spatial outliers, predominantly of the low–high pattern for Cr, Cd and Pb. These three heavy metals showed significant spatial patterns, which pointed to strong ongoing enrichment processes (Yan et al. 2015) of Cr, Cd and Pb in the soils of Sheikhupura irrigated by industrial effluent.
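The local patterns named above (high–high, low–low, high–low and low–high) can be illustrated with a minimal sketch of the local Moran statistic using row-standardized weights. It omits the permutation-based significance test that LISA software applies before mapping a cluster, so it only shows how each sample is assigned to a quadrant. The function name and the example data are hypothetical.

```python
def local_moran(values, weights):
    """Return (local I, quadrant label) for each sample, using row-standardized weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n  # variance of the deviations
    results = []
    for i in range(n):
        row_sum = sum(weights[i])
        # Spatial lag: weighted mean deviation of the neighbors of sample i.
        lag = sum(weights[i][j] * z[j] for j in range(n)) / row_sum if row_sum else 0.0
        local_i = z[i] * lag / m2
        if z[i] >= 0 and lag >= 0:
            label = "high-high"
        elif z[i] < 0 and lag < 0:
            label = "low-low"
        elif z[i] >= 0:
            label = "high-low"  # high value surrounded by low neighbors
        else:
            label = "low-high"  # low value surrounded by high neighbors
        results.append((local_i, label))
    return results


# Hypothetical chain of six sites where each site neighbors the adjacent ones.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(6)] for i in range(6)]
for local_i, label in local_moran([1, 1, 1, 5, 5, 5], chain):
    print(round(local_i, 2), label)
```

Samples labeled high–low or low–high correspond to the spatial outliers reported in Table 3, while the two matching labels correspond to clusters.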

Table 3 The spatial distribution of heavy metals pattern (%) in soil

Further, the LISA maps revealed notable trends in the spatial patterns of the heavy metals (Li et al. 2014). The northeast region for Pb (Fig. 6a), the southwest for Cr (Fig. 6b) and the southeast for Cd (Fig. 6c) were strongly influenced by the high–high pattern. These were the zones where industrial clusters were located, and some nearby areas were irrigated with effluent from the industrial drains (Fig. 1) in Sheikhupura. Furthermore, anthropogenic activities had begun to alter the low–low spatial pattern (Tang et al. 2013), so the high–low outliers of the three metals were mainly distributed in the western region, near the low–low spatial clusters. The central, northeast and southeast regions of Sheikhupura district have a long history of wastewater and industrial effluent irrigation from the bed drain, Deg Nallah and Niki Deg, which led to heavy metal contamination.

Fig. 6 The LISA cluster maps of heavy metals (a Pb, b Cr, c Cd) in soil

However, the LISA map was modeled from the soil heavy metal samples alone, so it was essential to quantify the contamination boundaries between the different spatial patterns. Furthermore, pollution zoning was needed for the three heavy metals to map their spatial distribution. The dispersal and spatial extent of soil heavy metals could be used to demarcate potential remediation and monitoring zones. Additionally, soil contamination indices were calculated to describe the extent of contamination in the study area.

Heavy metals contamination indices

The detailed results of Igeo and MIgeo are shown in Tables 4 and 5, while the contamination classification schemes for these soil contamination indices are presented in Tables 6 and 7. The mean Igeo and MIgeo values for Pb were below class 1 and fell in the least contaminated class, while the maximum Igeo and MIgeo values in the topsoil slightly exceeded class 1. The major sources of lead pollution in soil were inorganic metal-containing chemicals released by general anthropogenic activities such as agriculture, traffic and industry (Shakir et al. 2012). Regular irrigation with industrial effluents adversely affected the soil quality (Ma et al. 2015); this increased metal bioaccumulation, which led to soil infertility. Furthermore, fertile soil may turn into contaminated, saturated alkaline soil as the amount of pollutants increases (Bolan et al. 2014).
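For orientation, the geoaccumulation index is commonly written as Igeo = log2(Cn / (1.5 Bn)), where Cn is the measured concentration and Bn is the geochemical background. The sketch below assumes this standard form and the usual integer class boundaries (Igeo ≤ 0, 0–1, 1–2, and so on up to > 5); the classes actually applied in this study are those of Table 6. The function names and the background value are hypothetical.

```python
import math


def igeo(concentration, background, factor=1.5):
    """Geoaccumulation index: log2(Cn / (factor * Bn)), assuming the standard form."""
    return math.log2(concentration / (factor * background))


def igeo_class(value):
    """Integer class: 0 for Igeo <= 0, then 1..6 for (0,1], (1,2], ..., (4,5], > 5."""
    if value <= 0:
        return 0
    return min(6, math.ceil(value))


# Hypothetical example: 30 ppm measured against an assumed 10 ppm background.
index = igeo(30.0, 10.0)
print(index, igeo_class(index))
```

The factor of 1.5 is conventionally included to allow for natural fluctuation of the background concentration.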

Table 4 Results of Igeo index
Table 5 Improved Nemerow index (In)
Table 6 Classes of Igeo index
Table 7 Classes of improved Nemerow index

The chromium concentration was high around the loamy drains carrying tannery waste and in the adjacent areas irrigated with industrial effluent. The general contamination index (Table 4) also showed high concentrations of Cd and Cr along the roadsides and in the vicinity of the main drain. The Cr and Cd pollution originated from the tannery industry and from other chemicals used in industrial processes, such as sodium sulfite, sodium sulfide, ammonium chloride, hydrogen peroxide, chromate, aldehyde tanning agents, sodium bicarbonate, chloride and fungicide; these are used extensively in a variety of processes and are very toxic in nature (Mebrahtu and Zerabruk 2011). Many studies have found a significant positive correlation between the heavy metals in soil and those in industrial effluent, which indicates that the transfer of heavy metals and the source of soil pollution are closely related to industrial effluents (Adewuyi et al. 2014).

Spatial distribution of contamination in soil

The spatial extent of Igeo for the three major metals (Cd, Cr and Pb) in the soil was also shown with interpolation maps in Figs. 7, 8 and 9. The interpolation maps revealed the spatial distribution of heavy metal (Cd, Cr and Pb) contamination and its index-based classification in the areas irrigated by effluent and drains in the vicinity of the industrial center (Shaheen and Iqbal 2018). The pollution rate in the different index classes gave a comprehensive picture of contamination in the area (Santos-Francés et al. 2017a). The maps were developed by evaluating the geostatistical interpolation method (EBK) for the best-fitting semivariogram γ(h), simulated iteratively with the exponential model, and the nugget (error factor) values of the selected variogram models were very low (0.26 for Pb, 0.41 for Cd and 0.48 Cd). A low nugget value in the spatial interpolation model indicates that the modeled spatial variability is reliable (Turner et al. 2000).
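The role of the nugget in an exponential semivariogram can be illustrated with a minimal sketch. It assumes the common parameterization γ(h) = c0 + c(1 − exp(−h/a)) for h > 0, with nugget c0, partial sill c and range parameter a; some packages scale the exponent by 3 so that a is the practical range. This is a single model curve, not the EBK procedure, which simulates many semivariograms. Only the nugget of 0.26 echoes the value reported for Pb; the partial sill and range are hypothetical.

```python
import math


def exponential_semivariogram(h, nugget, partial_sill, range_param):
    """gamma(h) = nugget + partial_sill * (1 - exp(-h / range_param)) for h > 0."""
    if h <= 0:
        return 0.0  # by definition the semivariance at zero lag is zero
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Nugget 0.26 as reported for Pb; partial sill 1.0 and range 4.0 km are assumed.
for lag_km in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(lag_km, round(exponential_semivariogram(lag_km, 0.26, 1.0, 4.0), 3))
```

The smaller the nugget is relative to the sill (nugget plus partial sill), the larger the share of the variance that the model attributes to spatially structured variation rather than to measurement error or micro-scale noise.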

Fig. 7 Spatial distribution of MIgeo of Pb in soil

Fig. 8 Spatial distribution of MIgeo of Cr in soil

Fig. 9 Spatial distribution of MIgeo of Cd in soil

The spatial extent of the geoaccumulation index showed that the major pollution load of all three heavy metals was around the Nalas fed by the different drains of industrial effluent, i.e., the Wake Nala, Deg Nala, Mandal Nalah, Bed Nala and Ferozpura Drain (Figs. 7, 8, 9). Figure 7 shows the Igeo extent of Pb: the north and southeastern parts fell in the 1 < Igeo ≤ 2 class, which is moderately contaminated. In Fig. 8, the Igeo of Cr showed a heavily to moderately contaminated area (4 < Igeo ≤ 5) in the northwest, where effluent irrigation was practiced and most leather and tannery work was carried out. The Igeo of Cd (Fig. 9) indicated heavy contamination (3 < Igeo ≤ 4) around the central and slightly southwestern part, which suggested downstream leaching in this area. The improved Nemerow index (IN) spatial extent map (Fig. 10) showed the overall contamination of the industrial zone of Sheikhupura by all three heavy metals, and the major contamination was observed near the Deg Nallah irrigation area and its drainage point near the river Ravi. Soil and heavy metal interactions fluctuate greatly because of complex soil properties and the heterogeneity of various morphological features resulting from the availability of heavy metals, water and pedogenetic processes (Alonso et al. 2016).
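The way a Nemerow-type index combines several metals into one value can be sketched as follows. The sketch assumes the improved Nemerow index takes the form IN = sqrt((Igeo_max² + Igeo_ave²) / 2), i.e., the classic Nemerow formula with Igeo in place of the single pollution index; the exact formulation used for Fig. 10 should be checked against the methods section. The function name and the Igeo values are hypothetical.

```python
import math


def improved_nemerow(igeo_values):
    """sqrt((max(Igeo)**2 + mean(Igeo)**2) / 2) for the metals at one sampling site."""
    highest = max(igeo_values)
    average = sum(igeo_values) / len(igeo_values)
    return math.sqrt((highest ** 2 + average ** 2) / 2.0)


# Hypothetical Igeo values for Pb, Cd and Cr at a single site.
print(round(improved_nemerow([1.0, 2.0, 3.0]), 3))
```

Because the maximum enters the formula alongside the mean, a single strongly enriched metal raises the combined index even when the other metals are near background.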

The comparison of related studies

The aim of this article was to identify an effective and comprehensive model for monitoring soil contamination at the spatial scale, as a basis for better understanding and eliminating soil pollution. The laboratory analytical methods and spatial analyses were compared with those of previous studies (Table 8). However, some adaptation and modification are always needed to fill the gaps and to model contamination in soil and environmental processes comprehensively (Fig. 10).

Table 8 Comparison of different analytical features with previous work
Fig. 10 Spatial distribution of INI of heavy metals pollution in soil

Conclusion

The present study revealed the extensive effects of industrial effluent on soil contamination levels. Irrigation with effluent has enriched the soil with toxic heavy metals, which are ultimately transported to plants, vegetables and the food chain. The PCA and FA identified heavy metals and their clustering as the source of contamination, while global and local Moran’s I identified the distribution pattern of heavy metals using four- to eight-neighbor weights and showed strong positive spatial autocorrelation. The LISA maps showed that the eastern and southeastern parts were affected by high–high clustering for all three metals, which suggested potential areas of contamination hazard. Furthermore, based on the spatial clusters, the soil pollution indices were calculated to quantify the extent of contamination. The heavy metal contamination was observed on a scale from < 0 to > 5 in six different contamination classes. The vegetables and soil irrigated with the effluents accumulate elevated levels of heavy metals, which cause environmental health problems and reduce soil fertility. Although vegetables cultivated in the polluted soil may contain very high levels of heavy metals (Cd, Pb, Cr), the farmers were apparently unaware of the environmental contamination and the possible health hazard to the local population. Since this study assessed only some carcinogenic heavy metals, further investigations could analyze other heavy metals and their effects on different food plants and on human health. Adaptive geospatial soil contamination modeling could further be applied to point and non-point soil contamination monitoring and to the modeling and simulation of contamination in different environmental pathways. The assessment of these areas can help to develop policies and measures and to evaluate the pollution processes and their spatial variations.