Introduction

Water is the elixir of life and is among the most precious resources. Water makes up over 70% of the body weight of practically all living things. Without water, life on this planet is impossible. About 97.2% of the water on the planet is saline, with only 2.8% being fresh water, of which about 20% is groundwater. Because it has qualities that surface water does not, groundwater is held in high esteem [1]. Groundwater is the most significant supply of drinking water in the country, and it is essential for the country’s development. It also helps people meet their numerous needs. The unrestricted use of groundwater, urbanization, industrialization, and agricultural activities all result in a massive amount of contaminated water [2]. In addition, the inappropriate use of chemical fertilizers, particularly nitrogen fertilizers, to boost crop yields, as well as the transport of urban and industrial effluent, are believed to raise nitrate concentrations in groundwater [3]. The quality of recharge water, atmospheric precipitation, inland surface water, and subsurface geochemical processes all affect groundwater quality [4]. However, the nature and quantity of contaminants are determined by the geology of the river’s course and the quality of the water it supplies [5].

Groundwater quality (GWQ) assessment is of utmost importance because consumption of polluted water is detrimental to people’s health, corporate growth, and societal welfare [4]. The quality of groundwater is determined using the water quality index (WQI). Horton [6] was the first to develop the WQI on the basis of a weighted arithmetic approach. The WQI is a simple and effective approach for determining water quality [7]. It is a one-dimensional number that ranges from 0 to 100. It is a numeric rating system that shows the quality status of water (excellent, good, bad, etc.) at a specific location based on a variety of water quality factors. As a result, the WQI is being used as a crucial tool for comparing groundwater quality [8]. Many researchers have conducted studies over the last two decades using WQI to assess groundwater quality [9,10,11]. In the past, various indexing methods, such as Prati’s Index of Pollution, Bhargava’s Index, Oregon WQI, Dinius’ Second Index, Weighted Arithmetic Water Quality Index (WA WQI), Canadian Council of Ministers of the Environment Water Quality Index (CCME WQI), and National Sanitation Foundation (NSF), were adopted for the evaluation of groundwater quality [12]. However, in this study, based on the physico-chemical parameters available, four different indexing methods were adopted to determine the groundwater quality of the study area, viz., Nemerow’s pollution index (NPI), WA WQI, Groundwater Quality Index (GWQI), and CCME WQI.

In recent years, GIS techniques have frequently been used to monitor and evaluate groundwater quality [13]. Groundwater assessment has traditionally relied on laboratory testing, but GIS has made it much easier to connect multiple databases [14]. The inverse distance weighted (IDW) interpolation method, used together with GIS technology, has been shown to be an effective approach for interpreting and analyzing spatial information on groundwater. It is an economical and time-saving method for converting large data sets into different spatial distribution maps and projections that indicate patterns, correlations, and sources of pollutants [15]. In this study, the spatial evaluation of all eight groundwater quality parameters was done with the help of the GIS technique. Also, spatial distribution maps based on different WQIs were prepared. Several studies have been conducted to assess groundwater quality using WQI within the context of a GIS framework [13, 15,16,17].

Multivariate statistical analysis (MSA) is an efficient tool for analyzing the properties of physico-chemical parameters in groundwater and determining the relationship between them [18]. These techniques could be used to find correlations between parameters and sample locations, highlight relevant variables and sources that influence groundwater quality, and offer effective tools for both water resource management and groundwater quality monitoring [19]. In this study, MSA such as correlation analysis, principal component analysis (PCA), and cluster analysis (CA) were adopted to elucidate the relationships between water quality variables and possible factors, as well as their effects on water quality. Sadat-Noori et al. [20] used correlation analysis to determine the correlation coefficient (r), which depicts the correlation between variables. PCA is a useful tool for elucidating massive data sets in complicated forms and reducing distortion in processes. It also encourages us to be aware of potential pollution sources or variables that affect water quality [21]. CA is used to examine the spatial groupings of the sampling locations. It is a widely used approach for grouping variables into clusters [22]. Various researchers have used this concept.

In the past, GWQ assessment was done with the help of different water quality indices. Hamlat et al. [23] adopted 10 such WQIs to evaluate the water quality of the Tafna basin. The results of that study reveal that CCME WQI and BC WQI were the best indices to describe the water quality of the basin. Also, numerous studies have been carried out using MSA to determine the GWQ. Acikel et al. [24] employed various MSA techniques such as FA, CA, and correlation analysis to determine the quality of water in the Azmak spring zone, Turkey. The study reveals that MSA is an important technique for describing the groundwater flow mechanism. However, with recent advancements in technology such as GIS, much work has been done under the framework of GIS to evaluate water quality. Ram et al. [25] used GIS and WQI for the assessment of groundwater quality in Mahoba district, Uttar Pradesh, India. The study concluded, with the help of a WQI map, that the overall quality of water in the area is suitable for drinking. Recently, many combined approaches have been adopted for evaluating the GWQ. Roy et al. [26] suggested the combined application of WQI and MSA for the evaluation of GWQ in West Tripura, India.

In the present study area, groundwater is the primary source of drinking water. However, minimal work has been performed on GWQ assessment of Ujjain City [e.g., 27, 28]; no study has yet integrated WQI, GIS, and MSA to assess the GWQ of this region. Thus, there is a research gap here, and more discussion is needed to have a better understanding of the extent and causes of GWQ degradation. In light of these considerations, the city of Ujjain in Madhya Pradesh, India, has been chosen for the purpose of an exhaustive study using the integrated approaches of different water quality indexes with GIS and multivariate statistical analysis. The purpose of this study was to accomplish the following objectives: (1) to use different indexing methods and GIS techniques to analyze the groundwater quality for its suitability for drinking as per BIS 10500:2012; (2) to categorize all 54 samples as excellent, good, or poor based on ratings from various indexing methods such as NPI, WA WQI, GWQI, and CCME WQI; (3) to develop thematic maps for individual physicochemical parameters and also WQI maps based on indexing methods using GIS; (4) to evaluate the variations in, and interpret, a large and complicated GWQ data set using MSA techniques such as correlation analysis, PCA, and CA; and (5) to find the wards that all indexing methods consistently identify as unfit for drinking water consumption.

Study Area

Ujjain is a historic city in Madhya Pradesh, India, that sits on the banks of the Shipra River. During ancient times, the city was known as Ujjayini. It is one of the most populous cities in Madhya Pradesh. It serves as the administrative hub of the Ujjain district. The Ujjain Municipal Corporation comprises a total of 54 wards. The latitude and longitude of the city are 23°10′58″N and 75°46′38″E, respectively. The city of Ujjain covers a total area of 93 km². According to Census India’s provisional reports, the population of Ujjain City was 515,215 (5.15 lakhs) in 2011, and the forecasted population of Ujjain in 2022 is 5.70 lakhs, with a population growth rate of 10% over the decade. The city of Ujjain is segmented into pedeplains (shallow, deep, and moderate), residual hills, valley fills, flood plains, and other geomorphological features, a few of which have good groundwater potential, such as pedeplains (deep) and valley fills [28]. It has a pleasant monsoon climate. Winter begins in mid-November and is comforting and cool, with a daytime temperature of 20 °C, while the night-time temperature can drop dramatically. The annual rainfall of Ujjain city is 892.9 mm. On average, the elevation is 491 m. The temperature ranges between 8 and 40 °C.

Materials and Methodology

Sampling and Analysis

In this study, eight physicochemical properties, namely pH, turbidity, electrical conductivity (EC), total dissolved solids (TDS), alkalinity, hardness, chloride (Cl), and fluoride (F), were determined for 54 groundwater samples collected from dug wells, bore wells, and hand pumps, and were compared with BIS 10500:2012 for drinking purposes. The sampling locations were spaced apart from one another to provide a broad investigation of the study area’s water quality. The samples were gathered in clean and dry plastic bottles from different sources after draining the water for a few minutes. All 54 samples were examined for the eight parameters using the procedures outlined by the American Public Health Association (APHA 2017). Table 1 summarizes the methodology, which comprises the analytical techniques, software, and instruments used to complete the work. The overall evaluation of GWQ was done using four different indexing methods: NPI, WA WQI, GWQI, and CCME WQI. Moreover, ArcGIS 10.8 was used to prepare the digitized base map of the study area as shown in Fig. 1. Using the spatial analyst tool from the tool box, the Inverse Distance Weighted (IDW) technique was selected for preparing interpolated maps. The MSA techniques such as correlation analysis, PCA, and CA were executed using Minitab statistical software.

Table 1 Analytical techniques adopted for analysis
Fig. 1 Index map of study area

Methodology

The methodological flowchart illustrating the details of various steps involved in evaluating the GWQ is shown in Fig. 2. The methodological details for GIS analysis and mapping of groundwater parameters, GWQ modeling, and GWQ analysis have been presented in the flowchart.

Fig. 2 Methodological flowchart

GIS Analysis and Mapping of Groundwater Parameters

GIS aids in the interpolation of various experimental data in order to create thematic and spatial maps. It allows for the statistical development of a relationship in order to summarize the GWQ of the area in a simplified visual form. The most commonly used and acknowledged methods for generating spatial distribution maps are Inverse Distance Weighted (IDW), kriging, and cokriging. However, in this study, the IDW interpolation method in ArcGIS 10.8 software has been utilized to create spatial distribution maps of all parameters. The IDW interpolation method estimates the value at an unsampled location as a distance-weighted average of the surrounding measured values, with the closest point receiving the most weight and the weight decreasing as the distance increases. Furthermore, numerous researchers [25, 29] have employed this technique to create spatial distribution maps for different parameters. Table 2 shows statistical analysis of analyzed physicochemical properties. Table 3 shows NPI calculation for ward 1. Table 4 shows classification of 54 groundwater samples for drinking based on WQI values obtained using different indexing methods.

Table 2 Statistical analysis of analyzed physicochemical properties
Table 3 Calculation of NPI for Ward No. 1
Table 4 Classification of 54 groundwater samples for drinking based on WQI values obtained using different indexing methods [30,31,32,33]
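
The IDW estimate described above can be illustrated with a minimal sketch. It is not the ArcGIS implementation: the function name `idw`, the well coordinates, and the TDS values are hypothetical, and the distance exponent (`power=2.0`) is an assumption, since the exponent used in the study is not stated.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (xi, yi, value) samples by IDW.

    The exponent `power` is an assumed setting, not taken from the study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        weight = 1.0 / distance ** power  # closer points get larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical wells: (x, y) in arbitrary map units, TDS in mg/l.
wells = [(0.0, 0.0, 400.0), (10.0, 0.0, 800.0), (0.0, 10.0, 600.0)]
tds_mid = idw(wells, 5.0, 0.0)  # point halfway between the first two wells
print(round(tds_mid, 1))
```

Because the weights fall off with distance, the estimate at a sampled location equals the measured value there, and estimates elsewhere always lie between the minimum and maximum of the measured values.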

Groundwater Quality Modeling

In this study, the GWQ of all 54 samples were modeled using four indexing methods: NPI, WA WQI, GWQI, and CCME WQI [30,31,32,33]. All these methods are discussed in detail in the following sections.

NPI

The NPI was created to measure the impact of each individual pollutant and is used to calculate the total harm caused by pollution. It is an overall pollutant indicator that takes into account the combined influence of several pollutants for a given application. For the establishment of an index for any given function, such as drinking and irrigation, the index value includes the relevant parameters [30]. Based on NPI, the water quality is classified into five classes [Excellent (< 10), Good (10–20), Poor (20–30), Very Poor (30–40), Unsuitable (> 40)]. Gauns et al. [34] have adopted this approach to determine the quality of drinking water. NPI is calculated with the help of the following equations:

$${\mathrm{PI}}_{n}={\mathrm{C}}_{n}/{\mathrm{S}}_{n}$$
(1)
$$\mathrm{NPI}=\sum {\mathrm{PI}}_{n}$$
(2)

where PIn is the nth parameter’s pollution index, Cn is the nth parameter’s observed value, Sn is the nth parameter’s permissible value, and NPI is the Nemerow’s pollution index.
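
Equations (1) and (2) can be sketched in a few lines. The sample values below are hypothetical and use only three parameters for brevity; the permissible values are the drinking-water limits quoted in this paper for TDS, hardness, and chloride.

```python
def pollution_indices(observed, permissible):
    """Eq. (1): PI_n = C_n / S_n for every parameter."""
    return {name: observed[name] / permissible[name] for name in observed}


def npi(observed, permissible):
    """Eq. (2): NPI is the sum of the individual pollution indices."""
    return sum(pollution_indices(observed, permissible).values())


# Hypothetical sample (mg/l); not a measured ward from the study.
observed = {"TDS": 1000.0, "hardness": 450.0, "chloride": 250.0}
permissible = {"TDS": 500.0, "hardness": 300.0, "chloride": 250.0}

print(pollution_indices(observed, permissible))
print(npi(observed, permissible))
```

A parameter exactly at its permissible value contributes 1 to the sum, so with eight parameters a sample that sits at every limit would score 8.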

WA WQI

WA WQI is used to determine the quality of water for drinking by using the selected physicochemical parameters. In this study, eight parameters were taken into consideration to compute the WA WQI. The quantitative assessment of GWQ using WA WQI was carried out using Brown’s method [31]. Based on WA WQI the water quality is classified into five classes [Excellent (0–25), Good (26–50), Poor (51–75), Very Poor (76–100), Unsuitable (> 100)]. This approach has been widely adopted in the past by many researchers [35, 36]. The following steps were used to calculate the WA WQI:

  1. Unit weight (Wn): The unit weight Wn is inversely proportional to the standard value Sn of the corresponding parameter. The Wn of each parameter is given in Table 5.

    $${\mathrm{W}}_{n}=\mathrm{K}/{\mathrm{S}}_{n}$$
    (3)

    where Wn is the nth parameter’s unit weight, Sn is the nth parameter’s standard value, and K is the proportionality constant.

  2. Subindex (qn): The subindex is calculated by the following equation:

    $${\mathrm q}_n=\frac{\left(V_n-V_o\right)}{\left(S_n-V_o\right)}\times100$$
    (4)

    where Vn is the observed value of the nth parameter, Sn is the standard value of the nth parameter, and Vo is the ideal value of the parameter in pure water (0 for all the parameters considered here except pH, for which it is 7).

  3. The overall water quality index is calculated by linearly combining the qn and Wn:

    $$\mathrm{WQI}=\frac{\Sigma{\mathrm q}_n{\mathrm W}_n}{\Sigma{\mathrm W}_n}$$
    (5)
Table 5 Unit weights assigned to each parameter
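
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the study's calculation: the index is taken as the weighted mean of the subindices, the ideal value Vo is assumed to be 7 for pH and 0 otherwise, the constant K defaults to 1 (it cancels in the final ratio), and the sample values are hypothetical.

```python
def wa_wqi(observed, standard, ideal=None, k=1.0):
    """Weighted arithmetic WQI: sum(q_n * W_n) / sum(W_n)."""
    ideal = ideal or {}
    weighted_sum = 0.0
    weight_total = 0.0
    for name, v_n in observed.items():
        s_n = standard[name]
        v_o = ideal.get(name, 0.0)                # assumed ideal value
        w_n = k / s_n                             # Eq. (3)
        q_n = (v_n - v_o) / (s_n - v_o) * 100.0   # Eq. (4)
        weighted_sum += q_n * w_n
        weight_total += w_n
    return weighted_sum / weight_total


# Hypothetical sample; units are mg/l except pH.
observed = {"pH": 7.75, "TDS": 250.0, "F": 0.5}
standard = {"pH": 8.5, "TDS": 500.0, "F": 1.0}
ideal = {"pH": 7.0}

print(wa_wqi(observed, standard, ideal))
```

Each parameter in this example sits halfway between its ideal and standard value, so every subindex is 50 and the index is 50 regardless of the weights. Parameters with small standard values, such as fluoride, receive the largest unit weights and therefore dominate the index for real samples.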

GWQI

This method of determining the WQI is easy and trustworthy. GWQI is among the most extensively employed indices to assess GWQ for drinking purposes throughout the world [33]. GWQI classifies the water quality into five classes [Excellent (< 50), Good (50–100), Poor (100–200), Very Poor (200–300), Unsuitable (> 300)]. It is one of the most widely adopted approaches; Agarwal et al. [37] have employed GWQI for evaluating the quality of water. The following five steps were carried out to find GWQI:

  1. Assigning weightage (wi): To calculate the GWQI, eight parameters were selected: pH, EC, TH, TDS, alkalinity, Cl, F, and turbidity. As stated in Table 5, parameters were assigned a weightage (wi) on a scale of 1 to 5 depending on their relative significance to GWQ.

  2. Calculation of relative weights (Wi): The following expression was used to calculate Wi for each parameter. The Wi of each parameter is given in Tables 6, 7, and 8, which show the calculation of WA WQI, GWQI, and CCME WQI for Ward 1, respectively.

    $$W_i=\frac{{\mathrm w}_i}{\sum_{i=1}^n{\mathrm w}_i}$$
    (6)
  3. Calculating quality rating scale (Qi): It represents the ratio of the parameter’s actual value to its standard value, expressed as a percentage.

    $$Q_i=\frac{C_i}{S_i}\times100$$
    (7)

    where Ci is the ith parameter’s actual value and Si is the ith parameter’s standard value.

  4. Subindex (SIi): It is calculated for an individual parameter and is given by the following equation:

    $${\mathrm{SI}}_{\mathrm{i}}={\mathrm{W}}_{\mathrm{i}}\times {\mathrm{Q}}_{\mathrm{i}}$$
    (8)
  5. Calculation of GWQI: The overall groundwater quality index is calculated by summing the subindices of all parameters.

    $$\mathrm{GWQI}=\sum {\mathrm{SI}}_{\mathrm{i}}$$
    (9)
Table 6 Calculation of WA WQI for Ward No. 1
Table 7 Calculation of GWQI for Ward No. 1
Table 8 Calculation of CCME WQI for Ward No. 1
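
The five GWQI steps (Eqs. 6 to 9) can be sketched as below. The weightages and sample values are hypothetical placeholders chosen for easy arithmetic; they are not the values of Table 5 or of any ward in the study.

```python
def gwqi(observed, standard, weights):
    """GWQI = sum(W_i * Q_i), with W_i = w_i / sum(w_i), Q_i = C_i / S_i * 100."""
    weight_total = sum(weights[name] for name in observed)
    index = 0.0
    for name, c_i in observed.items():
        relative_weight = weights[name] / weight_total   # Eq. (6)
        quality_rating = c_i / standard[name] * 100.0    # Eq. (7)
        index += relative_weight * quality_rating        # Eqs. (8) and (9)
    return index


# Hypothetical weightages (scale 1 to 5) and sample values in mg/l.
weights = {"TDS": 5, "Cl": 3, "TH": 2}
standard = {"TDS": 500.0, "Cl": 250.0, "TH": 300.0}
observed = {"TDS": 1000.0, "Cl": 250.0, "TH": 150.0}

print(round(gwqi(observed, standard, weights), 2))
```

Here the quality ratings are 200, 100, and 50 with relative weights 0.5, 0.3, and 0.2, giving an index of 140, which falls in the Poor class (100–200). Because the relative weights sum to 1, a sample that sits exactly at every standard value scores 100.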

The CCME WQI

The CCME WQI is designed around several water uses, such as drinking, recreation, irrigation, livestock watering, and aquatic life. The CCME WQI is a standardized approach for assessing water quality that was developed by Canadian authorities. A committee constituted inside the CCME designed the WQI. The index value is graded on a scale of 0 to 100, with 0 representing the poorest and 100 the best water quality. CCME WQI divides the quality of water into five classes [Excellent (95–100), Good (80–94), Fair (65–79), Marginal (45–64), Poor (0–44)]. Wagh et al. [38] have adopted CCME WQI to determine the quality status of water. The CCME WQI mathematical formula is shown below [32].

$$\mathrm{CCME\;WQI}=100-\left[\frac{\sqrt{F_1^{2}+F_2^{2}+F_3^{2}}}{1.732}\right]$$
(10)

where F1 (scope) is the percentage of variables that do not meet their objectives (failed variables), F2 (frequency) is the percentage of individual tests that do not meet their objectives (failed tests), and F3 (amplitude) is the amount by which the failed test values do not meet their objectives.

  1. Calculating scope value (F1):

    $$F_1=\frac{\mathrm{No}.\;\mathrm{of}\;\mathrm{failed}\;\mathrm{variables}}{\mathrm{No}.\;\mathrm{of}\;\mathrm{variables}}\times100$$
    (11)
  2. Calculating frequency value (F2):

    $$F_2=\frac{\mathrm{No}.\;\mathrm{of}\;\mathrm{failed}\;\mathrm{tests}}{\mathrm{No}.\;\mathrm{of}\;\mathrm{tests}}\times100$$
    (12)
  3. Calculating amplitude value (F3):

    $$F_3=\frac{\mathrm{nse}}{0.01\;\mathrm{nse}+0.01}$$
    (13)
$$\mathrm{nse}=\frac{\sum_{i=1}^n\mathrm{excursion}}{\mathrm{Total}\;\mathrm{no}.\;\mathrm{of}\;\mathrm{test}}$$
(14)
$$\mathrm{Excursion}=\frac{\mathrm{Failed}\;\mathrm{test}\;\mathrm{value}}{\mathrm{Objective}}-1$$
(15)
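
The CCME WQI steps can be sketched as follows. Several points are assumptions rather than details confirmed by this study: every objective is treated as an upper limit (a range objective such as pH would need separate handling), the excursion is taken as the ratio of the failed value to the objective minus 1, following the standard CCME definition, and the sample values are hypothetical.

```python
import math


def ccme_wqi(tests):
    """CCME WQI from a list of (variable, value, upper_limit) tests."""
    variables = {name for name, _, _ in tests}
    failed = [(name, value, limit) for name, value, limit in tests
              if value > limit]
    failed_variables = {name for name, _, _ in failed}

    f1 = len(failed_variables) / len(variables) * 100.0   # scope, Eq. (11)
    f2 = len(failed) / len(tests) * 100.0                 # frequency, Eq. (12)
    # Assumed excursion definition: failed value / objective - 1.
    excursions = [value / limit - 1.0 for _, value, limit in failed]
    nse = sum(excursions) / len(tests)                    # Eq. (14)
    f3 = nse / (0.01 * nse + 0.01)                        # amplitude, Eq. (13)
    return 100.0 - math.sqrt(f1 ** 2 + f2 ** 2 + f3 ** 2) / 1.732  # Eq. (10)


# Hypothetical single sample with four tested variables (mg/l).
sample = [
    ("TDS", 1000.0, 500.0),   # fails: twice the limit
    ("Cl", 100.0, 250.0),
    ("TH", 200.0, 300.0),
    ("F", 0.5, 1.0),
]
print(round(ccme_wqi(sample), 2))
```

With one failure out of four tests, F1 and F2 are both 25 and F3 is 20, which gives an index of about 76.5 (Fair). When each variable is tested only once per location, F1 and F2 are always equal.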

GWQ Analysis Using MSA

MSA

Simeonov et al. [39] suggest that MSA is the most effective method for avoiding misinterpretation of large amounts of complex pollution monitoring data. Correlation analysis, PCA, and CA were performed to evaluate spatial variability and discover pollution sources. In this study, these methods were used on 54 samples for eight variables. To eliminate misclassification due to dimensionality differences, all 8 variables were standardized by computing their standard scores (z-scores) as follows:

$${Z}_{i}=\frac{{X}_{i}-\overline{X}}{\mathrm{SD}}$$
(16)

where Zi is the standard score of the ith observation of a variable, Xi is its actual value, \(\overline{X}\) is the mean value of the variable, and \(\mathrm{SD}\) is the standard deviation of the variable [40].
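
Equation (16) can be illustrated with the standard library. Whether the study used the sample or the population standard deviation is not stated; the sketch assumes the sample standard deviation, and the input values are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize one variable: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(x - mean) / sd for x in values]


# Hypothetical measurements of one parameter at three locations.
print(z_scores([2.0, 4.0, 6.0]))
```

After standardization every variable has zero mean and unit standard deviation, so parameters measured on very different scales (for example EC in µmho/cm and fluoride in mg/l) contribute comparably to PCA and CA.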

Correlation Analysis

Correlation analysis is a popular and helpful statistical technique for evaluating the strength of a relationship between two variables. In this study, the correlation coefficients (r) of the variables were employed to determine the correlation between them. The correlation coefficients of all the 8 variables were determined using Karl Pearson’s correlation matrix and are presented in Table 9. The value of r can be positive or negative, ranging from − 1 to 1 [41]. As seen in Table 9, a few correlation coefficients are positive, indicating that the two variables change in the same direction, and a few are negative, indicating that they change in opposite directions; + 1 denotes a perfect positive relationship, − 1 a perfect negative relationship, and 0 no relationship between the correlated variables.

Table 9 Karl Pearson’s correlation matrix between the parameters
Table 10 Summary of parameters based on the BIS 10500:2012 permissible limit [34, 53]
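
For reference, Pearson's correlation coefficient can be computed as in the sketch below. The data are hypothetical: the TDS series is constructed as an exact multiple of the EC series (the factor 0.64 is an arbitrary illustrative choice) to show the limiting case r = 1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical EC values (µmho/cm) and a TDS series proportional to them.
ec = [200.0, 850.0, 1500.0, 4300.0]
tds = [0.64 * value for value in ec]

print(round(pearson_r(ec, tds), 4))
print(round(pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]), 4))
```

The first pair is perfectly linearly related and gives r = 1; the second moves in exactly opposite directions and gives r = −1.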

PCA

PCA describes the variation of a vast data set of variables by compressing them into a reduced data set of independent variables [42]. PCA decreases the dimensionality of data by creating new hidden variables that are perpendicular and uncorrelated to each other by combining original data in a linear manner [43]. From the covariance matrix of given variables, it derives the eigenvalues and eigenvectors [44]. The PCs’ eigenvalues represent their associated variance, whereas the loadings represent the given variable contributions to the PCs [45]. The correlation matrix determines how well each constituent’s variance may be described by their relation with one another [46].
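
As a small worked illustration (not part of the study's calculations), consider PCA on the correlation matrix of only two standardized variables with Pearson correlation r. The eigenvalues and eigenvectors then have a closed form:

$$R=\begin{pmatrix}1&r\\ r&1\end{pmatrix},\qquad \lambda_{1,2}=1\pm r,\qquad v_{1,2}=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ \pm 1\end{pmatrix}$$

The eigenvalues sum to 2, the number of variables, so the first PC explains a fraction (1 + r)/2 of the total variance. For r = 0.9, for example, PC1 carries 95% of the variance and PC2 only 5%, which is why strongly correlated parameters tend to load together on a single component.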

CA

CA is a renowned classification tool that aims to determine either the distance or similarity between the variables to be grouped. It is commonly shown with the help of a dendrogram, which is a two-dimensional graph that displays a clear pictorial description of the process [47]. The distance between the parameters of samples is studied using hierarchical cluster analysis (HCA). The points that are the most similar are joined together to form a cluster, and this is continued till all the points fit into the same cluster [48]. In this study, Ward’s method, using squared Euclidean distances, is employed.
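
The agglomeration step can be sketched with a naive implementation of Ward's criterion, in which the pair of clusters whose merger gives the smallest increase in within-cluster sum of squares is joined at each step. This is a simplified illustration, not the Minitab procedure used in the study; the function name `ward_clusters` and the points are hypothetical, and the loop stops at a chosen number of clusters instead of building a full dendrogram.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's minimum-variance criterion."""
    # Each cluster is (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                sq_dist = sum((u - v) ** 2 for u, v in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * u + nb * v) / (na + nb) for u, v in zip(ca, cb)]
        merged = (ma + mb, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical standardized observations (two variables per location).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(ward_clusters(points, 2))
```

The two nearby pairs are merged first, leaving two clusters that correspond to the two groups of locations. Running the loop down to a single cluster and recording the merge costs would give the heights plotted in a dendrogram.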

Results

Groundwater Quality Parameters

As shown in Fig. 3a–h, groundwater quality mapping was implemented using IDW in ArcGIS 10.8 for the eight physicochemical parameters. The several parameters taken into account in the study are explained in the following lines:

Fig. 3 Spatial distribution map of water quality parameters. (a) pH. (b) EC. (c) Hardness. (d) TDS. (e) Alkalinity. (f) Chloride. (g) Fluoride. (h) Turbidity

pH

The pH indicates the concentration of hydrogen ions in water and represents its acidic or basic nature. The pH of pure water is neutral. For drinking purposes, the pH must be between 6.5 and 8.5 (BIS, 2012). The pH in the collected groundwater samples varies from 6.8 (minimum) to 8.2 (maximum), indicating that it is well within the permissible range (6.5 to 8.5) as shown in Fig. 3a.

EC

The amount of dissolved material in a medium determines the EC; the more dissolved material in a medium, the higher the EC. High EC values can be caused by a variety of geochemical processes, such as reverse and direct ion exchange, significant evaporation, silicate degradation, and rock–water interaction [49]. However, for drinking purposes, EC should not exceed 300 µmho/cm. The EC in this study ranges from 195 to 8673 µmho/cm as shown in Fig. 3b.

Hardness

It refers to the quantity of calcium and magnesium ions dissolved in the water. The total hardness of water is a vital parameter in the household sector. The maximum hardness that can be tolerated is 300 mg/l. In the study area, hardness ranges from 148 to 990 mg/l as shown in Fig. 3c.

TDS

The majority of TDS is made up of inorganic salts and a small quantity of organic molecules dissolved in water. High TDS levels in water can alter the taste and hardness of the water. On the other hand, water with exceptionally low TDS has a bland taste [8]. The acceptable TDS for drinking water, according to the BIS, is less than 500 mg/l. In the study area, TDS lies in the range of 125 to 5604 mg/l as shown in Fig. 3d.

Alkalinity

It is caused by the presence of carbonate, bicarbonate, and hydroxide ions in the water. Water has a better capacity to neutralize acids when it has a higher alkalinity, and vice versa. Alkalinity and bicarbonate are associated in neutral water [50]. It should not exceed 200 mg/l. Beyond this limit, the taste of water becomes unpleasant. For the study area, alkalinity ranges from 200 to 532 mg/l, which can be seen in Fig. 3e.

Chloride

The higher the chloride concentration in water, the more dangerous it is to human health. The associated cation influences the taste threshold of the chloride ion in water. Geogenic or anthropogenic processes seem to be responsible for the increasing Cl concentrations in groundwater [51]. Chloride levels should not exceed 250 mg/l. The amount of chloride in this study varies from 38 to 1320 mg/l as shown in Fig. 3f.

Fluoride

It is primarily found in water as a result of geological processes. Fluoride in high concentrations (> 1.0 mg/l) can cause skeletal fluorosis. The concentration of fluoride in the study ranges from 0.4 to 1.7 mg/l, as shown in Fig. 3g.

Turbidity

It describes the foggy appearance of water caused by particles, often known as “suspended matter.” Drinkable water that is turbid loses its aesthetic appeal. Turbidity can have a variety of appearances and colors [52]. For drinking water, the maximum turbidity allowed is 5 NTU. The turbidity in this study area ranges from 0.1 to 5.9 NTU as shown in Fig. 3h.

Integrated Approach of Different Water Quality Indices with GIS

NPI

NPI was determined for the 54 samples for the provided water quality parameters. It also takes into account the effects of a number of variables that influence water quality. A pollution index for individual parameters was determined for each ward. The summation of the pollution indexes of all parameters in each ward gives the NPI value (Fig. 4). Table 3 shows the NPI calculation for Ward 1. The obtained NPI range is 4.88 to 49.91, i.e., from excellent to unsuitable. Figure 4 depicts the variation of NPI throughout the wards. The categorization of water quality was divided into five categories depending on the water quality rating given in Table 4. As per the NPI ratings, out of the 54 samples, 31 fall into the excellent category, 19 fall into the good category, 2 fall into the poor category, 1 in the very poor category, and 1 in the unsuitable category. The percentage wise distribution of samples shows that “Excellent Water” is found in 57% of samples, “Good Water” is 35%, “Poor Water” is 4%, “Very Poor Water” is 2%, and “Unsuitable Water” is 2%, which can be seen in Fig. 6a. To create a water quality index map in ArcGIS, NPI values were interpolated over the entire study region. The developed map can be seen in Fig. 5. The map clearly indicates that wards 34 and 17 lie in very poor to unsuitable categories.

Fig. 4 Plot showing NPI for each ward

Fig. 5 NPI map for Ujjain City

Fig. 6 Percentage wise status of water quality based on (a) NPI, (b) WA WQI, (c) GWQI, and (d) CCME WQI
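
The classification and the percentage-wise distribution reported for NPI can be reproduced with a short sketch. The class limits and sample counts are those given in the text; the treatment of values that fall exactly on a class boundary (assigned here to the worse class) is an assumption, since the published ranges share their end points.

```python
NPI_CLASSES = [
    (10.0, "Excellent"),
    (20.0, "Good"),
    (30.0, "Poor"),
    (40.0, "Very Poor"),
]


def classify_npi(value):
    """Return the NPI class; boundary values go to the worse class (assumed)."""
    for upper_limit, label in NPI_CLASSES:
        if value < upper_limit:
            return label
    return "Unsuitable"


def percentage_distribution(counts):
    """Convert class counts to whole-number percentages of all samples."""
    total = sum(counts.values())
    return {label: round(n / total * 100) for label, n in counts.items()}


# Sample counts per class as reported for NPI (54 samples in total).
counts = {"Excellent": 31, "Good": 19, "Poor": 2,
          "Very Poor": 1, "Unsuitable": 1}
shares = percentage_distribution(counts)

print(classify_npi(4.88), classify_npi(49.91))
print(shares)
```

The lowest and highest reported NPI values (4.88 and 49.91) fall in the Excellent and Unsuitable classes, and the counts give 57%, 35%, 4%, 2%, and 2%, matching the distribution stated above.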

The WA WQI

The WA WQI reflects the combined influence of several parameters on water quality rather than that of any single parameter (Fig. 7). The unit weights (Wn) of all parameters are given in Table 5. The WQI calculation for Ward 1 is shown in Table 6. The obtained WQI range is 23.7 to 90, i.e., from excellent to very poor categories. Figure 7 depicts the variation of WA WQI throughout the wards. The water quality ratings in Table 4 for WA WQI were used to classify the entire area. According to the results of the weighted arithmetic method, 2 samples fall into the excellent category, 29 fall into the good category, 21 fall into the poor category, and 2 into the very poor category. The percentage wise distribution of samples reveals that 4% of samples are Excellent Water, 53% are Good Water, 39% are Poor Water, and the remaining 4% are Very Poor Water, which can be seen in Fig. 6b. The IDW tool in ArcGIS was used to interpolate spatial data based on location coordinates and quality parameters. Figure 8 depicts the WA WQI map interpolation across the study area. The map clearly indicates that wards 39 and 45 lie in the very poor category.

Fig. 7 Plot showing WQI for each ward

Fig. 8 WA WQI map for Ujjain City

GWQI

GWQI was determined for the 54 samples for the provided water quality parameters. An overall GWQI was calculated by combining all of the subindices for each parameter. The Wi of all individual parameters is specified in Table 5. Table 7 shows the GWQI calculation for Ward 1. The obtained GWQI range is 60.04 to 934.08, i.e., from good to unsuitable categories. Figure 9 depicts the variation of GWQI throughout the wards. The water quality ratings in Table 4 for GWQI were used to classify the entire area. According to the results, 14 samples are in the good category, 25 are in the poor category, 9 fall into the very poor category, and 6 fall into the unsuitable category. The percentage wise distribution of samples shows that 26% of samples are Good water, 46% are Poor water, 17% are Very poor water, and the remaining 11% are Unsuitable water, which can be seen in Fig. 6c. The IDW tool in ArcGIS was used to interpolate spatial data based on location coordinates and quality parameters. Figure 10 depicts the GWQI map interpolation across the study area. The map clearly indicates that wards 16, 17, 18, 34, 37, and 43 lie in the unsuitable category.

Fig. 9 Plot showing GWQI for each ward

Fig. 10 GWQI map for Ujjain City

The CCME WQI

The CCME WQI was calculated for all samples gathered from 54 wards of Ujjain City. Table 8 shows the CCME WQI calculation for Ward 1. The obtained CCME WQI range is 8.52 to 89.43, i.e., from good to poor categories. Figure 11 depicts the variation of CCME WQI throughout the wards. The categorization of CCME WQI was done into five categories as per the rating given in Table 4. CCME WQI revealed that out of 54 samples, 3 samples fall into the good category, 19 fall into the fair category, 21 fall into the marginal category, and 11 into the poor category. The percentage wise distribution reveals that just 6% of samples were in the good category. Thirty-five percent of the samples were of fair quality. Similarly, 39% of samples were marginal, while 20% of samples were found to be poor, which can be seen in Fig. 6d. To create a water quality index map in ArcGIS, CCME WQI values were interpolated over the entire study region. The developed map can be seen in Fig. 12. The map clearly indicates that wards 1, 7, 16, 17, 18, 27, 34, 37, 39, 43, and 52 lie in the poor category.

Fig. 11 Plot showing CCME WQI for each ward

Fig. 12 CCME WQI map for Ujjain City

MSA

MSA such as correlation analysis, PCA, and CA were used to evaluate variations and interpret a large complicated GWQ data set. MSA allows for the extraction of hidden information from a data set concerning the environment’s potential effects on water quality. In this study, these methods were used on 54 samples for 8 variables. MSA was performed on a physicochemical data matrix. Minitab statistical software was used to accomplish the statistical analysis.

Correlation Analysis

Table 9 shows the developed Karl Pearson’s correlation matrix using MS Excel 2016 for the eight analyzed physicochemical parameters. Among these, the most positive correlation (0.9996) was found between EC and TDS of water samples, which supports the fact that conductivity measurement is commonly used to estimate TDS, while the most negative correlation (− 0.2784) was found between pH and chloride. According to the results, strong positive correlations are observed between EC, TDS, and chloride; TDS and chloride; and alkalinity and hardness. Except for fluoride, pH is negatively correlated to all the parameters. Apart from alkalinity and fluoride, turbidity has a positive relationship with all other parameters. Chloride is positively correlated to hardness and negatively correlated to fluoride. Hardness is negatively correlated with fluoride. EC, TDS, and alkalinity are the parameters that are positively correlated with other parameters. The most significant correlation is between EC, TDS, and alkalinity, which has a greater impact on the total assessment of groundwater quality than any other parameter.

Table 10 lists the standard permissible limit of each parameter as per BIS 10500:2012. The pH value of drinking water is a measure of the concentration of H+ ions, which defines the water as acidic or basic. All of the samples were found to be within the BIS range of 6.5 to 8.5, and the majority of them were slightly alkaline in nature. The turbidity ranged from 0.1 to 5.9 NTU. The turbidity of all samples is below the BIS permissible level of 10 NTU. From an aesthetic standpoint, turbid water is unappealing. The TDS is calculated using the EC of water. When the EC of water is high, the TDS value is similarly high. The permissible limit is exceeded in 98.15% of samples for EC and in 62.96% of samples for TDS. The highest TDS concentration, 5604 mg/l, was recorded at ward number 17. Total alkalinity lies between 200 mg/l and 532 mg/l. Alkalinity is limited to 200 mg/l by the BIS, and this permissible limit is exceeded in 98.15% of the samples. The main source of alkalinity is dissolved carbon dioxide, which is found in large concentrations in most water sources. The permissible limit for hardness is exceeded in 83.33% of the samples from the study area. The maximum total hardness, 990 mg/l, was recorded at ward number 37. The permissible limits for chloride and fluoride were exceeded in 25.92% and 9.25% of the samples, respectively.
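The exceedance percentages above are simple proportions of samples above a limit. The sketch below illustrates the calculation with hypothetical alkalinity readings; only the 200 mg/l limit and the sample count of 54 come from the text, and treating "exceeded" as strictly greater than the limit is an assumption.

```python
def percent_exceeding(values, limit):
    """Percentage of values strictly above `limit`, rounded to two decimals."""
    exceeding = sum(1 for value in values if value > limit)
    return round(100 * exceeding / len(values), 2)


ALKALINITY_LIMIT = 200.0  # mg/l, limit quoted in the text

# Hypothetical readings: 53 samples above the limit and one exactly at it.
hypothetical_alkalinity = [250.0] * 53 + [200.0]
share = percent_exceeding(hypothetical_alkalinity, ALKALINITY_LIMIT)
print(share)  # 98.15
```

Under these assumptions, 53 of 54 samples above the limit reproduces the 98.15% figure quoted for alkalinity.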

PCA

In the present study, PCA was conducted on the correlation matrix of the water samples, which comprises 8 physico-chemical parameters. It is used to identify the individual loadings of the eight variables on water quality. Eigenvalues are commonly used to obtain principal components (PCs). An eigenvalue measures the amount of variance captured by its component. The most significant eigenvalues are those greater than 1. PCs with eigenvalues of less than 1 were excluded due to their low significance [54]. As seen in the scree plot (Fig. 13), the first three factors have eigenvalues greater than 1. After the third eigenvalue, the slope of the plot flattens. As a result, only the first three components were retained, together explaining 72.5% of the total variance. Table 11 shows the percentage variance, eigenvalues, and loadings of the three PCs. In addition, these loadings were classified as strong (> 0.75), moderate (0.75–0.50), and weak (0.50–0.30) [55]. PC1 featured positive loadings for turbidity, EC, TDS, alkalinity, chloride, and hardness and negative loadings for pH and fluoride, accounting for 42.2% of the overall variance. This also suggests that these factors were tightly associated with one another, as evidenced by the correlation matrix, and that they dominate the influence on water quality. Because of the dominance of solids in the groundwater, the first principal component, PC1, is referred to as the “solid component.” Water quality was influenced the most by EC, TDS, and chloride in PC1. PC2 explains about 17% of the total variance. Except for pH and hardness, all other variables are positively loaded, with turbidity playing a significant role. PC3 had positive coefficients for turbidity, chloride, and hardness but was negatively loaded with pH, EC, TDS, alkalinity, and fluoride, accounting for only 13.3% of the total variance. PC3 could be attributed to the dissolution of solids, which results in water turbidity and contamination by soluble chloride salts.
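The retention rule and the loading classes can be checked with a short sketch. For a PCA on a correlation matrix the eigenvalues sum to the number of variables, so each reported variance share implies an eigenvalue. The values computed below are derived from the rounded percentages in the text, not copied from Table 11, and the "negligible" label for loadings below 0.30 is an assumption introduced here.

```python
N_VARIABLES = 8  # parameters entering the PCA, as stated in the text

# Percentage of variance reported for the three retained components.
variance_percent = {"PC1": 42.2, "PC2": 17.0, "PC3": 13.3}


def implied_eigenvalue(percent, n_variables=N_VARIABLES):
    """Eigenvalue implied by a variance share in a correlation-matrix PCA."""
    return percent / 100.0 * n_variables


def loading_class(loading):
    """Classify a loading by magnitude using the thresholds quoted in the text."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # assumed label; the text does not name this range


eigenvalues = {pc: implied_eigenvalue(p) for pc, p in variance_percent.items()}
retained = [pc for pc, value in eigenvalues.items() if value > 1.0]
cumulative_percent = round(sum(variance_percent.values()), 1)
```

The implied eigenvalues are roughly 3.38, 1.36, and 1.06, all above 1, which is consistent with retaining three components that together explain 72.5% of the variance.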

Fig. 13 Scree plot of principal component analysis

Table 11 Component loading and eigenvalues from PCA

Cluster Analysis

Cluster analysis was used to group the water samples from the sampling locations based on their chemical and physical characteristics. All variables were standardized for hierarchical cluster analysis (HCA), and dendrograms were generated using Ward’s technique with Euclidean distances. Using eight variables (pH, EC, TH, TDS, alkalinity, Cl, F, and turbidity), HCA was applied to the 54 sampling locations. The results are depicted in Fig. 14 as dendrograms. Figure 14a shows the dendrogram of the parameter cluster analysis. Here, Cluster 1 includes two parameters: pH and fluoride. Cluster 2 covers four parameters: turbidity, EC, TDS, and chloride. Cluster 3 includes alkalinity and hardness.
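The clustering procedure described above (standardize, then merge with Ward's criterion on Euclidean distances) can be sketched without any statistics package. This is an illustrative re-implementation, not the Minitab routine used in the study; the six stations and their two variables are hypothetical, and the population standard deviation is assumed for the z-scores.

```python
import math


def standardize(rows):
    """Z-score each column so every variable has mean 0 and unit variance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    size_a, size_b = len(a["members"]), len(b["members"])
    squared_distance = sum(
        (p - q) ** 2 for p, q in zip(a["centroid"], b["centroid"])
    )
    return size_a * size_b / (size_a + size_b) * squared_distance


def ward_clusters(rows, n_clusters):
    """Merge the cheapest pair (Ward's criterion) until n_clusters remain."""
    clusters = [{"members": [i], "centroid": list(r)} for i, r in enumerate(rows)]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: ward_cost(clusters[p[0]], clusters[p[1]]))
        a, b = clusters[i], clusters[j]
        size_a, size_b = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [
                (size_a * p + size_b * q) / (size_a + size_b)
                for p, q in zip(a["centroid"], b["centroid"])
            ],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]


# Hypothetical stations described by (EC, hardness); not the study's data.
stations = [
    [500.0, 200.0], [520.0, 210.0],
    [3000.0, 900.0], [3100.0, 950.0],
    [1500.0, 500.0], [1550.0, 520.0],
]
groups = ward_clusters(standardize(stations), n_clusters=3)
```

Cutting the hierarchy at three clusters separates these stations into low, intermediate, and high mineralization pairs, analogous to the three zones identified in the study.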

Fig. 14 (a) Cluster analysis of 8 physicochemical parameters. (b) Cluster analysis of 54 sampling locations

Figure 14b shows the dendrogram of the cluster analysis of sampling locations. CA groups the 54 sampling locations into three clusters. Cluster 1 consists of 23 sampling locations, which are grouped into two subgroups (1, 4, 6, 8, 10, 16, 2, 3, 5, 12, 13, 15, 18) and (7, 9, 11, 14, 17, 20, 19, 22, 23, 26). Because 7 samples are of poor quality and 16 are of good quality, this cluster is classified as a moderately polluted zone. Cluster 2 is made up of 14 sampling locations that are divided into two subgroups (21, 25, 27, 30, 31, 33, 36, 41, 40, 42) and (34, 43, 39, 45). Because 12 samples are of poor quality and two samples are of very poor quality, this cluster is referred to as a highly polluted zone. Cluster 3 is formed of 17 sampling stations divided into two subgroups (24, 28, 29, 32, 35, 37, 38) and (44, 51, 46, 47, 48, 49, 50, 52, 54, 53). This cluster is classified as a low-pollution zone. It contains 13 stations with good quality (24, 28, 29, 32, 35, 37, 38, 46, 47, 48, 49, 50, 53), two stations with excellent quality (44, 51), and two stations with poor quality (52, 54).

Discussion

The results revealed that the quality of groundwater varies considerably depending on the location. After evaluating NPI, WA WQI, GWQI, and CCME WQI, it was determined that the majority of the water samples were safe for drinking. Because numerous physicochemical properties of the samples are below acceptable limits as per BIS 10500:2012, the WQI value is likewise lower, indicating that the water is fit for human consumption. Spatial distribution maps of the WQI were also developed with the help of ArcGIS 10.8 software (Figs. 5, 8, 10, and 12), showing which parts of the study area have groundwater fit for drinking. All four indexing methods and WQI maps reveal that samples from wards 20, 44, and 47 are in the excellent-to-good category, whereas samples from wards 17, 34, and 43 are in the poor-to-very poor category. As a result, the groundwater samples collected from wards 17, 34, and 43 are unfit for direct human consumption and must be treated before being consumed. Numerous researchers have applied and compared various WQI methods, and the findings of this study regarding the relative responses of these four WQI methods are in good accordance with some other publications [47, 56, 57].

As shown in Fig. 3a–h, thematic maps of all groundwater quality parameters have been created. The pH distribution pattern shows the presence of alkaline groundwater except in the central part (Fig. 3a). The EC reaches 8673 µmhos/cm in tiny spots in the central east (Fig. 3b). Due to inadequate flushing and severely weathered rock formations, a tiny patch in the city’s central east exhibits a TDS of 5604 mg/l in groundwater (Fig. 3d). This is in accordance with the higher EC there (strong positive relation between EC and TDS). The alkalinity map shows that alkalinity is concentrated largely in the middle portion of the city, with a few higher values in the northern and southern parts (Fig. 3e). The alkalinity and hardness of groundwater are found to be positively correlated. This is reflected in the hardness map (Fig. 3c), which shows that the study area has hard groundwater. Chloride levels are highest in the central east, at 1320 mg/l, where they are clearly visible as a small patch (Fig. 3f). The correlation matrix shows that EC, TDS, and chloride are positively correlated, which is evident in their spatial distribution maps. Fluoride (F) is an essential parameter that occurs in isolated patches. Its maximum concentration of 1.7 mg/l is found in the central part of the area (Fig. 3g). Keshavarzi et al. [58] state that fluoride in high concentrations (> 3.0 mg/l) can cause skeletal fluorosis. The turbidity map illustrates that the highest values are found in the eastern portion of the city, while the lowest values are seen in the western and central parts (Fig. 3h).

Positive values of the correlation coefficient indicate a direct relationship between variables, whereas negative values indicate an inverse relationship. Following the correlation analysis of Tirkey et al. [59], 0.9 ≤ R² ≤ 1 is considered strong, 0.5 ≤ R² < 0.9 is moderate, and R² < 0.5 is poor. The obtained results show a strong correlation between EC and TDS (0.9996) and moderate correlations between EC and chloride (0.7067), TDS and chloride (0.7028), and alkalinity and hardness (0.6015). The key contributors to water quality degradation are EC, alkalinity, hardness, and TDS, as per Table 10, which summarizes the parameters against the BIS 10500:2012 permissible limits. The high concentration of TDS is due to an increase in salts containing carbonates, bicarbonates, sulfates, calcium, chloride, sodium, potassium, and other ions [60]. Because these parameters can reduce water clarity, decrease photosynthesis, and induce gastrointestinal irritation in people, it is necessary to treat the water before it is consumed. However, pH, which has a direct impact on water taste, was found to be within acceptable limits. Drinking water with a high chloride concentration has a salty taste, making it unfit for consumption. Infants and children may be harmed if they drink chloride-rich water [20]. Fluorosis in the teeth and bones can be caused by an overabundance of fluoride in the water. However, the concentrations of Cl, F, and turbidity are found to be within acceptable limits in most samples, indicating that they have little impact on human consumption.
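The strength classes can be applied mechanically to the reported coefficients. The sketch below reads the moderate band as 0.5 up to (but not including) 0.9 and applies the thresholds to the magnitude of each coefficient, which is how the text uses them; both readings are assumptions made for this illustration.

```python
def correlation_strength(coefficient):
    """Label a coefficient as strong, moderate, or poor by its magnitude."""
    magnitude = abs(coefficient)
    if magnitude >= 0.9:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    return "poor"


# Coefficients quoted in the text for selected parameter pairs.
reported = {
    ("EC", "TDS"): 0.9996,
    ("EC", "chloride"): 0.7067,
    ("TDS", "chloride"): 0.7028,
    ("alkalinity", "hardness"): 0.6015,
    ("pH", "chloride"): -0.2784,
}
labels = {pair: correlation_strength(r) for pair, r in reported.items()}
```

Only the EC and TDS pair falls in the strong class; the chloride and hardness pairs are moderate, and the pH and chloride pair is poor.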

According to the PCA, the variables were correlated to three PCs that together account for 72.5% of the overall variation in the groundwater samples. Positive loadings for EC, TDS, hardness, and Cl were found in the first component (PC1), which contributed 42.2% of the total variation. This can be attributed to the natural water source and is referred to as “water hardness salinity” [39]. Dissolved chloride salts load positively on PC1. PC2 explains about 17% of the total variance, with turbidity, EC, TDS, chloride, and fluoride all having a positive influence. Turbidity shows the most significant loading on PC2. The physical characteristics of water, such as cloudiness, could be represented by PC2 [52]. PC3 displayed substantial positive coefficients (turbidity, chloride, and hardness) but was negatively loaded with pH, EC, TDS, alkalinity, and fluoride, accounting for just 13.3% of the total variance. PC3 may be attributed to chloride solids dissolving, which causes water to become more turbid. Figure 14a shows the dendrogram of the cluster analysis of the 8 physicochemical parameters, which are clustered into three groups with similar water quality characteristics. The CA also allowed three clusters to be established among the sampling locations, indicating differences in water quality at various locations. According to the dendrograms, the study area was classified into three main groups: low, moderately, and highly polluted zones, based on the correlation between physicochemical parameters and sampling locations. CA supports the results of the correlation matrix.

In the present study, only four indexing methods were used to assess groundwater quality. This is owing to the availability of only eight physicochemical parameters. If more parameters are available, more indexing methods can be used. Furthermore, the current work covers only the summer of 2020. Future work can address the monsoon and post-monsoon periods to give a comparative result. The assessment of groundwater quality is of utmost importance as groundwater is the primary source of drinking water in the study area. Therefore, a competent management plan must be enacted before its quality deteriorates.

Conclusion

The study used 54 groundwater samples taken during the summer of 2020 from 54 wards of Ujjain City, Madhya Pradesh, India, for water quality assessment using an integrated approach of different water quality indexes with GIS and multivariate statistical analysis. WQI categorizes water based on various parameters, culminating in a composite unit that may be used to express the quality of water as a single numeric value. The indexing methods give the following results: NPI places 92% of samples in the excellent-to-good category, WA WQI places 57% of samples in the excellent-to-good category, GWQI places 72% of samples in the good-to-poor category, and CCME WQI places 80% of samples in the good-to-marginal category. According to the outcome of all the indexing methods, the majority of the water is in good condition and thus suitable for drinking, with the exception of a few locations that require treatment. Table 4 reveals the classification of WQI values for each ward, calculated using different indexing methods. GIS aids in the conversion of point data into spatial data, allowing for the classification of areas with excellent and poor water quality. Groundwater quality mapping was implemented using IDW in ArcGIS 10.8 for the eight physicochemical parameters shown in Fig. 3a–h. The WQI spatial distribution maps (Figs. 5, 8, 10, and 12) clearly depict the locations with the best drinking water quality. According to BIS 10500:2012 standards, physicochemical parameters such as EC, TDS, and TH are above permissible limits. Within the study area, correlation analysis reveals a strong positive correlation between EC and TDS. PCA was used to determine three components in the water quality data set generated in this study. The first component appears to be associated with the presence of solid components, the second component appears to be related to the impact of water cloudiness due to turbidity, and the third component appears to be associated with the changes in water quality due to the presence of soluble chloride salts. CA groups the 8 physicochemical parameters into 3 clusters of similar water quality characteristics, as seen in Fig. 14a, and the 54 sampling locations into 3 clusters of low, medium, and high polluted zones, as seen in Fig. 14b. The PCA and CA supported the results of the correlation matrix. The study confirms that multivariate statistical analysis techniques, including correlation analysis, PCA, and CA, are effective in evaluating spatial variability and identifying contamination sources in the studied area. Moreover, an integrated approach of different water quality indexes using GIS reveals the wards that all indexing methods agree are fit or unfit for human consumption. According to the findings, samples from wards 20, 44, and 47 fall consistently in the excellent-to-good category, whereas samples from wards 17, 34, and 43 fall consistently in the poor-to-very poor category. It is therefore recommended that monitoring and management be prioritized in order to safeguard the groundwater resource from pollution and to provide technologies that make groundwater suitable for drinking purposes.