1 Introduction

The increasing demand for water entails serious chemical and biological pollution problems, whose correction requires costly control systems. In this regard, it is estimated that diarrheal diseases account for a share of the total global disease burden and cause 1.5 million deaths each year (https://www.who.int/water_sanitation_health/diseases-risks/en/). According to estimates, 58% of this disease burden, that is, 842,000 deaths per year, is due to the absence of safe water and to poor sanitation and hygiene, and includes 361,000 deaths of children under five, most of them in low-income countries (https://www.who.int/news-room/fact-sheets/detail/drinking-water). Given these figures, it is important to establish mechanisms and actions aimed at conserving and protecting the quality of continental and underground waters from the effects of pollution sources and climate change (Whitehead et al. 2009; Vörösmarty et al. 2000; Murdoch et al. 2000; Delpla et al. 2009; Wilby et al. 2006) in order to maintain the balance of aquatic ecosystems.

The monitoring of water quality in natural bodies is carried out in each country by national environmental authorities, which observe, evaluate and report so that decision-makers can take the necessary actions. Public institutions thus monitor the water quality of natural resources for various purposes, using the criteria and methodologies established in water quality monitoring protocols.

Currently, there are several monitoring protocols with similar technical content, which establish the appropriate procedures and criteria for taking water samples from industrial discharges (mining, cement industries, brewing, paper, etc.), domestic discharges or natural water bodies (continental and marine) (Kanu and Achi 2011; Tüfekci et al. 2007; Bengraïne and Marhaba 2003). They also present the criteria for the selection of parameters, monitoring points, frequency of monitoring, analysis methodology, flow measurement and procedures for quality assurance.

The water quality index (WQI) is a tool for classifying the water quality of a surface or underground body at a given time. In general, the WQI incorporates data from multiple physical, chemical and biological parameters into a mathematical equation, through which the state of a body of water is evaluated (Cude 2001; Pesce and Wunderlin 2000; Poonam et al. 2013; Fernández et al. 2004; Ocampo-Duque et al. 2013). The WQI supports a general analysis of water quality at different levels and helps determine the vulnerability of the water body to potential threats. It has emerged as an alternative for the evaluation of water bodies that makes the formulation of public policies and the monitoring of impacts more effective.

Historically, organizations in different countries involved in the control of aquatic resources, and especially in the control and monitoring of drinking water, have used different methodologies to calculate indices that quantify water quality (Horton 1965; Brown et al. 1970; Štambuk-Giljanović 1999; Tyagi et al. 2013). Various scientists and experts have proposed modifications to the WQI concept. However, a general water quality index (WQI) is based on the most common factors and is built in the following five steps:

1. Establish and select a number of factors or parameters that have a strong correlation with water quality, i.e., physical parameters, chemical parameters, composition and biological factors that imply contamination.

2. Apply transformations to standardize each of the previous factors on a common scale of measurement.

3. Decide on the relative weight of each of the factors in the incidence of the quality of the resource.

4. Establish an algebraic function for the calculation of a single factor that determines the quality of the resource.

5. Establish a qualitative scale based on the values of the calculated index.

Despite this, a water quality index can be defined very subjectively. Greater or lesser weights may be assigned to different variables depending on particular choices; the index may be suitable for assessing one kind of water use without reflecting the overall dynamics of the system; or it may be limited in space and time, which could give misleading readings at specific points or times. If two water quality indices that use the same risk criteria for each measured variable, but different logical-mathematical expressions, yield different scenarios, then at least one of them may not reflect the dynamics of the analyzed ecosystem. This is a serious problem for communicating the results of water quality estimates. Regardless of the methods used, and knowing that there is no universal water quality index, many methodologies could converge on common scenarios. One purpose of this article is to assess how representative two such types of water quality index are when evaluating water for human consumption. Finally, this article reviews the relevant methodologies for calculating water quality indices as an input for the actions that could be taken by the agencies and institutions responsible for decision-making, and discusses the suitability of water for human consumption based on compliance with drinking water standards.

2 Study Area

This study covers the whole department of Santander, located in the northeastern part of Colombia. Santander is one of the thirty-two departments of the country, and its capital is Bucaramanga. It is located in the Andean region and borders the Cesar and Norte de Santander departments on the north, Boyacá on the southeast, Antioquia on the west, and Bolivar on the northwest. Its population was 2,060,000 inhabitants in 2015, making it the sixth largest department by population. The area, which comprises approximately 30,537 km², lies between latitudes 5° 41′ 0.8988″ N and 8° 8′ 38.292″ N and longitudes 74° 32′ 24.0756″ W and 72° 28′ 10.83″ W (Fig. 1).

Fig. 1 (a) Location of the Department of Santander in Colombia. (b) Distribution of the 87 municipalities of the Department of Santander in which samples were collected for the study

The climate of Santander is shaped by its range of altitudes: the land is distributed across warm, temperate and moorland (páramo) altitudinal climate zones. In the lower Magdalena valley, average temperatures are around 29 °C and rainfall is abundant, reaching up to 3800 mm per year. On the flank of the mountain range, the temperature decreases and the average annual precipitation is between 1500 and 2000 mm. The exception is the south, especially the Chicamocha Canyon, where precipitation is less than 500 mm and temperatures are high, reaching up to 32 °C. The moorland area records temperatures below 7 °C and low precipitation.

The hydrographic network of Santander (Guzmán 2016; Fernández-Méndez et al. 2013) is composed of numerous rivers, streams and creeks, including the Magdalena, Cararé, Lebrija, Opón and Sogamoso rivers (the latter formed by the confluence of the Chicamocha and Suarez rivers), as well as the Cachira, Chucurí, Ermitaño, Fonce, Guaca, Guayabito, Horta, La Colorada, Nevado, Onzaga, Paturia, San Juan and Servita. There are also several marshes near the Magdalena River; the most notable are Colorada, Doncella, El Llanito, Opon, Paredes, Rabon, Redonda, San Silvestre and Yariguíes.

3 Water Quality Index Calculation

The National Institute of Health (INS: Instituto Nacional de Salud), in compliance with Decree 1575 of 2007 and its regulatory resolutions, which establish the system for the protection and control of water for human consumption, developed the application “Information System of Surveillance of Water Quality for Human Consumption-SIVICAP”. This application allows all departmental health authorities in the country to report water quality monitoring data obtained from their inspection, surveillance and control activities.

The SIVICAP WEB system allows the online reporting of water quality information and includes the calculation of the IRCA (for its acronym in Spanish: Índice de Riesgo de Calidad del Agua, or Water Quality Risk Index), IRABA (for its acronym in Spanish: Índice de Riesgo Municipal por Abastecimiento de Agua, or Municipal Risk Index for Water Supply), BPS (for its acronym in Spanish: Buenas Prácticas Sanitarias, or Good Sanitary Practices) and RISK MAP indicators. It also allows the information generated and updated to be shared more efficiently with the direct and indirect users of the sector.

Technically, the IRCA is defined as the degree of risk of disease occurrence related to non-compliance with the physical, chemical and microbiological characteristics of water for human consumption. The indicator results from assigning the risk scores listed in Table 2 to the characteristics that fail to comply with the established acceptable values. When the resulting score is between 0% and 5%, the water distributed is suitable for human consumption and is rated at the No-Risk level. When the IRCA is between 5.1% and 14%, the water is rated at the Low risk level and is no longer suitable for human consumption; between 14.1% and 35%, it is rated at the Medium risk level and is not suitable for human consumption; between 35.1% and 80%, the risk level is High; and between 80.1% and 100%, the water distributed is sanitarily unviable. When the monthly IRCA indicates that water is not suitable for human consumption, the sanitary authority orders a series of actions for its improvement, the most drastic applying when the water is sanitarily unviable.
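The risk bands described above can be written as a small lookup. The following is a minimal Python sketch, not part of SIVICAP: the function names are hypothetical, and it assumes that a score falling between two stated bands (for example 5.05%) is assigned to the higher band.

```python
def classify_irca(irca_percent):
    """Return the risk level for an IRCA score given in percent (0-100)."""
    if not 0.0 <= irca_percent <= 100.0:
        raise ValueError("IRCA must lie between 0 and 100 %")
    # Upper bound of each band, as stated in the text.
    bands = [
        (5.0, "no risk"),
        (14.0, "low"),
        (35.0, "medium"),
        (80.0, "high"),
        (100.0, "sanitarily unviable"),
    ]
    for upper_bound, level in bands:
        if irca_percent <= upper_bound:
            return level


def is_suitable_for_consumption(irca_percent):
    """Only the no-risk band is described as suitable for human consumption."""
    return classify_irca(irca_percent) == "no risk"
```

For instance, `classify_irca(12.0)` falls in the low-risk band, which the text already treats as unsuitable for human consumption.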

IRABA, on the other hand, is the weighting of two factors: (1) Treatment and continuity of the service of the aqueduct systems; and (2) Water distribution in the area of the corresponding municipality, which may indirectly affect the quality of water for human consumption and therefore, human health. This index aims to associate the risk to human health caused by the supply systems and establish the respective levels of risk, both at the level of the Provider (IRABApp) and at the level of municipality (IRABAm), where it takes into account the sum of the risk indices of all the suppliers for every municipality.

According to SIVICAP and the INS (Nava et al. 2011) and the National Decree (MADS 2007), the risk index of water quality for human consumption (IRCA) is calculated with the following formulas:

$$ IRCA\left(\%\right)=\frac{\sum_j{A}_j}{\sum_j{N}_j}\times 100\% $$
(1)

where \( A_j \) is the risk score assigned to the j-th unacceptable characteristic and \( N_j \) is the risk score assigned to the j-th characteristic analyzed. The monthly average is

$$ IRCA{\left(\%\right)}_{prom}=\frac{\sum_j{B}_j}{N} $$
(2)

where \( B_j \) is the IRCA obtained for the j-th sample taken in the month, and \( N \) is the total number of samples taken in the month.
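Eqs. (1) and (2) amount to a ratio of sums followed by a plain average. The sketch below illustrates this in Python under stated assumptions: the function names are hypothetical, and the risk scores in the example are placeholders, not the values of Table 2.

```python
def irca_sample(risk_scores, unacceptable):
    """Eq. (1): percentage of the analyzed risk score carried by failing characteristics.

    risk_scores  -- dict mapping each analyzed characteristic to its risk score (N_j)
    unacceptable -- set of characteristics that exceeded their acceptable value
    """
    total = sum(risk_scores.values())
    if total == 0:
        raise ValueError("no characteristics were analyzed")
    failed = sum(score for name, score in risk_scores.items() if name in unacceptable)
    return 100.0 * failed / total


def irca_monthly(sample_ircas):
    """Eq. (2): arithmetic mean of the per-sample IRCA values of one month."""
    if not sample_ircas:
        raise ValueError("at least one sample is required")
    return sum(sample_ircas) / len(sample_ircas)


# Placeholder scores for illustration only (not taken from Table 2).
example_scores = {"turbidity": 15.0, "pH": 5.0, "E. coli": 30.0}
example_irca = irca_sample(example_scores, unacceptable={"E. coli"})
```

Because the denominator only includes the characteristics actually analyzed, a sample with few measured parameters can reach a high IRCA from a single failure.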

In this sense, an acceptable characteristic (or acceptable measurement variable) is one whose measured value is below the maximum acceptable value according to INS criteria. Conversely, an unacceptable characteristic is one whose value is above the maximum acceptable value. These criteria are shown in Table 1.

Table 1 Maximum acceptable values for each of the physical, chemical and biological characteristics used to calculate the IRCA (MADS 2007)

For the calculation of the IRCA, the risk score is assigned to each physical, chemical and microbiological characteristic, for non-compliance with the established acceptable values (see Table 2).

Table 2 Risk score for the calculation of the IRCA (Nava et al. 2011; MADS 2007). Sum of assigned scores is equal to 100

The value of the IRCA is zero (0) when the sample complies with the acceptable values for each of the physical, chemical and microbiological characteristics contemplated in the resolution, and one hundred (100), the highest risk, when none of them is met.

Based on the IRCA per sample and the monthly IRCA, the risk level of the water supplied for human consumption by the provider is classified, and the actions that the competent health authority must carry out are indicated (see Table 3).

Table 3 Classification of the level of health risk according to the IRCA per sample and the monthly IRCA and actions that must be carried out (Nava et al. 2011; MADS 2007)

On the other hand, the IDEAM (for its acronym in Spanish: Instituto de Hidrología, Meteorología y Estudios Ambientales, or Institute of Hydrology, Meteorology and Environmental Studies) adopted the UWQI methodology (Universal Water Quality Index), which was developed and applied in order to obtain a simplified index of the quality of water used for human consumption. For its calculation, an additive equation, or weighted sum, was developed (Lozada et al. 2009), whose structure is presented in Eq. (3):

$$ UWQI={\sum}_{j=1}^n{W}_j{I}_j $$
(3)

where \( W_j \) is the weight or percentage assigned to the j-th parameter and \( I_j \) is the sub-index of the j-th parameter. The IDEAM (2015) adopted six basic variables for determining the index in bodies of water: one state variable (dissolved oxygen) and five pressure variables (chemical oxygen demand, COD; electrical conductivity, EC; total suspended solids, TSS; pH; and the total nitrogen to total phosphorus ratio).

The values of the IDEAM index lie on a scale from zero to one, divided into five categories: very bad, between 0.00 and 0.25 (shown in red); bad, between 0.26 and 0.50 (orange); regular, between 0.51 and 0.70 (yellow); acceptable, between 0.71 and 0.90 (green); and good, between 0.91 and 1.00 (blue).
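The weighted sum of Eq. (3) and the five categories can be illustrated with a short Python sketch. The source does not list the IDEAM weights, so equal weights are used here purely as an assumption; the function names, the sub-index values and the key labels are likewise hypothetical.

```python
def uwqi(sub_indices, weights):
    """Eq. (3): weighted sum of sub-indices, each expected on a 0-1 scale."""
    if sub_indices.keys() != weights.keys():
        raise ValueError("sub-indices and weights must cover the same parameters")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return sum(weights[name] * sub_indices[name] for name in weights)


def ideam_category(value):
    """Map an index value on the 0-1 scale to the five categories in the text."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("index must lie between 0 and 1")
    for upper_bound, label in [(0.25, "very bad"), (0.50, "bad"),
                               (0.70, "regular"), (0.90, "acceptable")]:
        if value <= upper_bound:
            return label
    return "good"


# Hypothetical sub-index values for the six variables named in the text.
example_sub_indices = {"DO": 0.8, "COD": 0.6, "EC": 0.7,
                       "TSS": 0.9, "pH": 1.0, "TN/TP": 0.8}
# Assumption: equal weights, since the actual weights are not given here.
equal_weights = {name: 1.0 / len(example_sub_indices) for name in example_sub_indices}
example_index = uwqi(example_sub_indices, equal_weights)
```

With these illustrative inputs the index is the plain mean of the six sub-indices; any other weighting that sums to one can be passed in the same way.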

In this study, the WQI, calculated using the weighted arithmetic index method (Brown et al. 1970), is used to determine the effect of waste dumping on the ground and surface water bodies in the immediate vicinity of the dumpsite, as it is deemed the most appropriate method under the prevailing conditions.

The WQI is given as:

$$ WQI=\frac{\sum_j{q}_j{w}_j}{\sum_j{w}_j} $$
(4)

where \( q_j \) is the quality rating (sub-index) of the j-th water quality parameter and \( w_j \) is its unit weight (the \( w_j \) sum to one). The rating \( q_j \), which relates the value of the parameter in polluted water to the standard permissible value, is obtained as follows:

$$ {q}_j=\frac{v_j-{v}_{lim}}{s_j-{v}_{lim}}\times 100\% $$
(5)

where \( v_j \) is the estimated value of the j-th parameter, \( v_{lim} \) is its ideal value and \( s_j \) is its standard permissible value.

In most cases \( v_{lim} = 0 \), except for pH, for which \( v_{lim} = 7 \), and DO, for which \( v_{lim} = 14.6 \) mg/L. The unit weight \( w_j \), which is inversely proportional to the recommended standard value, is obtained as:

$$ {w}_j=\frac{k}{s_j} $$
(6)

where \( k={\left({\sum}_{j=1}^n1/{s}_j\right)}^{-1} \).
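Eqs. (4) to (6) can be combined into a few lines of Python. This is a minimal sketch rather than the script used in the study: the function names are hypothetical, and the standards and measured values in the example are placeholders, not the limits of Table 1.

```python
def unit_weights(standards):
    """Eq. (6): w_j = k / s_j with k = 1 / sum(1 / s_j), so the weights sum to one."""
    k = 1.0 / sum(1.0 / s for s in standards.values())
    return {name: k / s for name, s in standards.items()}


def quality_rating(value, standard, ideal=0.0):
    """Eq. (5): q_j = (v_j - v_lim) / (s_j - v_lim) * 100."""
    return 100.0 * (value - ideal) / (standard - ideal)


def wqi(values, standards, ideals=None):
    """Eq. (4): weighted arithmetic mean of the quality ratings."""
    ideals = ideals or {}
    weights = unit_weights(standards)
    ratings = {
        name: quality_rating(values[name], standards[name], ideals.get(name, 0.0))
        for name in standards
    }
    return sum(ratings[n] * weights[n] for n in standards) / sum(weights.values())


# Placeholder standards and measurements, for illustration only.
example_standards = {"turbidity": 2.0, "nitrites": 0.1, "pH": 8.5}
example_values = {"turbidity": 1.0, "nitrites": 0.05, "pH": 7.75}
example_wqi = wqi(example_values, example_standards, ideals={"pH": 7.0})
```

Two properties follow directly from the equations: parameters with small permissible values (such as nitrites here) dominate the index, and Eq. (5) as written gives a negative rating when a measurement lies on the opposite side of the ideal value from the standard (for example, pH below 7 with an upper limit). The sketch reproduces the equations and does not correct for that.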

The rating of the water quality using the above method is shown in Table 4 (Brown et al. 1972; Chatterjee and Raziuddin 2002).

Table 4 Rating of Water Quality for various WQIs (Brown et al. 1972; Chatterjee and Raziuddin 2002)

4 Data, Methods and Parameter Selection

4.1 Data Available

The data used come from the Open Data website of the Ministry of Information and Communication Technologies (www.datos.gov.co) and were supplied by the National Institute of Health. The data for the year 2015 were selected for the study. The original data consist of 1019 records with 47 fields (columns) covering 40 measured characteristics or parameters. From this dataset, 14 parameters were selected: apparent color, turbidity, pH, residual chlorine, alkalinity, calcium, magnesium, total hardness, sulfates, iron, chlorides, nitrites, total coliforms and Escherichia coli.

In the filtering process, 87 records were kept, one for each municipality of the department. In addition, a subset of features was selected because a large amount of data was missing for parameters that had not been measured or had not been measured correctly. The parameters selected for the water quality index in this study are those with the greatest weight and importance in the development and sensitivity analysis of the global drinking water index (Drinking 2007). These parameters were carefully selected to reflect the main acceptability and health problems related to water quality. Factors such as the level of detection and the overall ability of researchers and stakeholders to measure the parameters accurately in most parts of the world were also considered.

4.2 Water Quality Index Calculation

For the original data set, the drinking water quality index was calculated by the National Ministry of Health from Eqs. (1) and (2). However, in order to make comparisons with the WQI in its international standard form, Eqs. (4), (5) and (6) were also used. It was first necessary to filter and clean the data, because many values were missing for some of the variables considered. The variables with the greatest number of complete results were therefore selected and the data set was homogenized. For these data, 14 physical, chemical and biological parameters were selected for the study: apparent color, turbidity, pH, residual chlorine, alkalinity, calcium, magnesium, total hardness, sulfates, iron, chlorides, nitrites, total coliforms and Escherichia coli.
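The selection of the most complete variables can be sketched in plain Python. The source does not state the completeness criterion that was applied, so the 0.7 threshold below is an assumption, as are the function names and the toy records.

```python
def completeness(records, field):
    """Fraction of records in which the given field has a measured value."""
    return sum(r.get(field) is not None for r in records) / len(records)


def select_complete_fields(records, fields, min_fraction=0.7):
    """Keep the fields measured in at least min_fraction of the records."""
    return [f for f in fields if completeness(records, f) >= min_fraction]


def drop_incomplete_records(records, fields):
    """Keep only the records that have a value for every selected field."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]


# Toy records for illustration; None marks a missing measurement.
toy_records = [
    {"pH": 7.1, "turbidity": 1.2, "iron": None},
    {"pH": 6.9, "turbidity": None, "iron": None},
    {"pH": 7.4, "turbidity": 0.8, "iron": 0.1},
    {"pH": 7.0, "turbidity": 1.5, "iron": None},
]
kept_fields = select_complete_fields(toy_records, ["pH", "turbidity", "iron"])
clean_records = drop_incomplete_records(toy_records, kept_fields)
```

Filtering fields first and records second keeps more rows than dropping every record with any missing value, which matters when some parameters are rarely measured.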

Likewise, the maximum values allowed for each parameter measured in the water samples were taken into account. These values are defined in the Colombian standard (MADS 2007) and are listed in Table 1.

4.3 Analysis of Water Samples

To evaluate the quality of surface water, eighty-seven sampling stations were selected based on the position and geolocation of each of the municipalities of Santander, considering areas of probable impact and specific effluent discharge points. In each municipality, several samples were taken at different geographical points. The number of samples with valid measurement results that were taken into account is shown in Fig. 2. The standard analytical procedures recommended by the Colombian standard, in the manual of basic analytical methods for the analysis of water for human consumption, were used in the present study.

Fig. 2 Number of samples tested at each location in each municipality

The data are part of a citizen science strategy in which the corresponding authorities supply institutions and organized civil society with the materials, technical elements and measuring instruments needed to collect and analyze water samples. The data analyzed in this study correspond to those collected in the first quarter of 2015. Each institution and group of citizen scientists takes samples and measures their characteristics following a procedures manual designed for this purpose. Samples and measurement results are sent through a digital platform to central processing and to the central laboratories of the health institution, where a preliminary analysis of the condition and quality of the collected data determines which samples meet the requirements of the procedures manual. Data that do not meet the quality criteria are discarded, and those that meet the standards defined in the manual are kept. Data that do not include the minimum set of measured parameters needed to determine the water quality index are also discarded. Health authorities define the geographic location of the sample collection points, so all participants must collect samples at those points; samples that do not meet this requirement are likewise discarded. Although data are collected at different geographic points, the health authority has defined the collection sites so that the distance between points does not exceed 1 km and all points lie on the same tributary. It can therefore be assumed that geographical dispersion is not a source of error in the calculation of the water quality indices.

At this point, the homogeneity and regularity of the collected data are ensured, so the information obtained represents the sample collection area regardless of the exact positions at which the measurements were made. Thus, before calculating the water quality index, each measured characteristic is averaged over all samples from the same collection area, and the index is calculated from these averages according to the methodology of Eqs. (1) and (2).
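The averaging step can be sketched as a simple group-and-mean operation. The example below is a plain Python illustration that assumes samples are grouped by municipality; the record layout, field names and numeric values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_municipality(samples, parameters):
    """Average each parameter over all samples taken in the same municipality."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["municipality"]].append(sample)
    return {
        municipality: {p: mean(row[p] for row in rows) for p in parameters}
        for municipality, rows in grouped.items()
    }


# Hypothetical measurements, for illustration only.
toy_samples = [
    {"municipality": "Bucaramanga", "turbidity": 1.0, "pH": 7.0},
    {"municipality": "Bucaramanga", "turbidity": 3.0, "pH": 7.5},
    {"municipality": "Albania", "turbidity": 4.0, "pH": 6.5},
]
station_means = average_by_municipality(toy_samples, ["turbidity", "pH"])
```

The resulting per-municipality means are the inputs on which an index would then be evaluated. Note that averaging before applying a threshold can hide individual samples that exceed a limit.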

4.4 Multivariate Analysis Methods

Multivariate factor analysis is a class of multivariate statistical methods whose main purpose is to expose the underlying structure of a data matrix. Analyzing the structure of the interrelations among a large number of variables does not require any distinction between dependent and independent variables. From this information, one can calculate a set of latent dimensions, known as factors, that seek to explain these interrelationships. It is therefore a data reduction technique: the information contained in the data matrix can be expressed without much distortion in a smaller number of dimensions represented by these factors.

In order to evaluate the significance of differences between sites for all the water quality variables, the data were analyzed through analysis of variance. The multivariate analysis of the water quality data sets was done through hierarchical cluster analysis (HCA) and principal component analysis (PCA) (Jolliffe 2011). The objective of clustering is to divide the objects into homogeneous groups so that the similarities within each group are large compared with the similarities between groups. The principal components, on the other hand, are extracted to represent the patterns that encode the highest variance in the data set, not to maximize the separation between groups of samples directly. The statistical package used was R version 3.4.4 (2018-03-15) (R Core Team 2018; Bunn 2008, 2010), for both the HCA and the PCA.
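As an illustration of the analysis of variance step, the sketch below computes the one-way F statistic for a single variable measured at several sites. It is written in plain Python rather than in R, so it is not the code used in the study; the function name and the toy data are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups -- list of lists, one list of measurements per site
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the site means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the measurements around their own site mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy measurements of one parameter at three sites, for illustration only.
toy_sites = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_b, df_w = one_way_anova_f(toy_sites)
```

A large F value means that the differences between site means are large relative to the scatter within sites. Turning F into a p-value requires the F distribution, which is not part of the Python standard library and is left out of this sketch.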

The risk maps of the study area and the data reduction were produced with a script written entirely in Python 3.6, using the Pandas 0.24.0 (McKinney et al. 2010), Matplotlib 3.0.2 (Barrett et al. 2005) and Scipy 1.0.2 (Oliphant 2007) libraries. The water table contour map was constructed with the Basemap 1.0.8 extension (Whitaker 2017), taking into consideration the geographical location of all samples.

5 Results and Discussion

As mentioned earlier, a WQI is a valuable tool for summarizing the general state of water quality in a single value, which helps in selecting an appropriate treatment technique and in informing authorities and decision-makers about the actions needed to address drinking water problems. In this study, the WQI was calculated for the 87 municipalities of the Department of Santander, with between 3 and 45 samples per station.

The graphs in Fig. 3 show the variation of the WQI across the sampling stations using both the conventional methodology, which is the official one for the Colombian State, and the international standard methodology for calculating the WQI. As expected, most sites are within acceptable limits under either index.

Fig. 3 Water quality index variation in the two methodologies applied

Figs. 3 and 4 show that, during the study period, most of the stations reported water that was either low risk or risk free (≈ 57%) or of excellent or good quality (≈ 73%), depending on the methodology used for the quality index. However, according to the index computed with the Colombian standard methodology, few or no samples were classified as high risk or sanitarily unviable, whereas with the international standard method a little more than 25% of the stations reported poor or sanitarily unviable water quality, with about 16% of these unsuitable for human consumption. According to the method used to calculate water quality in this study, the most affected stations, with water quality ranging from poor to unfit, were stations 1–4, 11, 14, 28–30, 34, 37, 58, 63–64, 68, 73, 75, 79, 81, 84 and 85. One implication of this result is that the methodology used by the local agencies in charge of monitoring water quality in Colombia may be underestimating high risk levels.

Fig. 4 Comparison of criteria and levels of water quality index variation in the two methodologies applied

Table 5 presents the average values of the physical and chemical parameters measured at the different sites. The analysis of this table shows that there is a highly significant difference (p ≤ 0.001) among the sites when considering all the physical and chemical parameters.

Table 5 Results of analysis of variance showing the effect of stations on physico-chemical parameters

In addition, comparing Table 5 with the maximum permissible values in Table 1 shows that some of the characteristics have third quartiles above these maximum permissible values. Indeed, for the measured biological parameters at least 25% of the samples have values above the maximum allowable, and, alarmingly, the total coliform content exceeds 400 CFU/100 mL. The content of E. coli also exceeds the maximum allowable in 50% of the samples.
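The comparison between third quartiles and permissible limits can be reproduced with the Python standard library. In the sketch below the limits and measurements are placeholders rather than the values of Tables 1 and 5, the function names are hypothetical, and the "inclusive" quartile method is an assumption, since the source does not say how quartiles were computed.

```python
from statistics import quantiles


def third_quartile(values):
    """Third quartile (75th percentile) using the 'inclusive' method."""
    return quantiles(values, n=4, method="inclusive")[2]


def parameters_exceeding(measurements, limits):
    """Return {parameter: Q3} for parameters whose third quartile exceeds the limit."""
    exceeding = {}
    for parameter, limit in limits.items():
        q3 = third_quartile(measurements[parameter])
        if q3 > limit:
            exceeding[parameter] = q3
    return exceeding


# Placeholder limits and measurements, for illustration only.
toy_limits = {"E. coli": 0.0, "turbidity": 2.0}
toy_measurements = {
    "E. coli": [0, 0, 0, 10, 20],
    "turbidity": [0.5, 1.0, 1.2, 1.5, 1.8],
}
flagged = parameters_exceeding(toy_measurements, toy_limits)
```

A third quartile above the limit means that at least a quarter of the samples fail that parameter, which is the reading applied to the biological parameters in the text.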

The results of the principal component analysis reveal that the first two principal components together explain 46% of the variability of the physicochemical parameters across all stations.
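The share of variability explained by the leading components can be illustrated without external libraries. The sketch below is not the R analysis used in the study: it assumes that the variables are standardized (PCA on the correlation matrix, which the source does not state), extracts eigenvalues by power iteration with deflation, and uses hypothetical function names and toy data.

```python
import math


def correlation_matrix(rows):
    """Correlation matrix of the columns of a list of equally long rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector of a symmetric matrix (power iteration)."""
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v


def explained_variance_ratios(rows, components=2):
    """Fraction of total variance explained by each of the leading components."""
    m = correlation_matrix(rows)
    p = len(m)
    total = sum(m[i][i] for i in range(p))
    ratios = []
    for _ in range(components):
        eigenvalue, v = leading_eigenpair(m)
        ratios.append(eigenvalue / total)
        # Deflation: remove the component just found before searching for the next.
        m = [[m[a][b] - eigenvalue * v[a] * v[b] for b in range(p)] for a in range(p)]
    return ratios


# Toy data: the first two columns are perfectly correlated, the third is unrelated.
toy_rows = [(1, 2, 1), (2, 4, -1), (3, 6, 0), (4, 8, -1), (5, 10, 1)]
toy_ratios = explained_variance_ratios(toy_rows)
```

Summing the first two ratios gives the kind of cumulative figure quoted in the text; in practice a linear algebra library would be used instead of power iteration.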

The analysis of the correlation between the physicochemical parameters and their contributions to these components (Fig. 5) shows that the biological parameters total coliforms and E. coli, the physical variables apparent color and turbidity, and the chemical variable nitrites are positively correlated with dimensions 1 and 2. Likewise, the content of sulfates, magnesium and calcium, as well as alkalinity and total hardness, are strongly and positively correlated with dimension 1 and negatively correlated with dimension 2.

Fig. 5 Projection of physical and chemical parameters on the factorial planes determined for principal component analysis

The projection of the different groups of samples (Fig. 6) onto the principal component dimensions shows a clearly identified group corresponding to sampling sites 2, 35, 69, 76 and 80. This group, identified as cluster 3, has very high values of the biological parameters total coliforms and E. coli, the physical variables apparent color and turbidity, and chemical variables such as nitrites, and low values of the remaining variables. Similarly, a second group, identified as cluster 2, has high values of sulfate, magnesium and calcium content, as well as alkalinity and hardness, and low values of the other parameters.

Fig. 6 Projection of stations and groups on the factorial plane determined for principal component analysis

In the current data set, there are 6 factors (Tables 6 and 7) that can be designated as intermediate composites. The first intermediate composite includes alkalinity (with a weight of 0.20), calcium (weight of 0.17), magnesium (weight of 0.21) and hardness (weight of 0.21). The second intermediate composite includes apparent color (weight of 0.28), turbidity (weight of 0.24) and nitrites (weight of 0.13). The third includes the biological variables coliforms (with a weight of 0.19) and E. coli (with a weight of 0.26). The last three intermediate composites include pH (weight of 0.33), residual chlorine (weight of 0.44), iron (weight of 0.19), sulfates (weight of 0.31) and chlorides (weight of 0.44). The six intermediate composites are then aggregated with weights assigned in proportion to the variance explained. After the weights are determined, the final aggregation can be performed using a hybrid of the weighted arithmetic and weighted harmonic means (Panda et al. 2016; Esty et al. 2005; Mohd Ali et al. 2013; Tripathi and Singal 2019). In this case, the weighted harmonic mean of the individual factor loadings and the weighted arithmetic mean are calculated to obtain the final results. This methodology could suggest how to modify the core weights of the risk scores used to calculate the IRCA (Table 2).
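The two means involved in this aggregation can be illustrated as follows. This is a minimal Python sketch with hypothetical function names: the weights are the factor-1 values quoted above, the sub-index scores are invented for illustration, and the exact way the two means are combined into a hybrid is not specified in the text, so it is not reproduced here.

```python
def scale_to_unit_sum(values):
    """Rescale positive values (e.g. squared loadings) so that they sum to one."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


def weighted_arithmetic_mean(scores, weights):
    """Sum of weight * score divided by the sum of the weights."""
    return sum(weights[n] * scores[n] for n in weights) / sum(weights.values())


def weighted_harmonic_mean(scores, weights):
    """Sum of the weights divided by the sum of weight / score (scores must be > 0)."""
    return sum(weights.values()) / sum(weights[n] / scores[n] for n in weights)


# Factor-1 weights quoted in the text, rescaled within the factor.
factor1_weights = scale_to_unit_sum(
    {"alkalinity": 0.20, "calcium": 0.17, "magnesium": 0.21, "hardness": 0.21}
)
# Hypothetical sub-index scores on a 0-100 scale, for illustration only.
factor1_scores = {"alkalinity": 80.0, "calcium": 60.0, "magnesium": 90.0, "hardness": 40.0}
factor1_arithmetic = weighted_arithmetic_mean(factor1_scores, factor1_weights)
factor1_harmonic = weighted_harmonic_mean(factor1_scores, factor1_weights)
```

The harmonic mean never exceeds the arithmetic mean and is pulled down more strongly by a single poor score, which is one reason the two are sometimes combined when a bad parameter should not be masked by good ones.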

Table 6 Squared Factor loadings of water quality data based on 6 PCs
Table 7 Squared Factor loadings (scaled to unity sum)

The water table contour maps for the samples and the calculated water quality index were constructed using a cubic interpolation method with the help of GIS software (Surfer) and the Basemap tool for Python. The equally spaced contours delimit areas with a higher water quality index, which reflects very poor water quality. Moreover, Fig. 7 shows that the areas with the worst water quality are located towards the center of the Department of Santander, which is dominated by high altitudes and quite irregular terrain, and coincide with municipalities with intense oil and mining activity, such as Albania, San Vicente De Chucurí, Sucre, El Playon, Gambita and Guadalupe. Thus, the high index values found in these municipalities could be linked to their commercial activity. In addition, these municipalities are generally small, with no more than 12 thousand inhabitants, and far from the capital, Bucaramanga.

Fig. 7 Heatmap for the WQI evaluation on sample points

6 Conclusions

This study has shown that the WQI is a powerful and simple tool that can be used to determine the impact of different polluting activities on ground and surface waters. We analyzed water quality in the Department of Santander, in the Andean region of Colombia. For this analysis, two different methods for calculating the WQI were compared, showing that the method used locally in Colombia generally reports very good water quality in most of the area compared with the estimate obtained using an international version of the WQI. According to the latter, severely deteriorated conditions were detected during the year at the stations on rivers and urban water supply sources, implying that water quality has been overestimated by the local method. Surface water at these sampling stations was found to be unsuitable for drinking and for other daily uses by the farmers who live there. Most surface water sampling sites were found to have moderately satisfactory or poor water quality, with several exceptions.

This study showed the efficiency of multivariate statistical techniques and the water quality index for analyzing and interpreting a data set in order to evaluate surface waters effectively. It follows that more thorough and rigorous standards and evaluation need to be implemented to preserve the health of the people who depend on the water resource, as well as the resource itself, which is necessary for the sustainability of the population.

Individual evaluation of the water quality of the stations was made possible by examining the surface water quality parameters, which fell into three groups. This clustering allows efficient management of surface water quality. The PCA indicates the importance of the parameters in studies of contamination and water quality evaluation: the most important parameters are in factor 1 and the least important are in factor 2.

This article uses advanced statistical techniques of factor analysis to estimate weights for the development of a new water quality index. Statistical techniques make the final index more objective and therefore reduce bias in its application at different locations. In addition, factor analysis (FA) uses these groups to assign weights to the individual parameters, as well as to the factor “loadings” that represent the extension of principal component analysis (PCA) to FA. In this study, FA was applied to 9 parameters preselected by PCA. From this study it can be concluded that alkalinity, calcium, magnesium and hardness contribute to Factor 1; apparent color, turbidity and nitrites to Factor 2; coliforms and E. coli to Factor 3; and pH, residual chlorine, iron, sulfates and chlorides to the other factors. The results will be used in the future for the development of a new Water Quality Index.