Introduction

Rivers are among the most diverse and threatened ecosystems on Earth (Sabater et al. 2013). They are established reservoirs of biodiversity, provide key services to society, and offer a wide range of aesthetic and economic benefits. As such, almost all great civilizations of the world have flourished along large perennial rivers. However, ever-increasing anthropogenic pressures have substantially altered these ecosystems. Throughout the world, diverse conservation and management strategies for rivers have been formulated and implemented. Evaluation of the actual state or “health” of rivers has become a prerequisite to all such strategies. It is well known that rivers and streams are characterized by high spatial and temporal variation, and conventional analytical water quality assessment methods involving physical and chemical variables have been considered insufficient for them (Li et al. 2010).

In such ecosystems, monitoring and assessment using the resident biota provides both an integrative view of the effects of human influences and a rich variety of signals that can be used to diagnose the causes of degradation (Karr 2006). Biological assessment has thus been recommended for ecological assessment of lotic ecosystems as it is more reliable, is relatively inexpensive, and provides a synergistic and holistic approach (Chutter 1998). Various groups of organisms such as phytoplankton, zooplankton, macroinvertebrates, periphyton, macrophytes, and fish have been recognized as efficient biomonitors (Hering et al. 2006, Resh 2008, Hughes et al. 2012, Wu et al. 2014, Na et al. 2014). However, macroinvertebrates and periphyton are the most recommended groups that can integrate the effects of multiple environmental stressors over time (Stevenson and Pan 1999, Kar and Chu 2000), thereby reflecting the ecological status of aquatic ecosystems. Among the periphytic communities, diatoms have been established as robust biomonitors and have long been used for assessment of environmental conditions in streams and rivers (Stevenson et al. 2010). Numerous reasons why diatoms are used as biomonitoring tools have been listed by Round (1991) and McCormick and Cairns (1994). Since diatoms are known to be responsive to various physical, chemical, and biological factors such as temperature and salinity (Nielsen et al. 2003, Resende et al. 2005, Zong et al. 2010, Li et al. 2014, Schröder et al. 2015, Yung et al. 2017), nutrient concentrations (Kelly 1998), organic enrichment (Coste in Cemagref 1982, Sladecek 1986, Watanabe et al. 1986, Descy and Coste 1991), and herbivory (McCormick and Stevenson 1989), they are considered valuable indicators of environmental changes in aquatic ecosystems. The use of diatoms as ecological indicators has also been extended to ecosystems other than surface water such as soil (Antonelli et al. 2017) and subsurface ecosystems. With reference to water resource management, the importance of interconnections, interactions, and interdependence of surface and groundwater ecosystems has been well established (Kalbus et al. 2006, Dahl et al. 2007, Mencio and Mas-Pla 2008, Fleckenstein et al. 2010). As such, the indicator potential of diatoms for groundwater contamination by surface water has also been explored (Walker et al. 2005).

Ecological assessment of water bodies by diatoms has been accomplished by several approaches, including diversity indices, autecological indices, and predictive models (Stevenson and Smol 2003; Stevenson et al. 2010). The European Union’s Water Framework Directive (WFD, No.60/2000; European Union 2000) recommended evaluation of the ecological status of water bodies by the analyses of the biological elements of these aquatic ecosystems and establishment of reference conditions. Various predictive models have been developed to predict these biological reference conditions (Chessman et al. 1999, Tison et al. 2007, Passy 2009), which are supposed to prevail in the absence or near absence of human disturbance. Predictive models have also been used the other way round, where diatom communities were used for the prediction of environmental variables (Kovacs et al. 2006, Ponader et al. 2007, 2008, Wang et al. 2009). Autecological biotic indices, however, seem to be the most popular approach, as is evident from the significant increase in related research publications in recent years (Rimet 2012).

Several biotic diatom indices for water quality assessment in rivers have been used, most of which were developed in Europe, Japan, and the USA such as DES (Descy 1979), IPS (Cemagref 1982), SLA (Sladecek 1986), WAT (Watanabe et al. 1986), CEE (Descy and Coste 1991), SHE (Schiefele and Schreiner 1991), TDI (Kelly and Whitton 1995), IBD (Lenoir and Coste 1996), EPI-D (Dell’Uomo 1996; Dell’Uomo and Torrisi 2011), Rott. sap (Rott et al. 1997), Rott. trop (Rott et al. 1998), and IDP (Gomez and Lucursi 2001). The majority of these indices are based on the weighted average equation (Zelinka and Marvan 1961) and have shown their efficacy in monitoring eutrophication (Schiefele and Schreiner 1991; Kelly and Whitton 1995; Rott et al. 1998), organic pollution (Sladecek 1986; Descy and Coste 1991; Rott et al. 1997), and salinity (Ryves et al. 2006).
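As an illustration of the weighted average equation of Zelinka and Marvan (1961), an index score for a sample can be sketched as the abundance-weighted mean of taxon sensitivity values, with each taxon additionally weighted by its indicator value. The function name and all numerical values below are hypothetical and are not taken from any published index.

```python
def weighted_average_index(abundances, sensitivities, indicator_values):
    """Weighted average of the Zelinka-Marvan form: sum(a*s*v) / sum(a*v).

    a = abundance of a taxon in the sample,
    s = its sensitivity (pollution/trophic) value,
    v = its indicator value (weight).
    """
    numerator = 0.0
    denominator = 0.0
    for a, s, v in zip(abundances, sensitivities, indicator_values):
        numerator += a * s * v
        denominator += a * v
    if denominator == 0:
        raise ValueError("no taxa with non-zero abundance and indicator value")
    return numerator / denominator


# Hypothetical three-taxon sample (values chosen only for illustration).
abundances = [50.0, 30.0, 20.0]
sensitivities = [5.0, 3.0, 1.0]
indicator_values = [1.0, 2.0, 3.0]
example_score = weighted_average_index(abundances, sensitivities, indicator_values)
```

Taxa without a sensitivity value in a given index simply do not enter the sums, which is why the proportion of taxa recognized by an index affects its reliability.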

The cosmopolitan nature of diatoms along with the available ecological information has allowed the diatom indices to be applied and tested in several regions of the world such as in Malaysia (Maznah and Mansor 2002), Morocco (Fawzi et al. 2002), Turkey (Gurbuz and Kivrak 2002; Kalyoncu et al. 2009a, b), the Himalayas (Juttner et al. 2003), Australia (Newall and Walsh 2005; Dela-Cruz et al. 2006), East Africa (Bellinger et al. 2006), Kenya (Ndiritu et al. 2006), Vietnam (Duong et al. 2006, 2007), Iran (Atazadeh et al. 2007), South Africa (Taylor et al. 2007b; Walsh and Wepener 2009), China (Tan et al. 2013a, b; Yang et al. 2015), and New Zealand (Bere 2015), which are geographically very distant and different from the areas where they were formulated. These index applications indicated strong correlations between physical and chemical parameters and diatom indices (Kwandrans et al. 1998; Taylor et al. 2007b; Tan et al. 2013a, b). In all cases, even though these diatom indices and diatom tolerances were developed and defined in regions (e.g., Europe, USA, Japan) very different from those where they were tested, pollution assessment results have been reported to be good, which suggested the global applicability of these indices (Rimet 2012). There is, however, evidence that diatom indices developed in one geographic area or environment are less successful when applied in other areas (Pipp 2002) and may cause uncertainty in results (Lobo et al. 2015). For example, weak correlations of environmental variables and diatom indices have been observed in Estonian rivers (Vilbaste et al. 2007). This is due to the floristic differences among regions (Taylor et al. 2007a) and the environmental differences that modify species responses to water quality characteristics (Potapova and Charles 2002). The range of environmental variables against which the indicator values of diatom taxa were originally computed may have a significant effect on the performance of indices in other regions (Vilbaste et al. 2007; Tan et al. 2013b).

In India, the use of diatoms in water quality assessment has been largely neglected. Although diatom research in India can be traced back to the nineteenth century with the pioneering work of Ehrenberg (1845), most of the research has been conducted primarily on the taxonomy of diatoms (Karthick and Kociolek 2011). Studies on periphytic diatom assemblages of lotic ecosystems have been mostly confined to the Himalayan region (Nautiyal et al. 1996a; Nautiyal et al. 2000; Nautiyal and Verma 2009) and have focused on the taxonomical aspects of diatoms and their distribution in space and time. Although diatoms have been globally established as successful biomonitors, very few studies in India have linked diatoms with water quality (Juttner et al. 2003, 2010; Venkatachalapathy and Karthikeyan 2012; Nautiyal et al. 2015). There is a severe dearth of data on the ecological preferences and tolerance ranges of diatoms from this region of Asia. Countries with a similar scenario often apply and test the applicability of existing foreign indices. However, apart from the application of the trophic diatom index (TDI) to detect eutrophication in Himalayan streams (Juttner et al. 2003) and the use of van Dam values for the river Mandakini (Nautiyal et al. 2015), foreign diatom indices have rarely been tested in Indian rivers and streams.

India is the second most populated country in the world and comprises approximately 17.5% of the world’s population (Census 2011). Urbanization and massive industrialization coupled with extensive agriculture have led to severe stress on the quality and quantity of water in India. As in most developing countries, mitigation strategies for pollution abatement are difficult to implement in India (Leung et al. 2013). The Central Pollution Control Board (CPCB) along with pollution control boards at the state level has established a nationwide network of water quality monitoring stations. The recorded water quality data take into account chiefly the physical and chemical parameters, while biological monitoring is rarely emphasized (Trivedi et al. 2008). Apart from the fact that biomonitoring integrates and reflects the true ecological status of water bodies, its cost-effectiveness is an important aspect in a developing country like India and hence needs immediate consideration.

The present study was thus undertaken to reiterate the importance of biological monitoring of aquatic ecosystems and to bring the importance of diatoms as a bioassessment tool to the attention of water managers and conservationists in India. The immediate objectives were (1) to prepare a checklist of diatoms and record associated environmental variables from the Chambal, a river of high significance for biological diversity, and (2) to test the applicability and efficacy of several existing foreign diatom indices for water quality evaluation of this river and establish its ecological status.

The river Chambal offers two evident ecological scenarios: (a) the upper segment draining the urban, agricultural, and industrial areas and (b) the lower segment, which flows through the National Chambal Sanctuary (NCS) and is under few anthropogenic impacts. Sampling sites were selected accordingly.

Materials and methods

Study area

The Chambal River is the largest tributary of the Yamuna River in Central India and is thus a part of the greater Gangetic drainage system. It is a 960-km (600 miles)-long perennial river which originates from the summit of Janapav hill of the Vindhyan range at an altitude of 854 m above mean sea level (msl) at 22° 27′ N and 75° 37′ E in Mhow, located in Madhya Pradesh of Central India. It flows through three large states of India, namely Madhya Pradesh (M.P.), Rajasthan, and Uttar Pradesh (U.P.). The Chambal River has three major tributaries, namely the Parbati, Kali Sindh, and Banas rivers. Between 1960 and 1972, four multipurpose dams, namely Gandhi Sagar, Jawahar Sagar, Ranapratap Sagar, and Kota Barrage, were constructed on the upper reaches of the Chambal River, which have affected its flow considerably (Hussain and Choudhury 1992).

The riverine ecosystem of the Chambal River supports a great diversity of species of plants and animals including six critically endangered, 12 endangered, and 18 vulnerable species, as categorized by the IUCN Red List of Threatened Species (Nair and Chaitanya 2013). These include the Gharial (Gavialis gangeticus), red-crowned roofed turtle (Hardella thurjii), and the Gangetic river dolphin (Platanista gangetica gangetica). Acknowledging the rich biodiversity of the region, the Government of India declared a 5400 km² (2100 sq mi) protected area in northern India as the National Chambal Sanctuary (NCS), which is presently administered jointly by the states of Uttar Pradesh, Madhya Pradesh, and Rajasthan (Whitaker 2007). The NCS is a long, narrow eco-reserve which takes in approximately 400 km of the River Chambal, which cuts through mazes of ravines and hills with many sandy beaches. The sanctuary is protected under India’s Wildlife Protection Act of 1972.

The Chambal River is considered to be relatively unpolluted and is one of the last remnant rivers in the greater Ganges River system that has retained significant conservation values (Hussain and Badola 2001). However, over the last few decades, the river has been under severe anthropogenic pressures. Hydrological modifications due to dams and diversion of huge volumes of river water for irrigation have resulted in reduced river flow and erratic water releases leading to inundation of nesting sites of several endangered species (Hussain and Badola 2001). Illegal sand mining and unrestrained fishing have accentuated the perils to conservation. Moreover, the river flows past some major industrial cities with massive human settlements where it undergoes tremendous pressure from the discharges of untreated domestic and industrial waste.

Sampling sites

Twenty-seven sites (S1 to S27) were selected along the Chambal River which offered different ecological scenarios (Fig. 1). Sites S1 to S15 were located in the upper river basin, which is under urban, agricultural, and industrial impacts. Two major cities, Nagda and Kota, are situated along the river in its upper reaches. Nagda is an industrial city with many textile and chemical industries. The city of Kota has a prominent industrial setup and supports a population of more than one million. At many locations, the domestic and industrial effluents are directly discharged into the river, which renders it severely polluted. Sampling sites along these two cities (S1, S2, S3, S13, S14, and S15) are under the impact of severe pollution. The rest of the river in the upper segments flows through small cities and villages where agriculture is practiced in certain areas. Sites located in these parts are under the impact of moderate pollution (S4 to S12).

Fig. 1
A Location of the Chambal River in India. B Location of the selected sites of the Chambal River

The lower stretch included sites (S16 to S27) of the National Chambal Sanctuary (NCS) region. The characteristic feature of this part is the undulating deep ravines caused by the erosive action of the river and rivulets. In general, relatively pristine conditions are observed in this stretch with sparse human population and settlements. The area has more than 25% of forest cover, typically known as ravine thorn forest (Hussain and Badola 2001). The abbreviations used for heavily polluted sites, moderately polluted sites, and pristine sanctuary sites are HVPL, MDPL, and SANT, respectively.

Environmental variables

Samples of river water were collected from 27 selected sites along the Chambal River during December–January 2016. At all sites, temperature, pH, conductivity, turbidity, salinity, dissolved oxygen, and total dissolved solids were measured in situ with a multi-parameter probe (Horiba U-23). The analyses of nitrate (NO3), nitrite (NO2), orthophosphate (PO4), and silicate were conducted in the laboratory using a UV/VIS double beam spectrophotometer (UV-1700). The chemical analyses including biological oxygen demand (BOD5) and chemical oxygen demand (COD) were performed in accordance with APHA (2005) guidelines.

Diatom sampling and laboratory analysis

Diatoms were collected from all the 27 sites along with the river water samples in December–January 2016. The time period was chosen to avoid immediate post-flood conditions and ensured that the dams were open so that the entire river was in a free-flowing state. As epilithic diatoms are the favored community for monitoring water quality (Kelly et al. 1998) and almost all methods based on diatom indices concentrate on this community (Round 1993), efforts were made to collect diatoms from cobbles and pebbles after ascertaining that river water was continuously flowing over the substrata and that the substrata were at a distance from the river bank, thus preventing the building up of a local chemical environment (Kelly et al. 1998). For the sampling of epilithic diatoms, five to ten cobbles or pebbles were randomly collected from each sampling site and diatoms were scraped off with a toothbrush following standard procedures (Kelly et al. 1998). Prior to the sampling of epilithic surfaces, all substrata were gently shaken and the resulting suspensions were pooled to form a single sample, which was then put in a labeled plastic bottle. From four sites, namely S12, S16, S19, and S25, diatoms were scraped and collected from submerged macrophytes along with the epilithon. Substrata were almost absent from S1, S2, and S3 sites and samples were very difficult to collect. However, we managed to collect 1–2 cobbles from each of these sites. All diatom samples were homogenized and fixed with 4% formaldehyde. In the laboratory, diatom samples were cleaned with hot HCl and KMnO4 to remove organic coatings. This method is based on Hasle (1978) and was adapted by Round et al. (1990). It has been found suitable for cleaning diatom samples collected in India (Karthick et al. 2010). Permanent slides were prepared using Naphrax (Brunel Microscopes Limited; refractive index of 1.64).

The identification and counting of taxa were carried out under a light microscope (Leica DM750) at a 100× magnification using immersion oil in accordance with CEN standards (2001). More than 800 diatom frustules were counted for each slide for the computation of relative abundances of species and calculation of diatom indices. To ensure taxonomic accuracy, SEM was performed with a Carl Zeiss EVO 18 at AIAE, Amity University, Noida, India.
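The conversion of frustule counts into relative abundances can be sketched as follows. This is a minimal illustration with hypothetical taxon codes and counts; the 5% cut-off mirrors the threshold used for abundant taxa in the figures of this study, and the function names are assumptions introduced only for this example.

```python
def relative_abundances(counts):
    """Convert raw frustule counts per taxon into percentages of the total count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no counted frustules")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def abundant_taxa(counts, threshold_percent=5.0):
    """Keep only taxa whose relative abundance exceeds the threshold."""
    percentages = relative_abundances(counts)
    return {t: p for t, p in percentages.items() if p > threshold_percent}


# Hypothetical slide with 820 counted frustules (taxon codes are placeholders).
slide_counts = {"TAXON_A": 420, "TAXON_B": 360, "TAXON_C": 30, "TAXON_D": 10}
slide_percentages = relative_abundances(slide_counts)
slide_abundant = abundant_taxa(slide_counts)
```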

Identifications were made according to the standard literature such as Hustedt (1931–1959), Krammer and Lange-Bertalot (1986, 1988, 1991a, b, 2004), Gandhi (1998), Metzeltin and Lange-Bertalot (2002), Krammer (2003), Lange-Bertalot et al. (2003), Werum and Lange-Bertalot (2004), and Karthick et al. (2013). The software OMNIDIA 8.1 (Lecointe et al. 1993) was used to calculate 17 diatom indices. Most of the diatom indices are based on the weighted average equation (Zelinka and Marvan 1961).

Data analysis

Pearson correlation was used to determine the relationship between the calculated index scores and measured physical and chemical water quality data using SPSS software (version 17). For all indices, multiple regression analysis was also used to explore which combination of physical and chemical variables best explained the observed variation in the diatom index.
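For readers unfamiliar with the statistic, the Pearson coefficient between an index score and a water quality variable can be sketched as below. The analysis in this study was run in SPSS; the function name and the five paired values are hypothetical, and the sketch does not compute significance levels.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: BOD (mg/L) rising while an index score (0 to 20) falls.
bod_values = [1.0, 2.0, 4.0, 8.0, 16.0]
index_scores = [18.0, 16.0, 13.0, 9.0, 4.0]
r_bod_index = pearson_r(bod_values, index_scores)  # strongly negative
```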

Multivariate analysis

Multivariate analysis was performed on the diatom and environmental data sets. A principal component analysis (PCA) was performed to reveal the relationship between environmental variables and associated sampling sites using the software Community Analysis Package (CAP5, Pisces Conservation Limited, 2014). All of the environmental variables, except the pH, were log10 transformed to normalize their distribution before analysis.
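The transformation step can be sketched as follows, assuming each variable is held as a list of site values keyed by name. The variable names and values are hypothetical, and the handling of zero or negative values (rejected here) is an assumption, since the text does not describe it.

```python
import math


def log10_transform(table, skip=("pH",)):
    """Apply log10 to every variable except those listed in `skip`."""
    transformed = {}
    for name, values in table.items():
        if name in skip:
            # pH is already a logarithmic quantity, so it is left unchanged.
            transformed[name] = list(values)
            continue
        if any(v <= 0 for v in values):
            raise ValueError(f"log10 is undefined for non-positive values in {name}")
        transformed[name] = [math.log10(v) for v in values]
    return transformed


# Hypothetical measurements at three sites.
env_table = {
    "pH": [7.2, 8.1, 8.9],
    "BOD": [0.1, 1.0, 10.0],
    "COD": [2.0, 20.0, 200.0],
}
env_log = log10_transform(env_table)
```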

Detrended correspondence analysis (DCA) was performed on the diatom community data to determine the length of the gradient. The gradient length was greater than 3 standard deviation units, which suggested the use of unimodal ordination techniques to be appropriate (ter Braak 1986). Canonical correspondence analysis (CCA) was hence used to reveal the relationship between diatom assemblage composition and environmental variables. DCA and CCA were performed by the software CANOCO (version 4.5) (ter Braak 1986).

For the calculation of different diatom indices, the diatom species counts were entered into the diatom database program OMNIDIA version 8.1 (Lecointe et al. 1993) and the following indices were calculated and tested: saprobity index (SLA; Sladecek 1986), Descy’s pollution index (DES; Descy 1979), Schiefele and Schreiner’s index (SHE; Schiefele and Schreiner 1991), Watanabe index (WAT; Watanabe et al. 1986), trophic diatom index (TDI; Kelly and Whitton 1995), Commission for Economical Community index (CEE; Descy and Coste 1991), pollution sensitivity index (IPS; Cemagref 1982), biological diatom index (IBD; Lenoir and Coste 1996), eutrophication/pollution index (EPI-D; Dell’Uomo 1996), Swiss diatom index (DI-CH; BUWAL 2002), Pampean diatom index (IDP; Gomez and Lucursi 2001), biological water quality index (LOBO; Lobo et al. 2004), Rott. trophic index (TID; Rott et al. 1998), and Rott. saprobic index (SID; Rott et al. 1997). In order to facilitate comparisons among the results, the program OMNIDIA automatically transformed them into a scale from 0 to 20, independent of the scale in which they had been expressed.

Results

Environmental variables

The values of physical and chemical parameters along with the mean, range, and standard deviations at 27 sites in the present study are shown in Table 1. A large amount of variation was observed in the environmental variables, particularly BOD (0.12 to 18.08 mg/L), COD (2.0 to 68.35 mg/L), nitrate (0.35 to 7.35 mg/L), silica (6.5 to 22.3 mg/L), and TDS (0.20 to 0.55 g/L). High BOD and COD values were observed in the HVPL sites with highest values at S13. At the same site, the peak value of phosphate (0.677 mg/L) was recorded. Nitrate values were substantially higher in six sites, three of which were from the MDPL group while the other three were located in the sanctuary zone. The pH fluctuated within a narrow range of 6.95 to 8.96.

Table 1 The mean values and range of measured physical and chemical variables of selected sites from the Chambal River (n = 27)

A PCA was performed on the whole data set with respect to 13 environmental variables, and the results are shown in Fig. 2. A scree plot was used to visually assess which components or factors explained most of the variability in the data. The first principal component (PC1) had an eigenvalue of 5.485 which accounted for 42.19% of the variance while the second principal component (PC2) explained 23.46% (eigenvalue: 3.05) of the total variation (Appendix I). The PCA clearly separated the sites into three groups. The HVPL sites were segregated from MDPL and SANT sites along the PC1. The MDPL and SANT sites displayed more variance from each other along the PC2 and were thus separated along this axis. The length or magnitude of vectors (environmental variables) suggested a strong correlation with respect to ordination of sites.
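The reported percentages are consistent with a PCA on 13 standardized variables, in which the total variance equals the number of variables, so that the share of each component is its eigenvalue divided by 13. This reading is an assumption, since the text does not state whether a correlation or covariance matrix was used; the short check below only reproduces the arithmetic.

```python
N_VARIABLES = 13  # number of environmental variables entered into the PCA


def percent_variance(eigenvalue, n_variables=N_VARIABLES):
    """Share of total variance for a component, assuming total variance = n_variables."""
    return 100.0 * eigenvalue / n_variables


pc1_percent = percent_variance(5.485)  # eigenvalue reported for PC1
pc2_percent = percent_variance(3.05)   # eigenvalue reported for PC2
cumulative_percent = pc1_percent + pc2_percent
```

Under this assumption the first two components together account for roughly two thirds of the total variance.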

Fig. 2
Resultant graph from PCA carried out on physical and chemical variables of selected sampling sites (only PC1 and PC2 depicted)

Variables such as COD, BOD, phosphate, and turbidity had high loading values and contributed the most to the variance on the PC1, thereby grouping the most polluted sites together on the left side of the graph, whereas the moderately polluted and NCS sites clustered on the right side. This axis could be considered to depict a gradient of organic pollution and nutrients, particularly phosphate and silicate. The PC2 differentiated the MDPL sites (S4 to S12) and the SANT (S16 to S27) sites along a gradient of dissolved salts, pH, nitrate, nitrite, and DO, each of which had high loading values on this axis. The MDPL sites were spread out the most, indicating varying degrees of pollution and trophic levels. Dissolved oxygen and pH seemed to be quite significant for the sanctuary region as the vectors of these variables showed strong association with the sites. An evident negative correlation was observed between DO and temperature, with vectors in diametrically opposite directions, reiterating the inverse relationship between the solubility of gases and temperature.

A water quality index (WQI) was calculated by using the National Sanitation Foundation Water Quality Index (NSFWQI) which has been frequently used for water quality assessments (Chaturvedi and Bassin 2010). Six environmental variables were chosen for the calculation of WQI, namely pH, DO, BOD, turbidity, nitrate, and phosphate.

The mean values of the calculated WQI for HVPL, MDPL, and SANT site groups were 58.33, 78.22, and 68.25, respectively. A broad range of index values (51–81) was observed for SANT sites whereas narrow ranges were seen for HVPL (53–62) and MDPL (74–80) sites.

Diatom community composition

A total of 100 taxa belonging to 40 genera were identified in benthic diatom samples collected during the present study. Diatom community composition varied between sites and between groups (Fig. 3). A list of the recorded diatom taxa along with their codes is given in Appendix II.

Fig. 3
Column graph depicting the proportion of abundant diatom taxa (> 5%) recorded from HVPL, MDPL, and SANT sites from the Chambal River. Diatom taxa codes are given in Appendix II

SANT sites were dominated by oligosaprobic and oligo-mesosaprobic taxa (van Dam et al. 1994) such as Brachysira vitrea (Grunow), Achnanthidium minutissimum (Kützing), Synedra rumpens (Kützing), Achnanthidium petersenii (Hustedt), A. minutissimum var. jackii (Rabenhorst), Navicula cataracta-rheni (Lange-Bertalot), and Navicula cryptotenella (Lange-Bertalot). B. vitrea (Grunow) was the most dominant diatom recorded from all the pristine SANT sites with an average abundance of 29.52%. It is an oligosaprobic taxon (van Dam et al. 1994) which prefers undisturbed regions with high water quality. Nitzschia acicularis was also dominant at several sites of the SANT group. However, it was also recorded (9.19% average relative abundance) from all the sites of the heavily polluted group. It is an α-mesosaprobic taxon (van Dam et al. 1994) and is known to have a preference for eutrophic waters. N. cryptotenella (Lange-Bertalot) was abundant in several sites including S17, which was one of the most undisturbed sites of this group. However, it was also abundant in sites S1, S2, and S3 of the HVPL group. Sites of the MDPL group displayed a considerable degree of variation with respect to general pollution and trophic levels (as depicted by PCA) and were dominated by β-mesosaprobes such as A. minutissimum (Kützing), Gomphonema angustatum (Rabenhorst), and Achnanthidium biasolettiana (Grunow). However, Synedra tabulata (Kützing) and Cyclotella meneghiniana (Kützing), which are α-mesosaprobic taxa, were also found to be dominant at sites S7 to S12 of this group.

Heavily polluted sites (HVPL) with poor water quality were mostly dominated by eutrophic and pollution tolerant species such as Nitzschia amphibia (Grunow), Synedra ulna (Ehrenberg), N. acicularis (Kützing), and N. cryptotenella (Lange-Bertalot). Sites downstream of the city of Kota were the most polluted and diatom communities were dominated by N. amphibia (Grunow) with relative abundance as high as 47.04% at S14. As already stated, substrata were almost absent from S1, S2, and S3 sites and diatoms collected from an occasional pebble or cobble may not be representative of the sampled site. Diatom taxa from these sites included N. cryptotenella (Lange-Bertalot), A. minutissimum (Kützing), Achnanthidium exiguum (Grunow), and Cyclotella stelligera (Grunow).

Cluster analysis (complete linkage) was performed on 100 diatom taxa from the selected 27 sites. The dendrogram of the sampling sites based on relative abundance is given in Fig. 4. The formation of two major groups was observed: all the moderately polluted to heavily polluted sites constituted group 1 while group 2 consisted of all the pristine sanctuary sites. Group 1 was further subdivided into 1a and 1b. Group 1a clustered all the HVPL sites whereas group 1b comprised the MDPL sites. In group 2, the least impacted sites (S16, S17, and S18) were clustered together. Thus, similar grouping of sites was observed in cluster analysis and PCA.

Fig. 4
Results of cluster analysis (complete linkage) based on benthic diatom community data sampled at 27 sampling sites from the Chambal River

CCA showed a significant relationship (p < 0.01) on all axes. The first four axes accounted for 56.2% of the total variability. Diatom taxa were closely associated with environmental variables (Fig. 5). Most of the abundant taxa were indicators of good to moderate water quality and were closely associated with the vectors of pH, DO, nitrate, and conductivity. Only a few species such as N. amphibia, Gomphonema exilissimum, and A. exiguum were associated with positive values on axis 1, on which the values of BOD (0.754), COD (0.702), and phosphate (0.555) were high, indicating an organic and nutrient gradient. Species such as G. angustatum, Gomphonema sphaerophorum, S. tabulata, and Cocconeis pediculus were associated with positive values on axis 2 and negatively correlated with EC, TDS, and SALT (with correlation coefficients of −0.68, −0.76, and −0.84, respectively). Diatom taxa of sites S7, S8, and S9 were not associated with any vector and were grouped in the top left quadrant of the CCA graph.

Fig. 5
Canonical correspondence analysis (CCA) biplot showing the relationship between measured environmental variables and dominant diatom taxa (> 5%) recorded from the Chambal River. Acronyms are presented in Appendix II

Relationship between diatom indices and water quality

Diatom index values were calculated for all the 27 sites. Pearson correlations were calculated between diatom index scores and selected environmental variables (Table 2). Significant correlations (p < 0.01) were observed between most of the environmental variables and diatom indices. TDI correlated significantly with the most environmental variables and LOBO with the fewest. Temperature, silica, nitrite, and salt did not correlate significantly with most index scores. Comparison of correlation coefficients showed variability among the three groups, namely HVPL, MDPL, and SANT sites. Stronger correlations were observed between index scores and most variables such as BOD, COD, and nutrients in HVPL and MDPL as compared to SANT sites. In general, low correlations between environmental variables and index values were seen in SANT sites (data not given), where only a few variables such as temperature and silica were significantly correlated (p < 0.01). Strong and significant correlations were also observed between diatom indices (Table 3). Pearson correlations were also applied to diversity indices (Fisher’s alpha and Shannon indices) and environmental variables. Diversity indices were negatively correlated with most of the variables including BOD and pH (Table 2).
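For reference, the Shannon index mentioned above can be sketched from taxon counts as shown below. The use of the natural logarithm is an assumption, as the text does not state the logarithm base, and the counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no counted individuals")
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


# Hypothetical samples: an even community and one dominated by a single taxon.
h_even = shannon_index([25, 25, 25, 25])
h_dominated = shannon_index([97, 1, 1, 1])
```

An even community yields the maximum value for a given number of taxa, whereas dominance by one tolerant taxon, as observed at some heavily polluted sites, lowers the index.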

Table 2 Pearson correlation coefficients between measured environmental variables and diatom index scores generated from the selected sampling sites of the Chambal River
Table 3 Pearson correlation coefficients between diatom indices

Stepwise multiple regression analysis was performed on the index scores and environmental variables. Variables which showed high correlation with diatom indices were used to formulate the regression equation. The performance assessment of the developed regression models is presented in Fig. 8. Among 17 diatom index predictor models (multiple regression analysis), BOD, COD, and phosphate were good predictors of most of the diatom indices except LOBO and IDP. DO showed significant (p < 0.01) results with IPS, IBD, IDSE, SHE, DI-CH, and LOBO. Adjusted R² values ranged from 0.20 to 0.82, thereby explaining 20 to 82% of the original variability (Fig. 7), and were high for most index scores (> 60%). The closer the adjusted R² is to 1, the better the model and its prediction of the dependent variable (index scores in this case). The lowest adjusted R² values were observed for IDP, LOBO, IDAP, and TID. Indices such as WAT, GENRE, IDSE, CEE, SLA, and SID had the highest values of adjusted R². EPI-D, IBD, and SHE were observed to be in the medium range.
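The adjustment penalizes R² for the number of predictors retained relative to the number of sites. A minimal sketch of the standard formula is given below; the sample size of 27 matches the number of sites in this study, whereas the R² value of 0.85 and the three predictors are hypothetical.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_samples - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / degrees_of_freedom


# Hypothetical model: R^2 = 0.85 with three predictors fitted to 27 sites.
adj_r2_example = adjusted_r_squared(0.85, n_samples=27, n_predictors=3)
```

With only 27 sites, each additional predictor lowers the adjusted value noticeably, which is one reason to retain only a few variables in each model.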

The average percentage of diatom species used by the OMNIDIA software for the calculation of the different indices is shown in Fig. 6. High percentages of diatom taxa were included in the calculation of IPS and IGD scores, whereas the fewest taxa were used by the WAT index (Fig. 6).
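
A rough sketch of how such a coverage percentage could be computed is given below. It assumes that coverage is counted by number of taxa rather than by relative abundance, which is not stated here, and it does not reproduce OMNIDIA's internal logic. The site labels, taxon lists, and reference list are hypothetical.

```python
def percent_taxa_used(site_taxa, index_taxa):
    """Percentage of the taxa recorded at a site that appear in an index's reference list."""
    recorded = set(site_taxa)
    if not recorded:
        raise ValueError("site has no recorded taxa")
    return 100.0 * len(recorded & set(index_taxa)) / len(recorded)


def mean_percent_used(sites, index_taxa):
    """Average coverage over all sites (sites: label -> list of taxa)."""
    values = [percent_taxa_used(taxa, index_taxa) for taxa in sites.values()]
    return sum(values) / len(values)


# Hypothetical reference list of an index and hypothetical site inventories.
reference_list = {"A. minutissimum", "N. amphibia", "B. vitrea"}
sites = {
    "site_a": ["A. minutissimum", "N. amphibia"],
    "site_b": ["A. minutissimum", "B. vitrea", "A. chitrakootense", "unidentified sp."],
}

print(mean_percent_used(sites, reference_list))
```

An index whose reference list misses locally abundant taxa will show low coverage in such a calculation, which is the situation described for WAT.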

Fig. 6 Average percentage of diatom species used in the calculation of the different index scores in this study (n = 27). Index abbreviations as in Fig. 2

Fig. 7 Adjusted R² values obtained from forward stepwise multiple regression analysis performed on index scores and measured water quality variables. Index abbreviations as in Fig. 2

Most of the indices showed a clear tendency for index values to increase with improving water quality and vice versa. Evident differences in the index values were observed among the HVPL, MDPL, and SANT site groups (Fig. 9). IPS, IBD, TDI, EPI-D, IGD, WAT, and SLA all assigned high values to the SANT sites with good water quality and low values to the HVPL sites with poor water quality. The MDPL group consisted of sites with varying degrees of pollution and trophic levels. Index scores of IPS, IBD, and TDI had a broader range for this group, whereas narrow ranges, particularly for MDPL sites, were observed for IGD, EPI-D, and SLA index scores (Fig. 9). Water quality maps of the Chambal River based on the trophic diatom index (TDI) and the specific pollution sensitivity index (IPS) were prepared (Fig. 10), and water quality classes were allotted in accordance with Eloranta and Soininen (2002).

Fig. 8 Regression analysis graphs showing predicted vs observed values for selected diatom indices

Fig. 9 Box plots of diatom index values for HVPL, MDPL, and SANT sites. Index values transformed to a scale from 0 to 20

Fig. 10 Water quality map of the Chambal River in accordance with (a) trophic diatom index (TDI) and (b) specific pollution sensitivity index (IPS)

Discussion

Diatoms have been used for the assessment of environmental conditions in rivers and streams for more than a century (see Stevenson et al. 2010). Biotic diatom indices summarize and quantify the information provided by diatom assemblages. They have been used worldwide, and most extensively in European countries, for the assessment of water quality, particularly with reference to eutrophication and organic pollution (Coste in Cemagref 1982; Sladecek 1986; Watanabe et al. 1986; Descy and Coste 1991; Schiefele and Schreiner 1991; Kelly and Whitton 1995; Dell’Uomo 1996; Lenoir and Coste 1996; Prygiel and Coste 2000). They have also been designed to diagnose other stressors such as heavy metals or acidity (Sabater 2000).

These indices have also been used for bioassessment in regions that are both distant from and dissimilar to those where they were formulated, and have been found to be quite successful in water quality estimation (Kwandrans et al. 1998; Bellinger et al. 2006; Taylor et al. 2007b; Bere 2015). In the present study, their application for bioassessment of the river Chambal in India yielded promising results. Significant correlations were seen between environmental variables and most of the index scores. Variables such as DO, BOD, COD, and phosphate were highly correlated with almost all index scores, reiterating the robustness of diatom biomonitoring even in a distant country like India. Temperature and silica concentrations did not correlate significantly with most of the index scores. Similar observations were made by Taylor et al. (2007b) and Tan et al. (2013b). Correlations between environmental variables and indices were stronger in MDPL and HVPL sites and weaker in the pristine SANT sites. Similar weak correlations have been observed in relatively pristine or less polluted areas (Vilblaste et al. 2007; Bere 2015). However, index scores suggested a high water quality class for these sites, which corroborated well with the high levels of DO and low levels of BOD and phosphate. Most diatom indices have been developed with the intent of monitoring organic pollution and eutrophication and are thus based on associated variables such as TOC, BOD, ammonia, and phosphate. These variables, which are commonly used to determine pollution status, have low levels and a narrow range in rivers and streams with high water quality. In such conditions, other variables such as DO and pH may become important for structuring diatom communities and consequently determining the index scores. Some of the index scores showed significant correlations with turbidity, pH, and silica in SANT sites. In the future, diatom indices could be developed specifically to monitor pristine water bodies with high water quality.

Multiple regression analysis has been used to demonstrate the relationship between diatom indices and a combination of water quality variables (Lenoir and Coste 1996; de La Ray et al. 2004; Newall and Walsh 2005; Bere and Tundisi 2011). High adjusted R² values suggest that most of the variation in index scores can be accounted for by the selected environmental variables, and can be interpreted as better performance of the indices as an integrative reflection of water quality. In the present study, adjusted R² values computed from stepwise regression performed on index scores and environmental variables were high (> 60%) for almost all indices, and thus most of the variation in index scores could be explained by the measured water quality variables. Adjusted R² values as high as 80% for IGD, 63% for TDI, and 65% for IPS for the Chambal River are encouraging and suggest the universal applicability of these indices. These results compare well with those of regression models developed in Europe, South Africa, and Brazil (Lenoir and Coste 1996; Taylor et al. 2007b; Bere and Tundisi 2011). The high explanatory power of the developed regression models indicates the high efficacy of these indices in the Indian scenario. The high and significant Pearson correlations, along with the high adjusted R² values, suggest the suitability of diatom indices for reflecting true water quality in this region.

Based on correlation coefficients, adjusted R² values, and the percentage of taxa included in the computation of index scores, six popular diatom indices were selected for further discussion, namely TDI, IPS, IGD, IBD, EPI-D, and SLA. The CEE index was omitted from further discussion as it correlated most strongly (p < 0.01) with IBD, EPI-D, and SLA and, hence, similar results from the application of CEE were expected.

The majority of the indices applied, including TDI, IPS, IGD, IBD, EPI-D, and SLA, were successful in identifying polluted and unpolluted sites. All the SANT sites were categorized as having oligotrophic (class I) to oligo-mesotrophic (class II) status, thereby indicating excellent to good water quality. Most of the MDPL sites were assigned to water quality classes II and III, depicting oligo-mesotrophic to mesotrophic conditions. HVPL sites were categorized as eutrophic (class IV) to hypertrophic (class V) by most indices.

SLA is a saprobic index and was formulated to measure organic enrichment in water bodies (Sladecek 1986). It showed a strong positive correlation with the BOD values in the present study. The WAT index was developed in Japan (Watanabe et al. 1986) and uses 548 diatom taxa. This index has been successfully applied in neighboring countries such as China (Tan et al. 2013b). Similar to the results of the original study, WAT showed a strong correlation with BOD and had the highest adjusted R² values. However, these two indices were unable to distinguish consistently between polluted and unpolluted sites, sometimes assigning similar values to both. This inability was primarily due to the low percentage of taxa used by these indices for the computation of index scores and the consequent allotment of water quality class. SLA and WAT included only 59 and 20% of the taxa, respectively, in the calculation of scores.

In comparison, indices based on the global sensitivity of diatoms, such as IPS, IBD, and EPI-D, gave better results, with a clear tendency of index values to increase with improving water quality and vice versa. IPS has the broadest species base, with 4000 taxa, and was formulated to evaluate general water quality by integrating organic pollution, salinity, and eutrophication (Prygiel and Coste 1993). It is one of the most frequently used diatom-based indices in both European and non-European countries (Lavoie et al. 2009) and has been applied successfully to assess river water quality in countries such as Vietnam (Duong et al. 2006, 2007), Poland (Kwandrans et al. 1998), Portugal (Almeida 2001), Slovakia (Solak and Acs 2011), China (Yang 2015), and in Mediterranean rivers (Goma et al. 2004). In the present study, excellent results were given by IPS, which used on average 93% of the taxa and showed significant correlations with water quality variables. It discriminated efficiently between sites, as score values had a wide range even within the same group.

IBD (Prygiel and Coste 2000) has been used extensively in stream water quality monitoring programs in France. In this index, fourteen environmental parameters are associated with 209 “key taxa” (Coste et al. 2009). This index effectively distinguished between impacted and pristine sites of the Chambal River. However, IBD gave high scores for HVPL sites, which had a substantially high organic load and elevated trophic levels. Similar observations have been made in rivers of the Mediterranean basin (Goma et al. 2004), Austria (Rott et al. 2003), and Poland (Szulc and Szulc 2013), where IBD overestimated the water quality. Goma et al. (2004) related this overestimation of water quality by IBD to the fact that the index overvalues some small members of the family Naviculaceae. Of the selected indices in the present study, IBD was the only one that displayed a low Pearson correlation coefficient (significant at p < 0.05) with the BOD and COD values, which were the most important variables in defining the HVPL sites (as evident from PCA); this could have led to erroneous index scores.

Another popular index is EPI-D (Dell’Uomo 1996), which has given good results not only in Italian Mediterranean rivers (Dell’Uomo 1999; Torrisi and Dell’Uomo 2006) but also in distant countries such as China (Tan et al. 2013b). This index is based, above all, on the high sensitivity of diatoms to organic matter, nutrients, and mineral salts dissolved in water, particularly chlorides (Torrisi and Dell’Uomo 2006). In the Chambal River, EPI-D performed reasonably well. It was significantly correlated with variables related to organic pollution and eutrophication, such as BOD, COD, and phosphate concentrations, whereas salt concentration and conductivity did not correlate significantly with the index values. Although EPI-D index scores discriminated between HVPL, MDPL, and SANT sites, index values fell within a narrow range for MDPL sites, which had varying degrees of pollution and trophic levels. Martin et al. (2010) recommended the use of EPI-D because the calculation of this index does not require discrimination between certain problematic Achnanthidium species, thereby making it easier to use from a taxonomic point of view. In the present study, Achnanthidium species were most abundant in MDPL sites and were collectively grouped under A. minutissima sensu lato by EPI-D. Assigning similar sensitivity and tolerance values to different Achnanthidium species may have resulted in the narrow range of index scores for moderately polluted sites.

IGD (Rumeau and Coste 1988; Coste and Ayphassorho 1991), which integrates the general pollution status, works at the genus level of identification and hence is one of the simplest indices to use (Taylor et al. 2007b). Bioassessments using diatoms at species and genus level have been compared, and results derived at the generic level have been quite robust (Hill et al. 2001; Wunsam et al. 2002; Raunio and Soininen 2007). It has been observed that taxonomic resolution has little influence on the description of diatom assemblage structure, with little ecological information being lost when resolution is decreased from species to order level (Rimet and Bouchez 2012). In the present study, IGD correlated significantly with nutrients and organic matter and responded well to the concentration gradients of these variables. In a country like India, where diatom taxonomy and related ecology are poorly understood, IGD may be a valuable tool for rapid biomonitoring programs and for the subsequent building of databases of ecological profiles of diatoms from this region of the globe. IGD is known to give good results for the reflection of water quality in various countries, including Poland and France (Kwandrans et al. 1998; Solak and Acs 2011).

TDI is one of the very few indices that have been applied and tested in India, and consistent responses in the TDI between Europe and the Himalaya have been observed (Juttner et al. 2003). TDI was found to be very successful for the assessment of eutrophication in the river Chambal. It was significantly positively correlated with phosphate concentrations, the variable against which the index was developed, which is suggestive of its applicability in India. It effectively established the trophic status of the Chambal River and discriminated between sites. Moreover, the percentage of pollution-tolerant valves in this study was lower than 20% at all the sites, thereby indicating high reliability of this index as an estimate of eutrophication (Kelly and Whitton 1995). It has been observed that where both the pollution gradient and the diatom assemblages are similar to those in the original study, the index performs well (Tan et al. 2013b). In the present study, the range of phosphate values was quite similar to that used for the formulation of TDI. This index included approximately 71% of the taxa (including most of the abundant species) in the computation of scores.

Low index values were observed for three sites of the MDPL group, namely S7, S8, and S9, particularly by IPS, IBD, and TDI, which assigned them a water quality class of IV. IBD identified them as the most polluted sites, allocating them a poor quality class. IPS values at these sites were very low, almost equal to the values computed for the most polluted sites of the HVPL group. These low index values, indicating bad water quality, did not corroborate with the physico-chemical variables, which indicated mild to moderate pollution. It is noteworthy that in the CCA the diatom taxa from sites S7, S8, and S9 were not found to be associated with any environmental variable, and this diatom group was segregated in the top left quadrant of the CCA plot. During sampling along these three sites, we learned that many illegal explosive-manufacturing factories had been set up which discharge their effluents intermittently into the river. There was also an unusually high occurrence (> 70%) of serious liver and renal ailments in the riparian human community, which uses the river water directly for bathing and drinking purposes. It is well known that biological monitoring can identify problems otherwise missed or underestimated by chemical monitoring (Karr and Yoder 2004). It is quite probable that the changes in the diatom communities of these three sites were induced by pollutants which either were not measured in the present study or were not picked up by our physical and chemical data due to their intermittent nature.

Diversity indices have traditionally been used in the monitoring and assessment of freshwater ecosystem health (Blanco et al. 2012). It has been observed that diversity decreases with increasing pollution, as only the tolerant taxa are able to sustain themselves along increasing pollution gradients (Archibald 1972, Patrick 1973). In the present study, significant negative correlations were seen between values of the Shannon index and both BOD and COD, reiterating that diversity decreases with increasing pollution. However, values of Fisher’s alpha index did not correlate significantly with these variables. A positive relationship between increasing pollution and diversity has also been reported (Lavoie et al. 2008). It has also been observed that diversity may change differently with the type of pollution (Juttner et al. 1996). Hence, diversity indices are often considered unsuitable for water quality assessments (Bellinger et al. 2006, Blanco et al. 2012). In the present study, low diversity was observed at HVPL sites, whereas MDPL sites displayed high species diversity. Several studies have reported high species diversity and evenness in moderately polluted water bodies, which are considered to harbor dominant species of both good and polluted water (van Dam 1982).
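
For readers unfamiliar with the two diversity measures, the sketch below implements their textbook definitions: the Shannon index H' = -Σ p_i ln p_i and Fisher's alpha, defined implicitly by S = α ln(1 + N/α), where S is the number of taxa and N the number of individuals counted. The use of the natural logarithm and the valve counts are assumptions made for illustration; the exact settings used in this study are not stated here.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' using natural logarithms (assumed log base)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def fishers_alpha(counts, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    present = [c for c in counts if c > 0]
    s, n = len(present), sum(present)
    if s == 0 or s >= n:
        raise ValueError("need at least one taxon and more individuals than taxa")

    def excess(alpha):
        # Monotonically increasing in alpha; the root is Fisher's alpha.
        return alpha * math.log(1.0 + n / alpha) - s

    lo, hi = 1e-12, 1.0
    while excess(hi) < 0:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical valve counts for two samples with the same richness and total count.
even_sample = [100, 100, 100, 100]
dominated_sample = [370, 10, 10, 10]

for label, sample in (("even", even_sample), ("dominated", dominated_sample)):
    print(label, round(shannon_index(sample), 3), round(fishers_alpha(sample), 3))
```

In this toy comparison the Shannon index drops when one taxon dominates, whereas Fisher's alpha is identical for both samples because it depends only on S and N. This is a property of the definitions and is offered only as background to the differing behavior of the two indices.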

The National Sanitation Foundation (NSF) Water Quality Index was developed in 1970 (Brown et al. 1970) and has been frequently used for water quality assessment (Chaturvedi and Bassin 2010). Application of the WQI resulted in the classification of all the sites as having either good or moderate water quality; compared to the diatom indices, it was thus less effective in discriminating between sites. The most polluted sites (S1 to S3 and S13 to S15) were classified as having moderate water quality by the index. Similarly, several sites from the least polluted group (S17, S21, S22, S24, and S27) were also assigned the same water quality class by the WQI. However, the WQI and the biotic indices agreed in some assignments of ecological status; for example, many SANT sites were allocated good ecological status by both types of indices.

Ecological preferences of most of the taxa conformed with what is recorded and established worldwide. For example, N. amphibia, which is favored by very high nutrient concentrations and is tolerant of very heavy pollution, was clearly associated with the most polluted sites. Similarly, B. vitrea, an alkaliphilous, oligosaprobic taxon which is very sensitive to pollution, was the most abundant species recorded from the pristine SANT sites. B. vitrea is included in the list of 159 “reliable” taxa which are not sensitive to regional setting, water type, and taxonomic uncertainty (Besse-Lototskaya et al. 2011) and hence can serve as a dependable indicator. Its high abundance at the undisturbed sanctuary sites with high water quality seems to corroborate the “reliability” of this taxon.

The Chambal is regarded as one of the few pristine rivers in India (Hussain and Badola 2001). The river still remains unpolluted for most of its stretch (Saksena et al. 2008), and the recorded diatom communities seem to confirm this. Most of the abundant diatom taxa were oligosaprobes and β-mesosaprobes (van Dam et al. 1994) that indicated good to high water quality. CCA revealed that these taxa were closely associated with the vectors pH, DO, nitrates, and conductivity. Only a few taxa, such as N. amphibia, G. exilissimum, and A. exiguum, were associated with the vectors of increasing organic content and phosphate concentrations.

A. minutissimum was recorded from all sites, from the most polluted to the least disturbed, and thus contributed to the assignment of water quality class through index score computation. A. minutissimum has been established as one of the most abundant and frequently occurring taxa in freshwater benthic samples from all around the globe (Patrick and Reimer 1966; Krammer and Lange-Bertalot 1991a; Potapova and Hamilton 2007). It is a ubiquitous taxon with a broad ecological spectrum (van Dam et al. 1994) and has been recorded from environments with varying degrees of organic pollution and trophic levels (Potapova and Hamilton 2007). On the one hand, it is known to be tolerant of severe “chemical insults” (Stevenson and Bahls 1999), while on the other it has been observed to indicate good water quality (Prygiel and Coste 1998) and nutrient-poor water bodies (Kelly and Whitton 1995).

The identification of Achnanthidium and related taxa was a challenging task in the present study. Potapova and Hamilton (2007) reported the difficulties in establishing taxonomical and ecological differences among A. minutissimum morphotypes, even when abundant information was available. However, scanning electron microscopy facilitated the identification process, helping us distinguish between closely related forms such as A. biasolettiana (Grunow) and A. petersenii (Hustedt), which may have different ecological preferences. A. biasolettiana is β-mesosaprobic (van Dam et al. 1994), indicating good to moderate water quality (Kobayasi and Mayama 1982), whereas A. petersenii is oligosaprobic (van Dam et al. 1994), reflecting good water quality. Indices such as EPI-D and IGD (which works at the generic level), which do not require differentiation of Achnanthidium species (Martin et al. 2010), could be suitable for this region simply for ease of identification. However, caution should be exercised where Achnanthidium species are abundant. Lumping of taxa with very dissimilar ecological preferences, such as A. saprophilum, which indicates bad water quality (Kobayasi and Mayama 1982), and A. petersenii, which reflects good water quality, may lead to the computation of erroneous index scores. According to Round (2004), lumping of several similar-looking taxa into one “morphospecies” diminishes the discriminative ability of diatom indices, while detailed taxonomic and ecological studies allow recognition of taxa with good indicator properties. It is noteworthy that index scores of EPI-D and IGD fell within a narrow range for MDPL sites, which had varying degrees of pollution and trophic levels, whereas a broad range of values was observed for TDI, IPS, and IBD for the same sites. Achnanthidium chitrakootense, possibly an endemic species, has been reported from the rivers of Northern and Central India (Wotjal et al. 2010). It was found to be dominant (> 5%) at several SANT sites of the river Chambal but was not used in any index calculation. A few of the taxa encountered in this study, which remain unidentified and require further taxonomic work, could well be endemic. When endemic taxa are abundant, water quality may be misinterpreted (Taylor et al. 2007b), and such taxa should be included in the reference lists of indices for proper estimation of water quality. The development of a regional index inclusive of such endemic taxa could enhance the accuracy of bioassessment of rivers and streams of this region.
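
The effect of lumping can be illustrated with a generic abundance-weighted average index of the Zelinka and Marvan type, in which each taxon contributes its sensitivity value weighted by abundance and indicator value. This general form is assumed here for illustration and is not taken from the text above; the sensitivity and indicator values, the abundances, and the "Other taxon" entry are entirely hypothetical and are not the values used by OMNIDIA or by any published index.

```python
def weighted_average_index(assemblage, traits):
    """Abundance-weighted mean sensitivity.

    assemblage: taxon -> relative abundance
    traits: taxon -> (sensitivity, indicator_value)
    Taxa missing from the traits table are skipped, as an index can only
    use taxa present in its reference list.
    """
    numerator = 0.0
    denominator = 0.0
    for taxon, abundance in assemblage.items():
        if taxon not in traits:
            continue
        sensitivity, indicator_value = traits[taxon]
        numerator += abundance * sensitivity * indicator_value
        denominator += abundance * indicator_value
    if denominator == 0:
        raise ValueError("no taxa from the assemblage are in the traits table")
    return numerator / denominator


# Two hypothetical sites dominated by different Achnanthidium species.
clean_site = {"A. petersenii": 60, "Other taxon": 40}
polluted_site = {"A. saprophilum": 60, "Other taxon": 40}

# Hypothetical species-level traits (sensitivity 1 = tolerant, 5 = sensitive).
species_traits = {
    "A. petersenii": (5.0, 1.0),
    "A. saprophilum": (1.0, 1.0),
    "Other taxon": (3.0, 1.0),
}
# Hypothetical lumped traits: both species share one value, as in a broad morphospecies.
lumped_traits = {
    "A. petersenii": (4.0, 1.0),
    "A. saprophilum": (4.0, 1.0),
    "Other taxon": (3.0, 1.0),
}

for label, traits in (("species level", species_traits), ("lumped", lumped_traits)):
    clean = weighted_average_index(clean_site, traits)
    polluted = weighted_average_index(polluted_site, traits)
    print(f"{label}: clean = {clean:.2f}, polluted = {polluted:.2f}")
```

With species-level values the two sites receive clearly different scores, while the lumped table assigns them the same score, which mirrors the narrow range of scores discussed above for indices that do not separate Achnanthidium species.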

Selection of substrata is known to have a significant effect on diatom community composition (Lenoir and Coste 1994), thereby affecting index scores (Kahlert and Rašić 2015). However, there is much controversy in the literature about the relationship between substrate type and the composition of diatom assemblages, and about its influence on water quality assessments (Besse-Lototskaya et al. 2006). For example, differences in the results of index calculations based on epilithic and epiphytic diatom communities from a particular site were observed by Eloranta and Andersson (1998). Similarly, Kelly et al. (1998) observed large differences in diatom communities from different substrata in lowland streams, leading to differences in TDI values. On the other hand, Bere and Tundisi (2011) reported that benthic diatom communities from different substrates sampled at the same site were generally similar and recommended the collection of only one substrate at each site for multivariate-based water quality assessment surveys. In the present study, although epiphytic diatom communities differed from the epilithon, almost identical index scores (leading to the same water quality class) were computed from both types of substrata at the same site. Differences in index scores might have been more pronounced if more epiphytic samples had been available for comparison.

Conclusion

Water quality evaluation of the Chambal River in India by the application of foreign diatom indices yielded promising results. Though applied to a very distant and dissimilar region, most of the popular diatom indices, such as IPS, TDI, IBD, EPI-D, and IGD, displayed strong correlations with environmental variables and successfully identified and segregated polluted and unpolluted sites. The best results were obtained for the TDI and IPS indices, which showed a high level of resolution in discriminating sites along pollution gradients. These indices were successful in determining the ecological status, which concurs well with the physical and chemical data already available for the Chambal River. The fact that the ecological preferences of most of the diatom taxa conformed with what is recorded and established worldwide is suggestive of the universal applicability of these indices. In India, where biomonitoring techniques are rarely emphasized, the present study reiterates the utility of diatom-based assessment approaches to stakeholders for river monitoring and conservation. Nevertheless, ensuring taxonomic accuracy would be a challenging task for this region, and indices which require coarser taxonomic skills may therefore be used for rapid bioassessment purposes. In the future, regional indices inclusive of endemic taxa may be developed to enhance accuracy in diatom-based water quality evaluation. It is thus concluded that TDI and IPS are highly applicable and efficient indices for biomonitoring of rivers of Central India. Indices which are simpler to use, such as IGD, may also be considered, at least for a coarse evaluation of water quality.