Introduction

Acid mine drainage (AMD) sites are globally distributed highly acidic extreme environments associated with mining activities (Akcil and Koldas 2006). AMD is generated primarily through microbially catalyzed oxidation of metal sulfides, such as iron sulfides, in the presence of air and water (Kuyucak 2002). Although the process involves several steps, it is often modelled through the oxidation of pyrite (FeS2), as given by Eq. 1:

$${\text{FeS}}_{2} + 7/2{\text{O}}_{2} + {\text{H}}_{2} {\text{O}} \to {\text{Fe}}^{2 + } + 2{\text{SO}}_{4}^{2 - } + 2{\text{H}}^{ + } .$$
(1)

Overall, the process generates large amounts of sulfuric acid, which lowers the solution pH and promotes the dissolution of metals (Kuyucak 2002). Typically, AMD is characterized by excessive amounts of total dissolved solids, pronounced mineral acidity, and low pH (Akcil and Koldas 2006). These physical and chemical characteristics represent an extreme environment which may be intolerable to most life forms. Nonetheless, it has been established that AMD is populated by distinct microorganisms (mostly prokaryotes) endowed with novel cellular mechanisms adapted to the harsh conditions (Johnson and Hallberg 2003). AMD environments have been a subject of intensive microbial ecology studies (Chen et al. 2016). Owing to the low diversity of AMD microbial communities and geochemical simplicity, AMD environments are perceived as ideal model systems for discerning the microbial ecology, processes, and interaction networks in natural microbial assemblages (Baker and Banfield 2003; Denef et al. 2010; Kuang et al. 2016). In light of the role of some AMD inhabitants in its natural attenuation, these microbes are perceived to be perfect candidates for limiting AMD production as well as promoting the bioremediation of existing sites (Johnson and Hallberg 2005). However, a deeper understanding of the role of individual AMD microbes in driving nutrient cycles is necessary for their successful biotechnological exploitation. This can be achieved in part by describing the diversity and structure of AMD microbial communities as well as their variation in space, in time, and across environmental gradients (Hallberg 2010; Volant et al. 2014).

In the last decade, molecular techniques using Next-Generation Sequencing (NGS) platforms have taken center stage in microbial ecology (Shendure and Ji 2008). These are high-throughput techniques that allow a robust sampling of microbial diversity, enabling a comprehensive scrutiny of broad trends of microbial distribution (Liang et al. 2017). Accordingly, the application of NGS platforms has greatly improved our perspectives on the structure and diversity of environmental microbial communities (Caporaso et al. 2011). In particular, high-throughput targeted sequencing of phylogenetic marker genes such as the 16S rRNA gene has emerged as an efficient, fast, and low-cost technique for probing microbial diversity (Logares et al. 2012). In the present work, the diversity and taxonomic composition of planktonic and benthic bacterial communities from an AMD containment dam were characterized using amplicon sequencing targeting the 16S rRNA gene.

Materials and methods

Sample collection

Water and sediment samples were collected from the Lancaster Dam (26° 07′ 39.9″ S, 27° 46′ 46.2″ E) located in Randfontein in the West Rand of the Gauteng Province of the Republic of South Africa (RSA). The Randfontein area is well known for its long history of mining activities. The Lancaster Dam is a natural dam that was initially a source of pristine water. However, over the years, it has been infiltrated by acidic mine water emanating from nearby mine tailings and mine wastes generated by several gold mines in the area. To date, the dam is filled with acidic water.

Sampling in the Lancaster Dam was done over a period of 5 months between July and November 2017. Three sample collection points were identified around the dam based on ease of accessibility. From each sampling site, 5 L of water was collected into sterile polypropylene bottles. These were later pooled into one composite sample representative of each sampling occasion, from which DNA was extracted. An additional 1 L of water was collected from each site for chemical analysis. At each sampling site, sediments were collected from a 10 cm depth into 15 mL centrifuge tubes. These were later pooled and homogenized into a composite sample and used for DNA extraction. Sediments for geochemical analysis were collected into separate centrifuge tubes. All samples were kept on ice (4 °C) and transported to the laboratory for further processing within 48 h.

Analysis of water physicochemical parameters

Physicochemical parameters, including pH, temperature (T), total dissolved solids (TDS), salinity (SAL), and electrical conductivity (EC), were measured in situ using a multi-probe field meter (YSI™ 6 series, Sonde Marion, Germany). In the laboratory, water samples for chemical analysis were prefiltered through Porafil 0.45 µm, 47 mm cellulose acetate membrane filters under negative pressure. Filtered samples for metal analysis were acidified to pH 2 using 70% nitric acid and stored at 4 °C until analysis. Elemental analysis was performed on an inductively coupled plasma optical emission spectrometer (ICP-OES; Agilent Technologies 700 series). Anion (sulfate, nitrate) and chemical oxygen demand (COD) concentrations were measured spectrophotometrically in non-acidified prefiltered water using a Spectroquant Pharo 300 (Merck, RSA) as per the manufacturer’s instructions. Dissolved organic carbon (DOC) was measured in a torch combustion carbon analyzer (Teledyne Tekmar, USA).

Geochemical analysis of sediments

Sediment samples were dried in a benchtop vacuum freeze-dryer (Labconco, USA). Dried samples were homogenized by grinding with a pestle and mortar before passing through a 200-mesh sieve. Prior to metal analysis, powdered sediment samples were digested using a microwave (SINEO MDS-6G, SepSci, RSA) following a protocol provided by the manufacturer. Samples were weighed (0.5 g), placed in microwave vessels, and mixed with 9 mL nitric acid and 3 mL hydrochloric acid (37%). Digestion was performed at 175 °C for 60 min at 6 Watts power. After digestion, samples were allowed to cool and centrifuged at 5000×g for 10 min. The supernatant was collected and quantitatively made up to 50 mL with deionized water, and the total metals were quantified from this solution as described for water samples. For physicochemical analysis, 2 g of dry sediment was suspended in 10 mL deionized water and shaken at 150 rpm for 4 h. The suspension was allowed to settle overnight at room temperature. The liquid phase was carefully decanted and used to measure pH, sulfates, nitrates, and DOC as described for water samples (Sun et al. 2016).
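Because metals were measured in a diluted digest (0.5 g of sediment made up to 50 mL), readings in mg L−1 have to be converted to a dry-mass basis (mg kg−1) before they can be compared with the sediment concentrations reported here. The following is a minimal sketch of that arithmetic only; the function name and the example reading are hypothetical, and blank correction or further dilution steps, which the text does not describe, are not included.

```python
def digest_to_dry_mass(conc_mg_per_l, final_volume_ml=50.0, sample_mass_g=0.5):
    """Convert a digest concentration (mg/L) to a dry-sediment basis (mg/kg).

    mg/kg = C (mg/L) * V (L) / m (kg). Because mL/g equals L/kg, the two
    factors of 1000 cancel and the volumes and masses can be used directly.
    Defaults follow the digestion described in the text (0.5 g into 50 mL).
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return conc_mg_per_l * final_volume_ml / sample_mass_g


# Hypothetical ICP-OES reading of 500 mg/L Fe in the 50 mL digest
fe_mg_per_kg = digest_to_dry_mass(500.0)
print(f"Fe in dry sediment: {fe_mg_per_kg:.0f} mg/kg")
```

With the default volume and mass, the conversion factor is 100, so each 1 mg L−1 in the digest corresponds to 100 mg kg−1 of dry sediment.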

Genomic DNA extraction, amplification, and sequencing

Pooled water samples for DNA extraction were prefiltered through Macherey–Nagel 1.2 µm, 47 mm glass fiber membrane filters to remove suspended particles. Microbial biomass was harvested by filtration through Pall Supor 0.22 µm, 47 mm polyethersulfone membrane filters under vacuum. The membranes were stored in sterile Petri dishes and kept at −20 °C until DNA extraction. DNA was extracted from the filters using a DNeasy PowerWater kit (Qiagen, Germany) following the manufacturer’s instructions. DNA from sediments was extracted from 9 g of pooled sediment sample using the DNeasy PowerMax Soil kit (Qiagen, Germany) following the manufacturer’s instructions. Extracted DNA was quantified on a Qubit 3.0 Fluorometer (Life Technologies, RSA) and the purity was determined by measuring the A260/280 and A260/230 ratios in a Biodrop µLite spectrophotometer (Biochrom, USA). Polymerase chain reactions (PCR) were performed in triplicate to amplify the bacterial 16S rRNA gene using the universal primer pair 27F (5′-TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AGA GTT TGA TCM TGG CTC AG-3′) and 518R (5′-GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT TAC CGC GGC TGC TGG-3′) with adapters. This pair of primers amplifies the bacterial V1–V3 region of the 16S rRNA gene sequence and produces 500 bp amplicons (Caporaso et al. 2011). PCR reactions were performed with a 50 µL reaction mixture containing 25 µL Qiagen TopTaq Master Mix (2.5 units Taq DNA polymerase, dNTPs (200 µM each), 1.5 mM MgCl2), 1 µL each of forward primer and reverse primer, metagenomic DNA template (50–100 ng/µL), and sterile nuclease-free water added to make up the final reaction volume of 50 µL.
PCR reactions were performed in a Bio-Rad T100 thermal cycler using the following cycling conditions: initial denaturation step at 94 °C for 5 min, followed by 30 cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min 30 s, with a final extension at 72 °C for 10 min. To monitor contamination, a template-free control was included in every reaction. The PCR products were loaded onto a horizontal gel of 1% agarose in 1× TAE buffer stained with ethidium bromide. Gels were run at 80 V for 90 min in 1× TAE buffer. After electrophoresis, gels were viewed under a Bio-Rad UV illuminator. Band size and authenticity were confirmed by comparison to a CSL-MDNA-100BP DNA ladder. Authentic PCR products were pooled together for each respective sample. The PCR products were sent for sequencing at the Agricultural Research Council Biotech Platform (Pretoria, South Africa). Sequencing was done on the Illumina MiSeq using paired-end sequencing chemistry.

Sequence data analysis

Prior to analysis, the raw sequence data sets in .fastq format were trimmed to remove the Illumina tags, PCR artefacts, and poor quality reads. Following the initial quality filtering, all data sets were imported into the mothur pipeline v.1.40.0 for further analysis (Schloss et al. 2009). During the course of analysis, sequence reads with < 50 nucleotides, > 2% ambiguities, or 7% homopolymers, and those of mitochondrial or chloroplast origin, were discarded. Similarly, chimeric sequences were removed using default parameters in UCHIME (Edgar et al. 2011). Non-chimeric sequence reads (490 nt) were subsequently aligned against the SILVA 16S rRNA gene reference database (version 128) at 80% confidence threshold (Quast et al. 2013). The aligned data sets were clustered into Operational Taxonomic Units (OTUs) at 97% sequence similarity using a pairwise distance matrix. Dominant OTUs were further subjected to a BLAST search to compare their identities with sequences in the NCBI database (Johnson et al. 2008). Alpha diversity among the 7 sequence data sets was inferred from the number of observed OTUs per sample, as well as non-parametric alpha diversity indices including Chao_1, Dominance, Shannon_H, Simpson_D, and Species evenness (Zakrzewski et al. 2017). All non-parametric diversity indices were calculated at a genetic distance of 0.03. The relative abundance of individual taxa was expressed as the number of sequences affiliated with the specific taxon as a percentage of the total number of sequences obtained for that sample. In addition, similarities and differences in sediment and water bacterial communities were inferred from principal coordinate analysis (PCoA) of weighted UniFrac distances obtained through QIIME (Lozupone and Knight 2005).
A canonical correspondence analysis (CCA) was also performed in the Paleontological Statistics Software (PAST v 2.17) to identify possible correlations between bacterial community structure and diversity with physicochemical parameters (Hammer et al. 2001). CCA was done on abundant bacterial genera and selected environmental parameters (pH, DOC, sulfates, nitrates, and metal ions).
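The alpha diversity indices named above can be computed directly from a vector of per-OTU read counts. The sketch below uses common textbook definitions, which are assumptions here because the text does not state the exact formulas used: Dominance as the sum of squared proportions, Simpson as one minus Dominance, Shannon with the natural logarithm, evenness as e^H/S, and the bias-corrected form of Chao-1. The function name and the example counts are hypothetical.

```python
import math


def alpha_diversity(otu_counts):
    """Alpha diversity indices for one sample from per-OTU read counts.

    Formulas are common definitions and are assumed, not taken from the text.
    """
    counts = [c for c in otu_counts if c > 0]
    n_reads = sum(counts)
    if n_reads == 0:
        raise ValueError("sample contains no reads")
    s_obs = len(counts)                      # observed OTUs
    p = [c / n_reads for c in counts]        # relative abundances
    shannon = -sum(pi * math.log(pi) for pi in p)
    dominance = sum(pi * pi for pi in p)     # approaches 1 if one OTU dominates
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    # Bias-corrected Chao-1 richness estimator
    chao1 = s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))
    return {
        "observed_otus": s_obs,
        "chao1": chao1,
        "dominance": dominance,
        "simpson": 1.0 - dominance,
        "shannon": shannon,
        "evenness": math.exp(shannon) / s_obs,
    }


# Hypothetical counts for a small sample with one dominant OTU
for name, value in alpha_diversity([4, 2, 1, 1]).items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, a community dominated by a few OTUs gives a high Dominance value and low Shannon and evenness values, whereas an even community gives a Dominance value close to zero.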

Predictive functional profiling

The functional profiles of the bacterial communities were predicted from the 16S rRNA gene sequence data set using PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) (Langille et al. 2013). The OTU relative abundance data were mapped against genes available in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The resultant data were then normalized by dividing each OTU by the known 16S copy-number abundance. The normalized data were used as input for the metagenome predictions and subsequent collapse into functional pathways (Langille et al. 2013).
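The two arithmetic steps described above, copy-number normalization followed by metagenome prediction, can be illustrated with a minimal sketch. This is not PICRUSt's own code: the function names, OTU identifiers, copy numbers, and gene contents below are all hypothetical, whereas in PICRUSt the copy numbers and gene contents come from precomputed tables.

```python
def normalize_by_copy_number(otu_counts, copy_numbers):
    """Divide each OTU's read count by its predicted 16S rRNA gene copy number."""
    normalized = {}
    for otu, count in otu_counts.items():
        copies = copy_numbers.get(otu)
        if copies is None or copies <= 0:
            raise ValueError(f"no valid 16S copy number for {otu}")
        normalized[otu] = count / copies
    return normalized


def predict_gene_abundance(normalized, gene_content):
    """Sum, over OTUs, normalized abundance times predicted gene copies."""
    totals = {}
    for otu, abundance in normalized.items():
        for gene, copies in gene_content.get(otu, {}).items():
            totals[gene] = totals.get(gene, 0.0) + abundance * copies
    return totals


# Hypothetical inputs for one sample
otu_counts = {"OTU_a": 100, "OTU_b": 30}
copy_numbers = {"OTU_a": 4, "OTU_b": 2}            # predicted 16S copies per genome
gene_content = {                                   # predicted gene copies per genome
    "OTU_a": {"gene_x": 1, "gene_y": 2},
    "OTU_b": {"gene_x": 3},
}

normalized = normalize_by_copy_number(otu_counts, copy_numbers)
print(predict_gene_abundance(normalized, gene_content))
```

Dividing by copy number first prevents organisms with many 16S rRNA gene copies from being over-represented in the predicted gene totals.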

Nucleotide Sequence Accession numbers

Sequences generated in this study have been deposited with the NCBI under project number PRJNA550971.

Results

Physicochemical profiles of water and sediment samples

The water samples collected from Lancaster Dam are characterized by remarkably low pH (2.33–2.93), elevated electrical conductivity (5743–6400 µS cm−1), chemical oxygen demand (142.74–167.50 mg L−1), and sulfate concentrations (571–1573 mg L−1). The sediment samples were also acidic, although slightly less acidic than the water samples (pH 3.07–3.35). These also depicted high levels of sulfates with concentrations ranging between 1267 and 2775 mg kg−1. Other physicochemical parameters of water and sediment samples are presented in Tables 1 and 2, respectively. Dissolved metals in water samples were quantified in the ICP-OES and the concentrations are presented in Fig. 1. Aluminium was by far the most prevalent dissolved metal quantified in the water samples with concentrations above 500 mg L−1. This was followed by iron, magnesium, sodium, and manganese all with concentrations above 100 mg L−1. Other metal ions detected in substantial amounts are zinc, calcium, potassium, and copper. Chromium and silver were measured in trace amounts (≤ 1 mg L−1). In the sediment samples, iron was detected as the most abundant metal with concentrations ranging between 35,619 and 65,314 mg kg−1. Aluminium, calcium, magnesium, potassium, and manganese were also detected in substantial amounts (above 500 mg kg−1), whilst zinc, chromium, copper, and silver were detected in trace amounts.

Table 1 Physicochemical properties of water samples collected from Lancaster Dam
Table 2 Physicochemical properties of sediment samples collected from Lancaster Dam
Fig. 1 Metal concentrations in water (a) and sediment (b) samples collected from Lancaster Dam

Bacterial community diversity

After quality filtering and trimming, a total of 66,914 sequence reads were obtained from the 7 sequencing libraries. A higher number of sequences were generated from water samples (7212–15,396 reads) compared to sediment samples (6612–10,851 reads). Rarefaction curves (Fig. S1) drawn from the raw sequence data indicate that the samples were sequenced at different depths, but at a genetic distance of 0.03, bacterial diversity was adequately covered for all of them. The sequence reads were assigned to different OTUs at a 97% similarity threshold and a total of 1836 OTUs were recovered from the 7 sequence libraries. Despite the high number of sequence reads observed in water samples, they recorded fewer OTUs (201–286) than sediment samples (218–356). This points towards a higher species richness in the sediment compared to the water bacterial community. The observed trend is further corroborated by the non-parametric alpha diversity indices (Dominance_D, Chao_1, Shannon_H, Simpson_D, and Species evenness) calculated (Table 3). These all indicate a significant shift (p < 0.05) in the species richness and evenness between water and sediment bacterial communities. For sediment samples, the Chao-1 index values ranged between 225.1 and 365.1 compared to between 215.8 and 291.2 in water samples. Similarly, the Dominance values were close to zero (0.028–0.043) in sediment samples, indicating a high degree of species evenness in the community. For water samples, on the other hand, Dominance values were significantly higher (0.347–0.431), indicative of the predominance of a few OTUs. Accordingly, the Shannon, Simpson, and Species evenness indices were higher in sediment than in water samples.

Table 3 Non-parametric alpha diversity metrics calculated for water and sediment samples collected from Lancaster Dam over five sampling dates

Compositional similarities and dissimilarities between the bacterial communities in the two Lancaster Dam microhabitats were further determined from a PCoA of weighted UniFrac distances constructed from OTU count data. The PCoA plot (Fig. 2) shows a clear segregation of the water and sediment bacterial communities. Bacterial communities from water samples collected at different sampling dates clustered along the second axis, whilst bacterial communities from the sediment samples are distributed over the first and second quadrants of the plot in line with the heterogeneity observed in their physicochemical profiles. Furthermore, an Unweighted Pair Group Method with Arithmetic mean (UPGMA) tree generated from weighted UniFrac distance matrix with 100% support at all nodes (Fig. S2) shows a similar pattern of clustering for the two microbial communities.
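The ordination step can be sketched as classical PCoA: the squared distance matrix is double-centered, and its leading eigenvectors, scaled by the square roots of their eigenvalues, give the sample coordinates. The sketch below is a pure-Python illustration under stated assumptions, not the QIIME implementation used in this study. It extracts eigenpairs by power iteration with deflation, assumes a symmetric matrix whose leading eigenvalues are positive, and uses a small hypothetical distance matrix rather than UniFrac distances.

```python
import math


def leading_eigenpair(b, n_iter=1000):
    """Power iteration: largest-magnitude eigenvalue and unit eigenvector of b."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(n_iter):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract on this axis
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * bv[i] for i in range(n)), v  # Rayleigh quotient


def pcoa(dist, n_axes=2):
    """Return (coords, explained) for a symmetric distance matrix."""
    n = len(dist)
    # Gower double-centering of -0.5 * d^2
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
         for i in range(n)]
    total = sum(b[i][i] for i in range(n))  # trace = sum of all eigenvalues
    if total <= 0:
        raise ValueError("distance matrix has no variation")
    coords = [[] for _ in range(n)]
    explained = []
    for _ in range(n_axes):
        raw, v = leading_eigenpair(b)
        lam = max(raw, 0.0)  # negative eigenvalues carry no real coordinates
        for i in range(n):
            coords[i].append(v[i] * math.sqrt(lam))
        explained.append(lam / total)
        for i in range(n):  # deflate so the next pass finds the next axis
            for j in range(n):
                b[i][j] -= raw * v[i] * v[j]
    return coords, explained


# Hypothetical distances between three samples lying on a single gradient
coords, explained = pcoa([[0, 1, 3], [1, 0, 2], [3, 2, 0]])
print(coords, explained)
```

The fraction of variation explained by each axis is its eigenvalue divided by the sum of all eigenvalues, which is how axis percentages on a PCoA plot are usually derived.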

Fig. 2 Principal coordinates analysis (PCoA) plot showing clusters of bacterial communities based on weighted UniFrac distance matrix of species richness. The first axis explains 72.9% of the observed variation in bacterial community diversity, whilst the second axis explains 13.2%

Phylogenetic structure of bacterial communities

Of the total sequence reads recovered, over 99% were classified under Kingdom Bacteria and approximately 1% remained unclassified. The recovered sequences fall into 22 phyla, 45 classes, 73 orders, 122 families, and 132 genera. At phylum level, the overall sequence reads were classified into 8 dominant phyla (relative abundance ≥ 1% in at least 1 sequence library), namely Acidobacteria, Actinobacteria, Chloroflexi, Firmicutes, Nitrospirae, Proteobacteria, Saccharibacteria, and ca. TM6_(Dependentiae). The list of recovered phyla is presented in Table S1. In both the water and sediment sequence libraries, Proteobacteria was the most abundant phylum, with a higher abundance in water (72.74–83.43%) than in sediments (30.31–52.16%). In agreement with the trend noted with the non-parametric indices, there are clear differences in the diversity of water and sediment bacterial communities (Fig. 3). In the water samples, 4 additional phyla, Actinobacteria (4.0–11.74%), Firmicutes (5.7–8.9%), ca. TM6_(Dependentiae) (1.49–4.28%), and Saccharibacteria (0.1–2.02%), had relative abundances greater than 1%. In contrast, in sediment bacterial communities, 6 additional phyla were dominant: Saccharibacteria (7.80–21.27%), Actinobacteria (14.20–17.57%), Acidobacteria (10.10–18.83%), Chloroflexi (2.57–8.67%), Nitrospirae (1.56–7.62%), and Firmicutes (0.45–3.28%).
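The criterion used here for a dominant phylum (relative abundance ≥ 1% in at least one sequence library) amounts to a simple filter on per-library percentages. The following is a minimal sketch of that filter; the function names, library names, and read counts are hypothetical and are not data from this study.

```python
def relative_abundance(counts):
    """Percentage of a library's reads assigned to each taxon."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def dominant_taxa(libraries, threshold=1.0):
    """Taxa reaching `threshold` percent in at least one library."""
    dominant = set()
    for counts in libraries.values():
        for taxon, pct in relative_abundance(counts).items():
            if pct >= threshold:
                dominant.add(taxon)
    return sorted(dominant)


# Hypothetical read counts per phylum for two libraries of 1000 reads each
libraries = {
    "water_1": {"Proteobacteria": 800, "Firmicutes": 195, "Nitrospirae": 5},
    "sediment_1": {"Proteobacteria": 500, "Acidobacteria": 430,
                   "Nitrospirae": 60, "Firmicutes": 4, "Rare_phylum": 6},
}
print(dominant_taxa(libraries))
```

A taxon that is rare in one habitat is still retained if it passes the threshold in another, which is why a phylum can be listed as dominant overall while falling below 1% in some libraries.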

Fig. 3 Taxonomic distribution of bacterial communities from sediment and water samples collected from Lancaster Dam. The circos graph shows only taxa detected at average relative abundance > 1% in at least one sequence library at phylum level

At class level, a majority of the bacterial community was distributed across 12 major classes: three classes of Proteobacteria (Alphaproteobacteria, Betaproteobacteria, and Gammaproteobacteria), Acidimicrobiia, Acidobacteria, Actinobacteria, Bacilli, Clostridia, ca. JG37-AG-4, Nitrospira, Saccharibacteria, and ca. TM6_(Dependentiae). The relative abundance of each class is variable across sediment and water samples, as shown in Fig. S3. Both the water and sediment bacterial communities are largely composed of Alphaproteobacteria, with a higher abundance in water (69.63–78.07%) than in sediments (24.04–39.36%). Other notable differences between sediment and water samples are observed in the relative abundance of Bacilli, Clostridia, Betaproteobacteria, and ca. TM6_(Dependentiae), which are all highly enriched in water samples and almost non-existent in sediments. Furthermore, Acidimicrobiia, Actinobacteria, Saccharibacteria, Acidobacteria, Nitrospira, Gammaproteobacteria, and ca. JG37-AG-4 are significantly more enriched in sediment than in water bacterial communities.

More than 90% of the bacterial sequence reads were confidently classified to genus level and were distributed in about 15 major genera. A heatmap showing the variation in the abundance of the top genera amongst the 7 sequence libraries is shown in Fig. 4. The most prominent genera are Acidiphilium, Acidobacterium, Acidothermus, Acidibacter, Bacillus, unclassified ca. JG37-AG-4, Metallibacterium, Mycobacterium, Legionella, Leptospirillum, Propionibacterium, Saccharibacteria, and unclassified ca. TM6_(Dependentiae). In both the water and sediment communities, Acidiphilium spp. is dominant, with the relative abundance ranging between 68 and 76% in water and between 7 and 19% in sediments. Again, there are remarkable dissimilarities in the water and sediment bacterial community structure. Water bacterial communities are particularly enriched with sequences related to Mycobacterium (2.44–5.24%), ca. TM6_(Dependentiae) (1.48–4.48%), Legionella (1.21–2.03%), Propionibacterium (1.26–1.23%), and Acidibacillus (1.28–1.35%), which all occur in lower proportions in the sediments. On the other hand, sediment bacterial communities depict high abundances of sequences related to Saccharibacteria (7.8–21.27%), Metallibacterium (2.07–11.06%), Acidobacterium (3.73–8.99%), ca. JG37-AG-4 (2.19–8.67%), Acidibacter (1.01–4.60%), Leptospirillum (1.56–7.62%), and Acidothermus (1.95–2.46%). In addition, a substantial number of sequences with high similarity to unclassified Acidobacteriaceae (2.74–9.91%), Acetobacteraceae (2.02–2.86%), Acidimicrobiaceae (1.55–5.56%), as well as uncultured bacteria (1.45–7.98%) are notably enriched in the sediments.

Fig. 4 Heat map showing the distribution of the most abundant genera in water and sediment bacterial communities. Relative percentage values for the bacterial genera are indicated by color intensity

Effects of physicochemical parameters on the bacterial community structure

Canonical correspondence analysis (CCA) was performed to ascertain possible correlations between physicochemical variables and the community structure as well as individual dominant taxa (genera). In the CCA plot for the water bacterial communities (Fig. 5a), the first axis explained 77.73% of the bacterial community–environment relationship. This axis showed a positive correlation with DOC, Mn, Mg, Cu, Cr, Na, Al, and Fe and negative correlations with K, Zn, Ag, Ca, pH, sulfates, and nitrates. The second axis explained 22.27% of the variability and correlated positively with K, Zn, Ag, Ca, pH, sulfates, and nitrates, and negatively with DOC, Mn, Mg, Cu, Cr, Na, Al, and Fe. In accordance with the PCoA analysis, the three water samples clustered together at the centroid of the triplot, implying that they are all equally affected by the environmental parameters. With reference to correlation networks, the abundant genera Acidiphilium, Legionella, and Acidibacter depict a co-occurrence, and all show a positive correlation with DOC, sulfates, Mn, Mg, Cu, Cr, Na, Fe, Al, and Ca. Mycobacterium and ca. TM6_(Dependentiae) also co-occur and show positive correlations with nitrates, pH, Ag, Zn, and K. Notably, the taxa that occur in low abundance (Saccharibacteria, Leptospirillum, ca. JG37_AG_4, Metallibacterium, Acidobacterium, and Acidothermus) are scattered on the margin of the plot, showing no co-occurrence with other taxa or correlations with any of the physicochemical parameters.

Fig. 5 CCA biplot showing correlations between (a) water and (b) sediment bacterial community structure and physicochemical parameters. The black circles indicate the microbial communities. Red dots indicate individual genera. Green lines represent physicochemical parameters associated with bacterial community structures. The position of each genus in relation to that of a physicochemical parameter is a measure of the correlation between the two. Sac, Saccharibacteria; Lept, Leptospirillum; Acidib, Acidibacterium; Leg, Legionella; Acido, Acidithermus; Acidip, Acidiphilium; Myco, Mycobacterium; Meta, Metallibacterium; Acidib, Acidibacter

In the CCA plot for the sediment bacterial communities (Fig. 5b), the first axis explained 59.83% of the observed dissimilarity, depicting positive correlations with Al, Ca, Cu, Fe, Zn, Mg, Mn, DOC, sulfates, and nitrates. The second axis explained 22.08% of the variability and is positively correlated with Ca, Cu, Zn, Mg, K, Al, Cr, Na, DOC, sulfates, and nitrates. In the sediments, Ca appears to be the strongest determinant of microbial community structure. Nonetheless, the four sediment samples were grouped at the center of the biplot, implying that there is no dissimilarity in how each of them is shaped by the environmental parameters. The genus Acidimicrobium and ca. TM6_(Dependentiae) co-occur and are positively correlated with most of the environmental parameters: DOC, nitrates, sulfates, and the metals Ca, Cu, Zn, Fe, Mg, K, Mn, and Al. The second cluster of Acidothermus, Saccharibacteria, Mycobacterium, ca. JG37_AG_4, Leptospirillum, and Metallibacterium depicts a co-occurrence and positive correlations with pH, Ag, and Cr. Notably, the two genera Acidiphilium and Acidibacter show some co-occurrence and, together with Legionella, do not depict any clear relationships with any of the physicochemical parameters tested.

Predictive functional profiling of bacterial communities

The metabolic function of bacterial communities in water and sediments was predicted using PICRUSt and a heatmap showing the variation in the expression of the predominant functions was generated (Fig. 6). Across all 7 metagenomes, the recovered functions can be classified broadly into two groups: those that are associated with the normal general functioning of prokaryotic cells and those that are associated with mechanisms related to survival and resistance to the harsh environmental conditions in AMD. Putative functions associated with normal functioning include general function prediction only, metabolism (amino acid, butanoate, pyruvate, pyrimidine, chlorophyll, purine, and methane), biosynthesis (chromosome, ribosomes, peptidases, RNA, and bacterial motility proteins), carbon fixation, and glycolysis.

Fig. 6 Heat map showing the differential expression of functions in water and sediment bacterial communities of Lancaster Dam recovered from PICRUSt analysis

Recovered functions that may be deemed crucial for survival in the harsh environmental conditions in AMD include DNA recombination, replication and repair, transporters, chaperones and folding catalysts, as well as the bacterial secretion system. Among the recovered functions, the three most abundant are transporters; DNA recombination, replication, and repair; and ABC transporters. All three functions are related to mechanisms of survival in the harsh conditions. Indeed, there is mounting evidence that AMD systems are populated by specialist microorganisms endowed with novel cellular mechanisms adapted to the harsh conditions (Johnson and Hallberg 2003). Accordingly, there are reports on the overexpression of genes related to adaptation compared to those related to normal general metabolism in AMD systems (Ram et al. 2005).

Discussion

The physicochemical profile of water samples collected from Lancaster Dam is characteristic of AMD, with high acidity, high electrical conductivity, chemical oxygen demand, and sulfate concentrations. Sediment samples also showed high acidity and sulfate concentrations, suggesting that prolonged exposure to AMD results in the deposition of large amounts of sulfur in sediments. Nonetheless, pH values in sediments were slightly higher than in the water. The direct effect of low pH in mine waters is the increase in the solubility and, subsequently, the bioavailability of metals (Akcil and Koldas 2006). Indeed, metal species such as Al, Ag, Ca, Cd, Cr, Cu, Fe, K, Mg, Mn, and Zn were detected in both the water and sediments at levels comparable to those previously reported in other AMD systems (Kamika et al. 2016). At this particular site, the water and sediments showed different profiles of metal concentrations. For instance, aluminium was the most abundant metal ion in water, whereas iron was the most abundant in sediments. Furthermore, sodium was detected only in the water and not in sediment samples. Often, sediments act as reservoirs for metal deposition and may be characterized by a high metal content. However, transition of metals between the water column and sediments is inevitable with changes in several physicochemical variables such as pH, salinity, and redox potential (Balintova et al. 2012).

The low pH, coupled with high metal concentrations and limited availability of organic carbon in mine wastes, presents an extreme environment that is hostile to microbial growth (Dopson et al. 2003; Slonczewski et al. 2009). However, through culture-based and culture-independent studies, it has been established that AMD is populated by a specialist microbial community (primarily unicellular prokaryotes) despite the high content of pollutants (Johnson et al. 2005; Kuang et al. 2013; Falagán et al. 2014). In this study, the bacterial community structure of AMD was measured through high-throughput sequencing of 16S rRNA gene amplicons obtained from total genomic DNA extracted from water and sediments. The results are indicative of low-diversity sediment and water bacterial communities, which are common in these extreme environments. The recovered bacterial OTUs that are shared between the two microhabitats correspond to lineages previously found in other highly acidic and metal-rich environments (Kuang et al. 2013; Yang et al. 2014; Méndez-García et al. 2015; Gupta et al. 2017; Sajjad et al. 2018). These were affiliated with 8 major phyla: Acidobacteria, Actinobacteria, Chloroflexi, Firmicutes, Nitrospirae, Proteobacteria, Saccharibacteria, and ca. TM6_(Dependentiae). The phylum Proteobacteria was detected as the most abundant in both water and sediment bacterial communities. Members of this phylum are ubiquitous and have frequently been observed in extreme environments including those with extreme acidity such as mine water and tailings (Edwards et al. 2000; Bruneel et al. 2006; He et al. 2007; Kuang et al. 2013; Kamika and Momba 2014; Keshri et al. 2015; Kamika et al. 2016). Members of this phylum are endowed with physiological properties that enable them to adapt to extreme environmental conditions including low pH, high metal content, and low nutrient availability (Yergeau et al. 2012).

At genus level, the recovered taxa include those that are renowned as AMD inhabitants and directly or indirectly linked to the carbon, iron, and sulfur cycles. Nonetheless, other genera that are rarely reported in AMD systems were also recovered, suggestive of a tentatively distinctive or novel AMD bacterial community. Acidithiobacillus and Leptospirillum have been identified as key players in the generation of AMD (Chen et al. 2016) and they have been detected globally in AMD sites (Auld et al. 2013, 2017). However, in this study, Acidithiobacillus was not detected at all, whereas Leptospirillum was detected only in the sediments as one of the dominant genera. Both water and sediment bacterial communities showed a high prevalence of sequences related to Acidiphilium spp. Members of this genus are primarily heterotrophs that can also reduce iron (Baker and Banfield 2003). They are widely distributed in acidic metal-rich environments and are important in the maintenance of AMD by enhancing the growth of chemoautotrophs through their metabolism of low-molecular-weight organic compounds that are toxic to chemoautotrophs (Chen et al. 2016). Acidiphilium spp. show enhanced growth under conditions of elevated aluminium concentrations (Wakao et al. 2002). However, heterotrophic acidophiles are rarely reported as the dominant members in AMD bacterial communities (Johnson and Hallberg 2007). The occurrence of Acidiphilium as dominant members of a bacterial community was first reported in an AMD dam in China (Hao et al. 2010). The authors attributed the dominance of Acidiphilium and limited abundance of chemoautotrophs to low ferrous iron concentrations and high DOC levels. Under conditions of low ferrous iron concentration, the growth of iron-oxidizing chemolithotrophs may be restricted (González-Toril et al. 2003; Druschel et al. 2004; Rzhepishevska et al. 2005).
Furthermore, surface AMD sites often receive excess DOC from soils and plant material washed off from the surroundings. This may promote the growth of heterotrophic bacteria. However, these organic compounds may undergo photolysis or acid hydrolysis to form low-molecular-weight organic acids, which are toxic to chemoautotrophs (Tittel et al. 2005). Although ferrous iron concentrations were not measured in this study, the paucity of chemoautotrophic acidophiles could be attributed to the high DOC concentrations, as the Lancaster Dam does receive external carbon loads.

Other heterotrophic genera that were recovered include Acidibacter (Gammaproteobacteria), Acidobacterium (Acidobacteria), Metallibacterium (Gammaproteobacteria), Acidothermus (Actinobacteria), and the newly described genus Acidibacillus (Firmicutes). Acidibacter and Acidobacterium spp. are well distributed in acidic environments as minor members of bacterial communities. The recovery of sequences with close similarity to Metallibacterium spp. is in line with previous studies in which members of this genus were detected as dominant members of the community in several acidic metal-rich environments (acid streamers, sediments, water, biofilms) (Kay et al. 2013, 2014; Brantner and Senko 2014; Brantner et al. 2014; González-Toril et al. 2014; Sun et al. 2015, 2016; Falteisek et al. 2016; Gupta et al. 2017). This genus is represented by a sole species, M. scheffleri, which was first isolated from acidic biofilms and characterized as being able to oxidize Fe(II) as well as reduce Fe(III) (Ziegler et al. 2013). The enrichment of this genus in the sediments is not surprising, as it has previously been reported that surface-attached growth appears to select for M. scheffleri and related organisms, which is reflected by their abundance in biofilms. The occurrence of Acidothermus in the sediments is puzzling. The only cultured organism from this genus, A. cellulolyticus, first isolated from acidic hot springs at Yellowstone, is thermophilic, yet water temperatures at Lancaster Dam were around 20 °C throughout the sampling period (Barabote et al. 2009). This could suggest the existence of mesophilic members within this genus, or the sequences may have arisen from sequencing errors (Huse et al. 2010).

Although AMD microbes are known to remain recalcitrant to in vitro cultivation (Aguilera and Johnson 2016), techniques using high-throughput NGS platforms have the added advantage of being able to capture both the cultivable and non-cultivable members within a community (De Mandal et al. 2015). In this study, sequence reads affiliated with three uncultivated phyla often referred to as the “dark matter” (Saccharibacteria, ca. TM6_(Dependentiae), and ca. JG37-AG-4) were detected in high abundance. Sequence reads with high similarity to the phylum Saccharibacteria (previously candidate division TM7) were recovered in high abundance in sediment samples. The phylum is currently described from 16S rRNA gene sequences and genome data only. It represents a highly ubiquitous group of bacteria distributed in soils, seawater, sludge, animal and human sources, as well as clinical environments (Ferrari et al. 2014). Most importantly, some members of this phylum have been recovered in acidic metal-rich environments such as acidic soils (Ferrari et al. 2014; Winsley et al. 2014), acid water (García-Moyano et al. 2015), as well as oxic and suboxic biofilms (Méndez-García et al. 2014). Due to a lack of cultured isolates and the small number of 16S rRNA gene sequences in reference databases, the metabolic traits and general biology of Saccharibacteria, and hence their functions in AMD, remain elusive (Méndez-García et al. 2014; García-Moyano et al. 2015; Mesa et al. 2017). Moreover, sequences related to the uncultured phylum ca. TM6_(Dependentiae) were detected (more enriched in water samples than in sediments). 16S rRNA sequence signatures from this phylum have been found in diverse habitats including sulfur hot springs, arsenic-rich sediments, and AMD-contaminated creek sediments (Youssef et al. 2012; Escudero et al. 2013; Sun et al. 2016).
Initially, sequenced genomes of members within this phylum were suggestive of a facultative anaerobic metabolism characteristic of free-living organisms (McLean et al. 2013). However, recently constructed genomes are characterized by a small genome size, AT bias, and a lack of biosynthetic pathways, pointing towards a symbiotic lifestyle (Yeoh et al. 2016). Another yet-uncultured taxon that was identified as dominant in the bacterial community is ca. JG37-AG-4 from Chloroflexi, which has previously been detected in uranium waste piles and the Río Tinto sediments (Selenska-Pobell et al. 2001; García-Moyano et al. 2012).

The results of this study also portray a microbial community that differs from those of other AMD sites, as sequences related to Mycobacterium and Legionella spp. were recovered with relative abundances greater than 1%. Both genera occur in higher abundance in the water. The occurrence of Mycobacterium has been reported in aquatic and soil environments, often as minor members (Falkinham 2002). Reports of the occurrence of Mycobacterium spp. in acidic environments remain limited thus far. Their distribution has been reported in moderately acidic forest soils (pH 4) (Iivanainen et al. 1997) and recently in an endolithic bacterial community at Yellowstone geothermal sites as the dominant taxa (Walker et al. 2005). Their role in acidic environments remains poorly understood, although they are postulated to be heterotrophs (Walker et al. 2005). Sequence reads classified as Legionella spp. were recovered from water samples, albeit in low abundance (≤ 2%). Members of the genus Legionella are neutrophilic organisms responsible for Legionnaires' disease in humans. Although they have recently been detected in AMD systems, their occurrence still remains unusual (Hao et al. 2012; Auld et al. 2013, 2017). Previous workers have opined that Legionella occurs in AMD as endosymbionts of AMD amoeboid protists (Auld et al. 2017). The same can be assumed in this study, although the eukaryotic community was not profiled.

Variation in the bacterial community in water and sediment samples

At all taxonomic levels, the sediment and water-column bacterial communities shared most of the recovered OTUs. This implies that most of the taxa inhabiting the water column can also be tracked in the sediments. This observation is in agreement with reports in which intimate associations among bacterial populations residing in different microhabitats (water, sediments, and biofilms) at Río Tinto were noted (García-Moyano et al. 2012; Falteisek et al. 2016). However, the relative abundance of the shared bacterial populations varies between the sediment and water communities. Sediment bacterial communities show a significantly higher species richness than water bacterial communities. Higher bacterial diversity in sediments might be attributed to the higher pH and hence less extreme environmental conditions (Sánchez-Andrea et al. 2011; García-Moyano et al. 2012; Méndez-García et al. 2014). A substantial proportion (up to 7%) of sequences from sediments were affiliated with yet-uncultivated bacterial clades, suggesting possible untapped bacterial diversity in the sediments. Studies on the characterization and profiling of microbial communities from sediments in AMD environments remain thus far limited compared to those on the water column and biofilms (Sánchez-Andrea et al. 2011; Sanz et al. 2011). However, it is clear that these microhabitats may serve as niches for enormous bacterial diversity and hence need to be considered in trying to understand the integrated microbial ecology of these peculiar extreme systems.
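
The comparison above rests on three quantities: the OTUs shared between the two microhabitats, the observed richness of each, and a diversity index. The following is a minimal Python sketch of how these could be computed from OTU read counts. The OTU identifiers and counts are invented for illustration and are not data from this study, and the Shannon index is used only as one common choice of diversity measure; the function names are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for n in counts.values() if n > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with reads."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total) for n in counts.values() if n > 0
    )


def shared_otus(sample_a, sample_b):
    """OTUs detected (count > 0) in both samples."""
    return {
        otu for otu, n in sample_a.items() if n > 0 and sample_b.get(otu, 0) > 0
    }


# Hypothetical read counts per OTU (illustrative values only).
water = {"OTU_1": 500, "OTU_2": 300, "OTU_3": 200, "OTU_4": 0}
sediment = {"OTU_1": 250, "OTU_2": 250, "OTU_3": 250, "OTU_4": 250}

if __name__ == "__main__":
    print("Shared OTUs:", sorted(shared_otus(water, sediment)))
    print("Richness (water, sediment):",
          observed_richness(water), observed_richness(sediment))
    print("Shannon (water, sediment):",
          round(shannon_index(water), 3), round(shannon_index(sediment), 3))
```

In this toy example every OTU found in the water is also found in the sediment, while the sediment carries one additional OTU and a more even distribution, so both its richness and its Shannon index are higher.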

Physicochemical parameters affect bacterial diversity

Free-living organisms often show non-random distribution patterns across diverse habitats at various spatial and temporal scales (Liu et al. 2014). However, environmental parameters have also been shown to drive microbial community structure and diversity. Whereas some studies have identified individual parameters such as pH, TOC, or metal concentrations as key parameters structuring microbial diversity in AMD (Hao et al. 2010, 2017; Streten-Joyce et al. 2013; Yang et al. 2014; Sun et al. 2016; Auld et al. 2017), in this study the tested physicochemical parameters showed an almost even effect on both the water and sediment bacterial communities. Notably, correlations of physicochemical parameters with individual genera were evident. In water bacterial communities, DOC, Mn, Cu, Cr, Al, Fe, and Ca appear to correlate with most members of the community, compared with DOC, Ca, Cu, Fe, Zn, Mg, K, Mn, Al, sulfates, and nitrates in sediment bacterial communities. A clear demarcation was evident between taxa that are positively correlated with low pH and high metal concentrations (strict acidophiles) and those that prefer growth under moderate acidity and metal concentrations (moderate acidophiles). The different taxa within the sediment and water bacterial communities show some correlation networks that may be indicative of biotic or species–species interactions within the microbial community.
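
To illustrate how a correlation between a physicochemical parameter and the abundance of a genus might be quantified, the sketch below computes a Spearman rank correlation in plain Python. The choice of Spearman's coefficient is an assumption made for illustration, since the correlation method is not stated in this section, and the pH values and relative abundances are invented rather than taken from this study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical samples: pH and relative abundance (%) of two genera.
ph = [2.1, 2.4, 2.8, 3.1, 3.5]
strict_acidophile = [30, 25, 18, 12, 5]    # declines as pH rises
moderate_acidophile = [2, 4, 9, 11, 20]    # increases as pH rises

if __name__ == "__main__":
    print("strict vs pH:", spearman(ph, strict_acidophile))
    print("moderate vs pH:", spearman(ph, moderate_acidophile))
```

A genus whose abundance falls monotonically as pH rises yields a coefficient near -1, which corresponds to the strict acidophile pattern described above, whereas a genus favoured by milder conditions yields a coefficient near +1. With only a handful of samples such coefficients should be read with caution.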

Predictive functional profile

Metagenomic sequencing of acidophilic microbial communities has revealed the predominance of gene clades involved in iron and sulfur oxidation, iron and sulfur reduction, carbon fixation, nitrogen metabolism, and resistance to low pH and high metal concentrations (Huang et al. 2016). Similarly, in this study, PICRUSt analysis recovered functions related to mechanisms essential to the normal function of prokaryotic cells. Furthermore, functions associated with adaptation and survival in the harsh environment, such as maintenance of a neutral intracellular pH, metal resistance, and cellular protection, were recovered in high abundance.

An abundance of putative functions associated with transport (general transport, ABC transporters, and ion-coupled transporters) was evident in both the sediment and water metagenomes. In acidophiles, the transport system has been identified as crucial for the maintenance of a neutral intracellular pH (Dopson et al. 2003). In particular, transporters are used to pump excess protons out of the cytoplasm to maintain a neutral pH that is conducive to enzyme activity (Michels and Bakker 1985). Accordingly, genes encoding putative proton efflux systems have been recovered from the sequenced genomes of acidophiles such as Ferroplasma type II and Leptospirillum group II (Tyson et al. 2004). In addition, the transport system plays a role in pumping excess metal ions out of cells, thus minimizing their toxicity (Dopson et al. 2014). For example, Acidithiobacillus ferrooxidans survives high copper concentrations by forming copper-phosphate complexes intracellularly and pumping them out of the cell using efflux pump systems (Alvarez and Jerez 2004). Furthermore, the water and sediment metagenomes depict an abundance of putative functions associated with bacterial secretion systems. The bacterial secretion system is perceived to be necessary for the secretion of protective molecules that enable bacteria to withstand the harsh conditions in AMD. For example, Leptospirillum group II UBA type strains secrete a novel lyso phosphatidylethanolamine (PE) lipid (Fischer et al. 2012). This compound has a high affinity for iron and calcium ions and apparently minimizes excessive cellular uptake of metal ions (Fischer et al. 2012). Similarly, taurine and hydroxyectoine detected through a metabolomics study of microbial biofilm communities from the Richmond Mine are linked to cellular osmotic stress protection (Mosier et al. 2013).
Accordingly, the recovery of functions associated with bacterial secretion systems in the water and sediment metagenomes may be linked to survival strategies in the harsh environment. DNA repair and recombination as well as chaperones and folding catalysts were other putative functions recovered in high abundance in both the sediment and water metagenomes. These are also associated with mechanisms for cellular protection. Since the extreme conditions in AMD systems cause considerable damage to biomolecules, AMD microorganisms have evolved an efficient repair system and often show overexpression of genes involved in DNA repair and recombination as well as chaperones and folding catalysts for protein repair. Accordingly, genomes of several extreme acidophiles such as Picrophilus torridus show a high prevalence of DNA and protein repair genes (Crossman et al. 2004). Furthermore, chaperones involved in protein refolding were recovered in high prevalence in the metaproteome of Richmond Mine biofilms (Ram et al. 2005).

Other studies have reported the recovery of functions with significant bioremediation and biotechnological applications, such as dioxin, toluene, and polyaromatics degradation (Sibanda et al. 2019). However, such functions were not recovered in this study. This could mean either that these functions are lacking or that they were missed, highlighting the inadequacy of PICRUSt in environmental studies. PICRUSt has been shown to be more effective in predicting the metagenomes of environments for which extensive gene collections are available in reference gene databases (Langille et al. 2013). For peculiar environments such as extreme environments, gene predictions may be sparse and less accurate (Mesa et al. 2017).

Conclusions

The present study reports on the physicochemical properties of two microhabitats within a highly acidic, metal-rich, and oligotrophic environment. Furthermore, the bacterial community structure was profiled using a high-throughput Illumina sequencing approach. The bacterial communities in the sediments and the water column are characterized by low diversity, which is often encountered in extremely acidic environments. The phylum Proteobacteria occurred in high abundance in both the water and sediment bacterial communities. Although the bacterial communities were characterized by common AMD inhabitants such as Acidiphilium, other taxa that are less frequently encountered in AMD sites were detected. The estimated community diversity and species richness indicate that the sediments harbour higher bacterial diversity and species richness than the water column. Judging from the abundance of sequence reads associated with uncultured bacterial clades, this often-overlooked microhabitat may be a source of novel, yet untapped bacterial diversity. A suite of physicochemical parameters plays a role in structuring the bacterial community and taxon co-occurrence networks in both microhabitats.