Introduction

Freshwater ecosystems, which account for circa 2.5% of the total volume of water on our planet, differ widely in composition and accordingly harbor distinct microbial communities, as revealed by cultivation-based and molecular approaches and, more recently, by metagenomic approaches [10]. Major discoveries have been made in recent years on peculiar microbial life adapted to the water column of many lakes; however, little is known so far, and even less is understood, about the microbial ecology and gene inventory of anoxic freshwater lake sediments.

Anoxic lake sediments from around the world, including those from saline and alkaline soda lakes [2, 72], hypersaline lakes [73], athalassohaline lakes [28], shallow suboxic-to-anoxic freshwater ponds [5], a sulfurous karstic lake [44], eutrophic lakes, including shallow ones [17, 48, 74, 76], sulfur-rich minerotrophic peatlands [31], warm monomictic and meso-eutrophic lakes [68, 69], freshwater tidal marshes [82], meromictic lakes [39, 42], as well as metal mining-impacted lakes [8, 13, 19, 59], have been studied, but most of this work has centered on their phylogeny. Most of the communities were dominated by (un)culturable methane-producing archaea (Methanomicrobiales, Methanobacteriaceae, and Methanosarcinales) and by Crenarchaeota from uncultivable groups such as the Miscellaneous Crenarchaeotic Group, Marine Group I, Marine Benthic Groups B and C, Freshwater group, Group I3, and Rice Clusters IV and VI. Crenarchaeota represented the majority of the microbial population in a mercury-contaminated freshwater stream [59] and in sulfurous karstic lake sediments [44]. In addition, delta- and epsilonproteobacterial sulfate- and, in some cases, iron(III)-reducers [21, 22] represent the main metabolic bacterial components of the communities.

So far, metagenomic studies of anoxic sediments, and of mining-impacted lakes in particular, are rare, and only a few recent studies have identified abundant key prokaryotes and linked them with essential metabolic processes and environmental adaptations [13]. The objective of our study was to investigate the prokaryotic community inhabiting the anoxic sediment of the sub-saline shallow lake Laguna de Carrizo, in Central Spain, and to highlight the metabolic particularities of this aquatic environment, which was previously operated as a gypsum mine. Carrizo Lake is characterized by an unusual prevalence of Ca²⁺, Mg²⁺, and SO₄²⁻, together with a low concentration of biogenic monovalent cations (see details in “Methods” section).

Methods

Study Site, Sampling and DNA Extraction

Laguna de Carrizo, located in Madrid (+40° 18′ 30.99″, −3° 39′ 34.70″; area approximately 12 km²; maximum depth 2.4 m; altitude 521 m; Figure S1 in the Electronic Supplementary Material), represents a unique ecosystem in the Central Iberian Peninsula. The Carrizo area had been used since the seventeenth century to mine gypsum (CaSO₄·2H₂O) to supply a wide range of industries. In 1977, when the groundwater level was reached, the mine was abandoned and the upwelling of subterranean water filled the excavated area. In 1990, the area was declared an abandoned industrial site whose restoration is of environmental interest, and since 2004 it has belonged to the Drainage and Wetland Regional Catalogue of Madrid (Spain). The water of Laguna de Carrizo has a conductivity of 3,160–4,910 μS/cm (sub-saline water), a pH of 7.70, and a transparency (light penetration) of 1.8 m, and contains circa 15 g/L of salts. The chemical and mineralogical analyses, done according to Standard Methods (APHA 1998 and ref. [70]), revealed that the sediment contained Ca²⁺ (2.43–2.63 g/L), Mg²⁺ (0.40–1.49 g/L), Na⁺ (0.09–0.19 g/L), K⁺ (0.004–0.04 g/L), NH₄⁺ (989–1,249 μg/L), and Fe (0.07–0.11 mg/L). The anions were SO₄²⁻ (6.96–10.89 g/L), S₂O₃²⁻ (3.7–5.0 mg/L), polysulfide (6.5–10.5 μg/L), SO₃²⁻ (2.5–5.7 μg/L), PO₄³⁻ (3.2–3.5 μg/L), Cl⁻ (0.1–0.27 g/L), HCO₃⁻ (0.26–0.42 g/L), and NO₃⁻ (54–744 μg/L). CO₃²⁻, NO₂⁻, methane (CH₄; measured by gas chromatography for analysis of gaseous hydrocarbons), and heavy metals (as measured by inductively coupled plasma analysis) were not detectable. Silicate was also found at a concentration ranging from 44 to 100 mg/L. Organic compound analyses indicated that the sediment contained organosulfonates such as taurine (2-aminoethanesulfonate; 0.14 μg/kg) and cysteate (2-amino-3-sulfopropionate; 0.68 μg/kg).

On February 15, 2007, superficial (0 to 20 cm depth) sediment samples were collected at a water depth of 2.4 m using a Petite Ponar® clamshell-style dredge. The overlying water was O₂-free, as determined with the Winkler method. The sample was stored at −20°C until DNA was extracted. DNA was isolated directly from cells previously separated from the environmental matrix. Briefly, suspensions of microbial consortia were obtained by density gradient centrifugation with Nycodenz (Axis-Shield PoC, Norway) as described previously [20]. The resulting cell pellet was subjected to metagenomic DNA extraction using the commercial kit GNOME®DNA (QBIOgene). DNA was visualized by 0.7% (w/v) agarose gel electrophoresis and quantified both spectrophotometrically and with PicoGreen (Molecular Probes, Carlsbad, CA).

Chemicals and Enzymes

Chemicals, biochemicals, and solvents were purchased from Sigma-Fluka-Aldrich Co. (St. Louis, MO) and were of pro analysi quality. Oligonucleotides for DNA amplification and sequencing were synthesized by Sigma Genosys Ltd. (Pampisford, Cambs, UK). Restriction and modifying enzymes were from New England Biolabs (Beverly, MA). Ni-NTA His⋅Bind chromatographic medium was from QIAGEN (Hilden, Germany). Escherichia coli strains GigaSingles for cloning and BL21(DE3) for expression using the pET-41 Ek/LIC vector (Novagen, Darmstadt, Germany) were cultured and maintained according to the recommendations of the suppliers. The genes for all recombinant enzymes used in the present study were amplified by polymerase chain reaction (PCR) with custom oligonucleotide primers, cloned, and expressed; the enzymes were then purified and their kinetic parameters determined as described in SI Methods in the Electronic Supplementary Material.

Construction of 16S RNA Gene Clone Libraries and Clone Sequencing

PCR amplification was performed with a serial dilution of DNA template. Bacterial 16S RNA genes were amplified using the bacterial-specific primers F27 (5′-AGAGTTTGATCMTGGCTCAG-3′) and R1492 (5′-CGGYTACCTTGTTACGACTT-3′). To analyze in more depth the Planctomycetes, we also used Pla f949 (5′-GCGMARAACCTTATCC-3′) and Pla r1408 (5′-CCNCNCTTTSGTGGCT-3′) that are Planctomycetes-specific primers. Archaeal 16S RNA genes were amplified using the archaeal-specific primers Ar20F (TTCCGGTTGATCCYGCCRG) and Ar958R (YCCGGGGTTGAMTCCAATT). Amplification was done in a 20 μl reaction volume with recombinant Taq DNA Polymerase (Invitrogen, Germany) and original reagents, according to the basic PCR protocol, with the annealing temperature of 45°C and 50°C (bacterial and archaeal rRNA, respectively), for 30 cycles. PCR amplicons were purified by electrophoresis in 0.8% (w/v) agarose gels, followed by isolation from excised bands using a QIAEX II Gel Extraction Kit (Qiagen, Germany). The purified PCR products were ligated into plasmid vector pGEM (pGEM Cloning kit, Invitrogen, Germany) with subsequent transformation into electrocompetent cells of E. coli (TOP 10; Invitrogen, Germany). Clones of bacterial and archaeal rRNA were sequenced using primers M13 forward (5′-GACGTTGTAAAACGACGGCCAG-3′) and M13 reverse (5′-GAGGAAACAGCTATGACCATG-3′), according to the protocol for BigDye Terminator v3.1 Cycle Sequencing Kit from Applied Biosystems (USA). The sequencing reactions were performed using an AB 3730 apparatus from Applied Biosystems (USA).
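The primers listed above contain IUPAC ambiguity codes (M, Y, R, N, S), so each one stands for a small pool of unambiguous oligonucleotides. The following minimal sketch expands a degenerate primer and counts its degeneracy; the function names and the dictionary keys are illustrative choices, not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3'); keys are labels only.
PRIMERS = {
    "F27": "AGAGTTTGATCMTGGCTCAG",
    "R1492": "CGGYTACCTTGTTACGACTT",
    "Pla_f949": "GCGMARAACCTTATCC",
    "Pla_r1408": "CCNCNCTTTSGTGGCT",
    "Ar20F": "TTCCGGTTGATCCYGCCRG",
    "Ar958R": "YCCGGGGTTGAMTCCAATT",
}


def degeneracy(primer):
    """Number of unambiguous sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_primer(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, degeneracy(seq))
```

Under these definitions the two Planctomycetes-specific primers differ markedly in degeneracy, because Pla r1408 carries two fully ambiguous positions (N).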

Phylogenetic Analyses of 16S RNA Gene Sequences

Phylogenetic inference was carried out using the ARB software package [46]. Sequences were automatically aligned using the SINA aligner against SILVA SSURef 100 [60] and LTP s100 [84] reference alignments and manually inspected to correct misplaced bases. To improve resolution at lower taxonomic levels, three independent reference phylogenetic trees were reconstructed, one comprising just members of the phylum Proteobacteria, a second with the remaining bacterial phyla, and a third just comprising the domain Archaea. The distinct datasets with almost complete SSU sequences were first sieved with a 30% conservational filter, and then the phylogeny was reconstructed with the neighbor-joining algorithm using the Jukes-Cantor correction. The resulting tree topologies were carefully checked against the currently accepted classification of Prokaryotes (LPSN, http://www.bacterio.cict.fr) to verify the absence of incongruent phylogenetic relationships.
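The Jukes-Cantor correction used for the neighbor-joining trees converts the observed fraction of differing sites p into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3). The sketch below is a generic illustration of that formula, not the ARB implementation; the function names and the choice to skip gapped columns are assumptions.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-' or '.') are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-." or b in "-.":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if p >= 0.75:
        raise ValueError("distance undefined for p >= 0.75 (saturation)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # A 3% observed difference (the 97% OTU threshold) is corrected only
    # slightly upward by the Jukes-Cantor model.
    print(jukes_cantor(0.03))
```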

For this study, the sequences were grouped in operational taxonomic units (OTUs), assuming that one OTU includes sequences with similarity values equal to or higher than 97%, using the software DOTUR [67]. Additionally, we considered an operational phylogenetic unit (OPU; [45]) to be represented by each single group of clones forming an independent clade in the tree without regarding any rigid similarity cut-off value. Both OTUs and OPUs were plotted to obtain rarefaction curves (Figure S2 in the Electronic Supplementary Material). Statistical analyses were performed using the PAST program.
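A rarefaction curve shows how many OTUs (or OPUs) would be expected in a random subsample of n clones. As a minimal sketch, the analytic expectation (Hurlbert's formula) can be computed from the number of clones in each group; this may differ from the procedure DOTUR uses, and the group sizes in the example are hypothetical.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of groups in a random subsample of n clones.

    counts: number of clones in each OTU or OPU.
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), with N = sum(counts).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the number of clones")
    denominator = comb(total, n)
    # comb(k, n) is 0 when n > k, so abundant groups contribute exactly 1.
    return sum(1 - comb(total - c, n) / denominator for c in counts)


if __name__ == "__main__":
    hypothetical_counts = [5, 3, 1, 1]  # illustrative group sizes only
    for n in range(0, sum(hypothetical_counts) + 1):
        print(n, round(rarefied_richness(hypothetical_counts, n), 3))
```

A curve that is still rising steeply at the full library size indicates that further sequencing would keep revealing new groups.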

Library Construction and Sequencing

Cosmid libraries using the pLAFR3 vector and E. coli DH5α were constructed according to Guazzaroni et al. [20]. The cosmid library consisted of 6,500 clones with an average insert size of 29.7 kb (ca. 193 Mbp) that were picked with a QPix2 colony picker (Genetix Co., UK), grown in 384-well microtiter plates containing LB with tetracycline (10.0 μg/ml) and 15% (v/v) glycerol, and stored at −80°C. Three hundred eighty-four cosmid clones were randomly selected and fully sequenced with a Roche GS FLX DNA sequencer (454 Life Sciences; Life Sequencing S.L, Valencia, Spain). Additionally, the full library (6,500 clones) was subjected to functional screens with α-naphthyl acetate (for detecting esterase activity) and o-dianisidine/H₂O₂ (for detecting peroxidase activity) following conditions described elsewhere [81]. Eight clones (two esterase and six peroxidase positives) were selected and sequenced as a pool with a Roche GS FLX DNA sequencer, and the resulting sequence was added to that of the 384 randomly selected clones.
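The library size quoted above follows directly from the clone count and the average insert size. The short sketch below reproduces that arithmetic; applying the same average insert size to the 392 sequenced clones is an assumption made only for illustration.

```python
def library_size_mbp(n_clones, mean_insert_kb):
    """Total cloned DNA in Mbp for n_clones inserts of mean_insert_kb each."""
    return n_clones * mean_insert_kb / 1000.0


# Values taken from the text: 6,500 clones, 29.7 kb average insert.
TOTAL_MBP = library_size_mbp(6500, 29.7)

# 384 random clones plus 8 screen-positive clones; assumes these share the
# library-wide average insert size.
SEQUENCED_MBP = library_size_mbp(384 + 8, 29.7)

if __name__ == "__main__":
    print(f"whole library: {TOTAL_MBP:.1f} Mbp")
    print(f"sequenced clones: {SEQUENCED_MBP:.1f} Mbp of insert DNA")
```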

Assembly was performed with Newbler (GS De Novo Assembler v.2.3; Roche). An estimated error rate (incorrect bases/total number of expected nucleotides) of 0.49%, as reported for GS20 reads [26], was considered. The error rates for GS20 reads were calculated using the Needleman–Wunsch algorithm [52].
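To illustrate how an error rate of this kind can be derived from a global alignment, the sketch below aligns a read to its expected sequence with a basic Needleman–Wunsch implementation and divides the number of mismatched or gapped columns by the expected length. The scoring values (match +1, mismatch −1, gap −1) and function names are assumptions for this example; they are not taken from the cited study.

```python
def needleman_wunsch(ref, read, match=1, mismatch=-1, gap=-1):
    """Globally align read to ref; return (aligned_ref, aligned_read)."""
    n, m = len(ref), len(read)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == read[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_ref, aligned_read = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ref[i - 1] == read[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_ref.append(ref[i - 1])
                aligned_read.append(read[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_read.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_read.append(read[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_read))


def error_rate(ref, read):
    """Incorrect bases (mismatches plus indels) per expected nucleotide."""
    aligned_ref, aligned_read = needleman_wunsch(ref, read)
    errors = sum(1 for a, b in zip(aligned_ref, aligned_read) if a != b)
    return errors / len(ref)


if __name__ == "__main__":
    print(error_rate("ACGTACGT", "ACGAACGT"))  # one substitution in 8 bases
```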

Cosmid Sequences Analysis

Gene prediction was carried out using the Metagene software [56]. Batch cluster analysis of metagenome sequences was performed with the GenDB v2.2 system [50] by collecting, for each predicted open reading frame, observations from similarity searches against sequence databases (nr, SwissProt [3], KEGG [30], COG [77], and genomesDB (see next paragraph)) and protein family databases (Pfam [15] and InterPro [25]). Predicted protein coding sequences were automatically annotated by the software MicHanThi [61]. The MicHanThi software predicts gene functions using a fuzzy logic-based approach based on similarity searches against the NCBI-nr (including Swiss-Prot) and InterPro databases. Furthermore, manual annotation and data mining were performed by using JCoast, version 1.6 [62].

To highlight the phylogenetic consistency, all proteins were searched by BLAST against genomesDB, and the phylogenetic distribution of best hits was analyzed using a cut-off of an expectation (E) value below 1e−05. The genomesDB database [62] is a composite database built from the proteome FASTA files obtained from the National Center for Biotechnology Information (NCBI) Reference Sequences database (RefSeq) for all fully sequenced bacterial and archaeal genomes. Each genome, chromosome, and protein in the file was tagged with a unique internal numerical identifier. In addition, taxonomic and contextual information was parsed from the NCBI Entrez Genome Project database. When available, further contextual data were included pertaining to genome size, guanine-cytosine content, Gram staining, shape, arrangement, endospore formation, motility, salinity, oxygen, habitat, and temperature range.

To identify potential metabolic pathways, genes were searched for similarity against the KEGG database. A match was counted if the similarity search resulted in an expectation E value below 1e−05. All occurring KO (KEGG Orthology) numbers were mapped against KEGG pathway functional hierarchies and statistically analyzed.
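The E value criterion used for both the genomesDB and KEGG searches can be sketched as a filter over BLAST tabular output. The example assumes the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E value, bit score); the identifiers in the sample data are hypothetical, and choosing the best hit by bit score is an assumption of this sketch.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # a hit counts only if its E value is below this


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for the best passing hit."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # not below the cut-off
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits for illustration only.
EXAMPLE = "\n".join([
    "cds_0001\tref_A\t62.1\t300\t110\t2\t1\t300\t5\t304\t3e-40\t150.0",
    "cds_0001\tref_B\t41.0\t280\t160\t4\t10\t290\t1\t281\t1e-10\t60.5",
    "cds_0002\tref_C\t30.2\t120\t80\t3\t1\t120\t40\t159\t2e-3\t35.0",
    "cds_0003\tref_D\t35.5\t150\t95\t1\t1\t150\t1\t150\t1e-5\t48.0",
])

if __name__ == "__main__":
    for query, hit in best_hits(EXAMPLE).items():
        print(query, hit)
```

In this sample, only the first coding sequence keeps a hit: the second has an E value above the cut-off and the third sits exactly at it, which does not satisfy "below 1e−05".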

MIENS Submission

Consistent contextual data acquisition for MIENS-compliant submission has been done using the Web-based software MetaBar [23].

Results and Discussion

General Features

On February 15, 2007, 250 g of surface sediment samples were collected from Laguna de Carrizo (details of the sampling site and chemical composition are given in “Methods”). The sample is about half as salty as sea water, with divalent cations (~2.5 g/L Ca²⁺ and ~0.7 g/L Mg²⁺) dominating over monovalent ones (90–190 mg/L Na⁺, 4–40 mg/L K⁺, and 0.9–1.2 mg/L NH₄⁺); among the anions, 95% was SO₄²⁻, 3.6% HCO₃⁻, and 1.3–2.3% Cl⁻, whereas nitrate and phosphate were present at trace concentrations. This ion composition contrasts with that observed in seawater-like environments (including common and solar saltern environments), where Na⁺ and Cl⁻ dominate (e.g., [45]), as well as in deep-sea environments, where Ca²⁺ is found at a ratio of 1:200 relative to the dominant ions [9]. Examples of Ca²⁺-rich environments are the calcium sodium chloride solution of the Soudan Mine [13] and the calcium carbonate (CaCO₃) profundal sediments of Lake Kinneret [69]; however, in Carrizo sediment, the concentration of SO₄²⁻ is more than ten times higher than in those environments and than in subterranean water bodies and aquifers [7]. The Dead Sea is an example of an environment where Mg²⁺ and Ca²⁺ dominate over Na⁺ and K⁺ (although their concentrations there are 7 to 32 times those found in Carrizo Lake [4]); however, the sulfate and chloride concentrations in Carrizo Lake are 28-fold higher and 852-fold lower, respectively, than in the Dead Sea. Therefore, the chemical analysis revealed the unique characteristics of the anoxic sediment of the sub-saline shallow Carrizo Lake.

Total DNA was extracted for PCR-based 16S rRNA gene diversity survey of the community structure of the Carrizo sediments. In addition, we generated a pLAFR3 library of about 6,500 clones with an average insert length of 29.7 kbp that was sequenced with a Roche GS FLX DNA sequencer; the approximate total archive of 193 Mb yielded about 92 Mb of raw DNA sequence.

Prokaryotic Community Structure of the Carrizo Lake’s Sediments

The Good’s coverage indexes [18] for the 106 bacterial OTUs [86] and 65 OPUs [45] identified in the 195 16S rRNA clones were 0.65 and 0.86, respectively (Table 1). Similar coverage results were obtained for the archaeal clone library (52 sequences), in which the Good’s coverage indexes were 0.79 for the 22 OTUs and 0.85 for the 18 OPUs. It is difficult to interpret what the sequence diversity of a given clade means in terms of populations of naturally occurring species [36], but OPUs may be considered equivalent to genera and, in some cases, families from the taxonomic point of view [84]. The indicated level of discrimination showed that the community was diverse (Fig. 1; Table 1), but also that a satisfactory coverage of the microbial diversity had been achieved in both libraries (see SI Text in the Electronic Supplementary Material for additional information). The phylogenetic reconstruction showed that the sequences were scattered throughout the whole phylogenetic tree, in accordance with the large estimated diversity (Fig. 1 and Figures S3, S4, and S5 in the Electronic Supplementary Material). It is noteworthy that about 22% of the total rRNA genes cloned (representing about 20% of the OPUs) showed similarities below 94.5% with any SSU sequences currently deposited in public repositories (either from cultured organisms or from environmental clones).
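Good’s coverage is computed as C = 1 − n₁/N, where n₁ is the number of groups (OTUs or OPUs) represented by a single clone and N is the total number of clones. The following sketch shows the calculation on hypothetical clone assignments; the second helper back-calculates the approximate number of singletons implied by a reported coverage value, which is only an estimate because the reported indexes are rounded.

```python
from collections import Counter


def goods_coverage(assignments):
    """Good's coverage C = 1 - n1/N for a list of per-clone group labels."""
    sizes = Counter(assignments)
    n_total = sum(sizes.values())
    if n_total == 0:
        raise ValueError("no clones supplied")
    singletons = sum(1 for size in sizes.values() if size == 1)
    return 1.0 - singletons / n_total


def implied_singletons(coverage, n_clones):
    """Approximate singleton count implied by a (rounded) coverage value."""
    return round((1.0 - coverage) * n_clones)


if __name__ == "__main__":
    # Hypothetical assignment of seven clones to four groups.
    clones = ["a", "a", "b", "c", "c", "c", "d"]
    print(round(goods_coverage(clones), 3))
    # Reported bacterial OTU coverage of 0.65 over 195 clones.
    print(implied_singletons(0.65, 195))
```

The gap between the OTU-based and OPU-based indexes reflects that grouping clones into broader clades leaves fewer groups represented by a single sequence.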

Table 1 Statistical indexes
Figure 1

Phylogenetic reconstruction of bacterial and archaeal 16S rRNA gene clones in the library derived from Carrizo sediment. Percentages of bacterial phylogenetic lineages detected in 16S rRNA gene clone library based on OPUs and the composition of the major groups (Delta-, Beta- and Gammaproteobacteria) and Euryarchaeota are shown in detail

Among the different bacterial phylotypes recovered, almost two thirds of the sequences were identified as belonging to the phylum Proteobacteria (Fig. 1, and Figure S3 and Table S1 in the Electronic Supplementary Material). The most abundant sequences in the library affiliated with the Beta- and Deltaproteobacteria, whose sum encompassed nearly 50% of proteobacterial clones (37% of the OPUs). A large fraction of these sequences (42% and 16%, respectively) did not affiliate with any known family and clustered with branches represented only by uncultured microorganisms, mostly recovered from anaerobic communities in lake and river sediments, in the waters of freshwater reservoirs, wetland soils, as well as calcite, karst, and calcite travertine systems [66], microbial mats from aphotic (cave) sulfidic springs [14] and hot springs [38], gold mine water streams [24], acid mine drainage systems, as well as marine sediments [32, 43], and (an)aerobic wastewater digesters [29, 63]. Twenty-five percent of all betaproteobacterial OPUs were associated with the highly versatile genus Burkholderia and, to a lesser extent, to potential chemolithotrophic iron- and sulfur-oxidizing organisms such as Gallionella spp. and Thiobacillus spp., and putative phototrophs such as Rhodocyclus spp. Almost 24% (or 19% of the OPUs) of all clones affiliated with Deltaproteobacteria, a class which comprises the major group of sulfate-reducing bacteria (SRB). The most represented SRB sequences affiliated with Desulfobacteraceae, Desulfobulbaceae, Syntrophaceae, and Syntrophobacteraceae, which together made up circa 75% of the deltaproteobacterial sequences. The third major group of phylotypes detected affiliated with the Gammaproteobacteria class (Fig. 1 and Table S1 in the Electronic Supplementary Material) encompassing 12.8% of the clones (13.8% of the OPUs). 
Among these sequences, a large proportion affiliated with purple sulfur bacteria (Chromatiaceae; typical inhabitants of stagnant pools) and versatile heterotrophs such as Pseudomonas- and Xanthomonas-like organisms, followed by sulfur-oxidizing phototrophs such as Lamprocystis, and one sequence distantly affiliated with methanotrophic organisms such as Methylocaldum (Figure S3 in the Electronic Supplementary Material). They were closely related to communities found in solar salterns [1, 2, 80], waters and sediments of freshwater reservoirs [55, 83], wetland soils, as well as karst and phreatic sinkholes and deep-sea marine sediments [82]. Eleven clones (5.6%) affiliated with Alphaproteobacteria and consisted essentially of Sphingomonadaceae- and Rhodobacteraceae-like organisms. Epsilonproteobacteria, constituting 2.1% of the Carrizo Lake bacterial clones, were affiliated with organisms distantly related to the chemolithotrophic Sulfurovum lithotrophicum and Sulfurimonas autotrophica, both involved in the redox sulfur cycle, and with uncultured bacteria from activated wastewater sludges. The remaining sequences in the Carrizo Lake bacterial library (most closely related to sequences recovered from freshwater environments, including phreatic sinkholes), which represent about 35% of all the OTUs and OPUs, were related to Acidobacteria, Bacteroidetes, Fibrobacteres, Firmicutes, Lentisphaerae, Nitrospirae, and to the candidate divisions JL-ETNP-Z39, OP3, TA06, TM6, WS3, and WS1, which have no cultivable organisms (Fig. 1, and Table S1 and Figure S4 in the Electronic Supplementary Material).

The above phylogenetic analysis of bacterial clone sequences (with Proteobacteria being predominant) revealed overlaps with sequences from other lakes, including saline lakes, from karst and calcite travertine systems, and from (an)aerobic wastewater digesters [5, 29, 63]. The Proteobacteria are commonly observed in waters and sediments from other saline and freshwater lakes, and thus they do not appear to be specific to the Carrizo anoxic sediment. The Beta- and Deltaproteobacteria, by far the most abundant in Carrizo sediment, appear to be numerically important in anoxic sediment from freshwater lakes, the latter playing a cardinal role in anoxic settings, including anoxic lakes [5, 12, 41]. Based on proximity to cultivated species of known physiology, at least seven different metabolic types could be hypothesized for the Beta- and Deltaproteobacteria inhabiting the Carrizo anoxic sediment: iron- and sulfur-oxidizing organisms (Gallionella- and Thiobacillus-like), denitrifying bacteria (Sterolibacterium- and Denitratisoma-like), sulfate-reducers (Desulfobacca-, Desulfosarcina-, Desulfococcus-, and Desulfocapsa-like), methylotrophs (Methyloversatilis-like), syntrophic bacteria (Syntrophus-like, which typically establish interspecies H₂-transfer symbioses with methanogenic Archaea [5]), and dehalogenating (Desulfomonile-like) and phenol-degrading (Syntrophorhabdus-like) bacteria. The Gamma- and Alphaproteobacteria were also abundant, with six clones closely related to cultivated phototrophic sulfur bacteria that oxidize reduced sulfur species (e.g., Lamprocystis- and Thiorhodovibrio-like) and one to nitrogen-fixing methanotrophs (Methylocaldum- and Methylococcus-like). Therefore, most Gammaproteobacteria could in fact be oxidizing H₂S, S⁰, or thiosulfate in the Carrizo sediment, as also reported in similar freshwater lakes [5].
Finally, Epsilonproteobacteria (most closely related to those found in anaerobic digesters), which are naturally associated with sulfide-rich environments and sulfur springs, were less abundant, suggesting that sulfide- or sulfur-oxidizing capabilities are less represented in Carrizo sediments than other metabolic processes. Although they are absent or rare in common freshwater lakes [11], they appear particularly abundant at oxic/anoxic interfaces (redox clines) in marine environments and in suboxic/anoxic lake sediments [5].

In contrast to the previous observations, candidate divisions TA06 and WS1, for which six distinct clones were found in our study, appear to be restricted to saline lakes and marine sediments. In Carrizo sediment, TA06 formed a cluster with three clones related to communities from phreatic sinkholes and three marine (including one estuarine) sediments. The WS1-related clones were closely related to a phylotype retrieved from a phreatic sinkhole and a hypersaline microbial mat [43]. Thus, it appears that sediment conditions, which are considerably distinct from those existing in other freshwater ecosystems, could explain the presence of TA06 and WS1 members in the anoxic sediment investigated here. Unfortunately, since there are no cultured representatives related to our clones, their physiology (e.g., in relation to salinity) remains unknown (Fig. S5 in the Electronic Supplementary Material).

Most of the 52 sequenced archaeal clones (88.5%) affiliated with Euryarchaeota and encompassed 15 OPUs (Fig. 1, and Table S1 and Figure S5 in the Electronic Supplementary Material). Only three OPUs (six distinct clones) affiliated with the uncultured Crenarchaeota groups Marine Benthic Group B (MBG-B) and Miscellaneous Crenarchaeotic Group (MCG), suggesting that this archaeal clade plays a minor role in this habitat. The sequences of the first cluster were closely related to uncultured archaeon clones recovered from an anaerobic sludge digester [63] and a low-pH (≤4) minerotrophic fen [6]. The sequences in the second cluster were related to uncultured Crenarchaeota from a hypersaline microbial mat [64] and sulfur-rich submerged sinkhole ecosystems. Among the Euryarchaeota sequences, only three OPUs represented by five clones could be affiliated with potential methanogenic Archaea. Of these, only two clones were related to Methanobacteria typically detected in the anoxic sediments at the bottom of ponds and marshes [87], whereas one clone was most closely related to the methanogenic genus Methanosaeta, frequently detected both in anaerobic methane-producing bioreactors and in shallow marine sediments rich in methane [47, 79]. The low number of methanogenic Archaea identified, together with the fact that no recovered clone sequences closely matched known sequences from methanogenic sediments, further indicated that methanogenesis might be a minor metabolic process in Carrizo Lake compared with common anoxic lake environments (e.g., [12, 69, 85]), where this process dominates. This agrees with previous observations in saline and alkaline soda lakes, where the high sulfate and salt concentrations repressed autotrophic methanogens while promoting an active sulfur cycle (e.g., [72]).

However, the largest set of Euryarchaeota sequences (comprising 41 sequences and 12 OPUs) affiliated with the uncultured Thermoplasmatales CCA47 group, for which two clusters were identified. Twenty-seven and 14 sequences formed the first and second cluster, respectively, and they were related to archaeal communities of a variety of marine sediments [33], iron- and sulfur-precipitating microbial mats at a submarine mud volcano [57], microbial mats of hypersaline coastal lagoons [27], deep sinkhole ecosystems, and salt marine marsh sediments [53, 54]. Thus, it appears that Thermoplasmatales CCA47 sequences belong to organisms highly adapted to conditions existing in saline, but not common freshwater, ecosystems.

Taken together, whereas many bacterial clones in our study were most closely related to sequences recovered from other freshwater and marine environments and, to a minor extent, to sequences from anaerobic wastewater digesters (Figures S3 and S4 in the Electronic Supplementary Material), the composition of archaeal clones showed remarkable differences. To the best of our knowledge, the presence of the Thermoplasmatales CCA47 group in freshwater ecosystems (including anoxic sediments) has not been reported. Although Schwarz et al. [69] and Glissmann et al. [17] reported the presence of Thermoplasmatales relatives of the Marine Archaea Group III (but not the CCA47 group) in anoxic sediments of subtropical and eutrophic profundal lakes, those constitute a minor component of the archaeal community (below circa 17%), which was dominated by common Methanomicrobiales and Methanomicrobiaceae. The relatively close relationship of the CCA47 group with a group of cultured acidophilic and cell wall-less Archaea, also belonging to Thermoplasmatales, contrasts with the neutral to slightly alkaline nature of the pore waters of the lake. Unfortunately, due to the lack of cultivable members within the Euryarchaeota clades detected in Carrizo Lake, little is known about the mechanisms by which CCA47-like and also MBG-B- and MCG-like organisms obtain energy, although they are exclusively found in saline and oxygen-depleted locations [5, 28, 65, 78].

Taxonomic classification of the metagenome sequences (see “Methods” section and Table S2 in the Electronic Supplementary Material) was mostly in line with 16S tag analysis for known taxa (Figure S6 in the Electronic Supplementary Material). However, it should be noted that this pipeline cannot identify poorly studied taxa that lack known reference sequences or protein-coding genes, as is the case for several groups found in the Carrizo sediment (e.g., candidate divisions TA06 and WS1 and the Thermoplasmatales CCA47 group). Despite this limitation, the taxonomic binning of the metagenome confirmed the rRNA-based observations, with a dominance of proteobacteria-related sequences (see SI Text in the Electronic Supplementary Material for additional information).

Functional Signatures for Sulfur and Nitrogen Metabolisms

Freshwater ecosystems are a focus of intense research, but most of this interest is centered on their phylogeny (using 16S rRNA sequence analysis and related techniques), which contrasts with the limited information about their gene inventory obtained via metagenomic approaches [10, 13], which may shed light on the microbial ecology of distinct ecosystems. Here, the metabolic potential expected to be present in Carrizo Lake sediments was analyzed, with a particular focus on the sulfur- and nitrogen-associated processes (see details in Tables S3–S5 in the Electronic Supplementary Material).

The metagenome contained 1,333 assembled contigs (71 of them with lengths from 10 to 30 kbp) representing 3,690 CDS (coding sequences) with an average read length of 633 bp (Table S2 and Figure S7 in the Electronic Supplementary Material). The G+C content of each CDS was calculated, and the values were normally distributed between 14.7% and 78.8%, with a mean of 53.9% for the library. With a maximum E value criterion of 10⁻⁵, 22% of the sequences in this metagenome library did not have any sequence similarity (hypothetical proteins), and another 19% (686 sequences) were similar to proteins of unknown function (conserved hypotheticals). Thus, an important fraction of this ecosystem remains unknown, and its metabolism is difficult to unravel.
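The per-CDS G+C content reported above is the percentage of G and C among the unambiguous bases of each coding sequence. A minimal sketch follows; ignoring ambiguous symbols such as N is an assumption of this example, and the sequences shown are hypothetical.

```python
def gc_content(sequence):
    """Percentage of G and C among the A/C/G/T bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def mean_gc(sequences):
    """Unweighted mean of per-sequence G+C percentages."""
    values = [gc_content(seq) for seq in sequences]
    return sum(values) / len(values)


if __name__ == "__main__":
    hypothetical_cds = ["ATGGCGCGTTAA", "ATGAAATTTTAG", "ATGGGCCCGTGA"]
    for cds in hypothetical_cds:
        print(cds, round(gc_content(cds), 1))
    print("mean", round(mean_gc(hypothetical_cds), 1))
```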

A total of 46 genes (or 2.1% of hits with assigned function) coding for enzymes potentially involved in the sulfur cycle were identified. As shown in Fig. 2, the Carrizo Lake community likely utilizes the dissimilatory phosphoadenosine phosphosulfate reductase system to convert sulfate (SO₄²⁻) to sulfite (SO₃²⁻), the NrfD polysulfide- and MopB thiosulfate-reductase-like systems to produce sulfide (S²⁻) from polysulfide (Sₙ²⁻), thiosulfate (S₂O₃²⁻), and thiosulfonate, and the sulfide oxidoreductase (SQR-like) to oxidize S²⁻ to Sₙ²⁻, which is again a substrate for polysulfide reductases. Experimental evidence is provided in the SI Text and in Tables S3–S5 in the Electronic Supplementary Material. No evidence was found for assimilatory sulfate reduction (by APS reductases), for direct conversion of SO₃²⁻ to S²⁻ by dissimilatory sulfite reductases (Dsr), or for (thio)sulfate oxidation (mediated by the Sox multi-enzyme complex); however, the possibility of a metagenome bias (low genome coverage) cannot be ruled out, because the presence of genes coding for the first two enzymes has been demonstrated by PCR-based approaches using degenerate primers [37]. The identification of five thiosulfate/cyanide sulfurtransferases (RhoD), two aryl sulfotransferases (see SI Text and Table S4 in the Electronic Supplementary Material for experimental evidence), and two sulfatases (EC 3.1.6.-) also supports the active utilization of thiosulfates and (aryl) sulfate esters as sources of sulfite and sulfate, respectively.
Furthermore, a YedY-like sulfite oxidase, involved in the reduction of linear and cyclic sulfoxides, organosulfonates, and N-oxides [31] but lacking sulfite-oxidizing activity, was identified and confirmed experimentally (see SI Text and Table S4 in the Electronic Supplementary Material), which suggests that these types of compounds (detected in the sediments by chemical analysis) can potentially be used as a sulfur source by Carrizo community members. To the best of our knowledge, this is the first report of YedY-like sulfite oxidase activity in an anoxic environment, whether freshwater or marine.

Figure 2

Proposed sulfur-metabolizing profile of the Carrizo community based on BLAST hits of protein homologues found in the metagenome data. The number of putative genes encoding each enzyme class involved in the potential transformation of each molecule is shown in brackets.

Forty-one genes (circa 2.0% of hits with an assigned function) coding for enzymes potentially involved in the assimilation and transformation of N-sources were identified (Fig. 3 and Tables S3–S5 in the Electronic Supplementary Material). These sources are diatomic nitrogen (N₂) (nitrogen fixation), nitrate (NO₃⁻), nitrite (NO₂⁻), and ammonium (NH₄⁺), as well as N-oxides and nitro-, nitrile-, and cyanide-substituted (including aromatic) compounds, which are widely distributed in environments associated with industrial wastewater and residual agricultural chemicals [34]. The genes include five NifX/B-like dinitrogenases, six IscU/NifU proteins related to dinitrogen fixation, two nitrate/nitrite transporters (NarK), six nitrate reductase-like proteins (NarG,H,I,J; see SI Text and Table S4 in the Electronic Supplementary Material for experimental evidence), one nitrite reductase (NrfC), four nitroreductases (three of them experimentally characterized; SI Text and Table S4 in the Electronic Supplementary Material), two nitrilases/cyanide hydratases (one experimentally characterized; SI Text and Table S4 in the Electronic Supplementary Material), and three 2-nitropropane-like dioxygenases potentially involved in nitrite production from nitropropane (one experimentally characterized; SI Text and Table S4 in the Electronic Supplementary Material). Additionally, a QueF-like nitrile reductase, likely responsible for the direct NADPH-dependent biological reduction of nitrile functional groups to a primary amine (a reaction rarely observed in biological systems [40]), was identified and further characterized experimentally (SI Text and Table S4 in the Electronic Supplementary Material). 
The presence of nitropropane dioxygenase activity was unexpected for an anaerobic environment and might be explained either by sedimentation of genomic debris from the interface or by the presence of symbiotic bacteria–eukaryote associations, as reported previously for a RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) protein in a deep-sea ecosystem [49]. Finally, we detected a number of nitrogen regulatory proteins plus a number of phosphotransferase systems (nine hits in total, coverage >30%), which possibly play a role in nitrogen assimilation [58].

Figure 3

Proposed nitrogen-metabolizing profile of the Carrizo community based on BLAST hits of protein homologues found in the metagenome data. The number of putative genes encoding each enzyme class involved in the potential transformation of each molecule is shown in brackets.

It cannot be excluded that other enzymes relevant to the sulfur and nitrogen cycles escaped detection because of low meta-genome coverage; however, the gene inventory provided here complements the metabolic activities suggested by the 16S rRNA data (based on proximity to cultivated species of known physiology). In fact, the meta-genomic data suggested the operation of a sulfur cycle (HS⁻ → Sₙ²⁻ → HS⁻) and, moreover, that thiosulfate- and thiosulfonate-reducing bacteria are positioned at a decisive stage. Finally, the identification of rhodaneses and nitrilases (whose presence cannot be inferred from 16S rRNA tags) suggested that the function of these enzymes may not only be the detoxification of cyanide (which we were unable to detect in situ) but may also be related to the production of ammonium (NH₄⁺) and SO₃²⁻, thus contributing to the S- and N-cycles, in contrast to other saline ecosystems [37, 72].

We further investigated the presence of sulfur/nitrate assimilation clusters, because such clustering may be selectively favorable in that it facilitates the coordinated expression of the constituent genes [51, 71]. We also performed tentative taxonomic assignments based on BLAST hits. As a result of this analysis, four assembled contigs were found to possess genes coding for proteins involved in the assimilation of both S- and N-sources. Briefly, the 8,446-bp-long contig cLDC0361 (45.09% GC content) contains a full set of genes encoding the proteins required for the biosynthesis of the sulfur-containing amino acids cysteine and methionine, linked with a nitrogen-fixing NifU domain protein (LDC_0888). The majority of the genes in cLDC0361 were most similar to those of Syntrophus members, the sixth most represented microorganism in the Carrizo community. Additionally, the 11,467-bp-long cLDC0380, which has a much lower GC content (34.23%), appears to encode enzymes of the sulfur cycle, namely those for the reduction of (thio)sulfate to hydrogen sulfide (LDC_1013) and of polysulfide to sulfide (LDC_1015) (for which experimental evidence is given in SI Text and Table S4 in the Electronic Supplementary Material), as well as enzymes for the assembly and activation of the NifU nitrogenase catalytic components (i.e., the [Fe-S] cluster and the molybdopterin co-factor). cLDC0380 was found to be highly syntenic to Epsilonproteobacteria, with 41% and 25% of all genes belonging to Wolinella succinogenes DSM 1740 and Arcobacter butzleri RM 4018, respectively. Epsilonproteobacterial sequences account for 2.9% of the total 16S riboclones (Table S1 in the Electronic Supplementary Material) and 9% of the total BLAST hits of the metagenome (Figure S6 in the Electronic Supplementary Material). 
The 16,889-bp-long contig cLDC0376 (32.46% GC content) appears to encode two NifBX dinitrogenase iron-molybdenum cofactor biosynthesis proteins (LDC_0986 and LDC_0987), two cobyrinic acid a,c-diamide synthases that use glutamine or ammonia as a nitrogen source for the anaerobic biosynthesis of vitamin B12 [16], and a polysulfide-sulfur transferase (LDC_0978). The DNA fragment showed a genomic organization similar to that of its counterparts from the N₂-fixing and H₂S-oxidizing Sulfurovum sp. NBC37-1 (28% or five hits) and Sulfurospirillum deleyianum DSM 6946 (22% or four hits). The gene organization of this contig in relation to genomic fragments from both chemolithotrophic sulfur-respiring Epsilonproteobacteria is highlighted in Fig. 4. Finally, the 20-kbp-long cLDC0001 has two distinct gene clusters characterized by atypical GC content (35.68% versus 58.04%) and by the presence of numerous genes with high similarity to genes found in distantly related species (Fig. 4). The high-GC island (position 11,600–20,012) bears a block of clustered genes encoding two NarK-like high-affinity nitrate/nitrite transporters (LDC_0007 and LDC_0008), the alpha and beta subunits of a respiratory nitrate reductase that catalyzes the reduction of nitrate to nitrite (LDC_0009 and LDC_0010, for which experimental evidence is provided), and a chaperone required for the proper folding of the nitrate reductase (LDC_0011). Most proteins from this high-GC island were most closely related to those of Albidiferax ferrireducens (formerly Rhodoferax ferrireducens), an anaerobic proteobacterium (beta subdivision) with Fe(III)-reducing capabilities, thus suggesting the presence of such metabolism in an Albidiferax-like bacterium inhabiting Carrizo sediment. In this context, the dissimilatory reduction of iron has been shown to be an important biochemical process in anoxic, mining-impacted lake sediments [8]. 
Upstream of this block, the genomic fragment at position 1,150–10,449 has a GC content of 35.68% and encodes a number of hypothetical proteins with no clear taxonomic affiliation.
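
The tentative taxonomic assignments reported above (e.g., "28% or five hits") amount to tallying the best-hit organism of each gene on a contig. A minimal Python sketch of that tally is given below. The function name `best_hit_shares` and the input list are hypothetical, and the total of 18 genes for cLDC0376 is an assumption inferred from the reported percentages rather than a number stated in the text.

```python
from collections import Counter


def best_hit_shares(best_hits):
    """Map each taxon to (number of hits, rounded percentage of all genes).

    `best_hits` holds one best-hit organism name per gene on a contig.
    """
    counts = Counter(best_hits)
    total = len(best_hits)
    return {
        taxon: (n, round(100 * n / total))
        for taxon, n in counts.most_common()
    }


# Hypothetical best-hit list for cLDC0376. The gene total (18) is an
# assumption chosen to be consistent with "28% or five hits" and
# "22% or four hits"; the label "other" is a placeholder.
hits = (
    ["Sulfurovum sp. NBC37-1"] * 5
    + ["Sulfurospirillum deleyianum DSM 6946"] * 4
    + ["other"] * 9
)
shares = best_hit_shares(hits)
for taxon, (n, pct) in shares.items():
    print(f"{taxon}: {n} hits ({pct}%)")
```

In practice the list would be derived from a BLAST best-hit table; here it is typed in by hand only to illustrate how a percentage and a hit count relate to each other.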

Figure 4

Genomic content of the cLDC0376 (a) and cLDC0001 (b) contigs. The GC content of each contig is plotted with a window of 16,889 and 20,014 nucleotides, respectively. (a) The genes of cLDC0376 are organized in a tight cluster preceded by a phage integrase and three transposases. The locations of genes with genome arrangements similar to those of Sulfurovum sp. NBC37-1 and Sulfurospirillum deleyianum DSM 6946 are shown. (b) cLDC0001 exemplifies the horizontal transfer of a nitrate assimilation gene cluster (green). The GC percentage is indicated as a blue (low) to black (medium) to red (high) gradient.
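
A GC-content profile of the kind used to distinguish the low-GC and high-GC regions of a contig can be sketched as a sliding-window calculation. The Python example below is illustrative only: the function names, the toy sequence, and the window and step sizes are assumptions and are not those used to prepare Fig. 4.

```python
def gc_percent(seq):
    """GC content of a DNA string as a percentage of A/C/G/T bases.

    Ambiguous bases (e.g., N) are ignored; an empty input returns 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def windowed_gc(seq, window, step):
    """Yield (start, GC percentage) for successive windows along seq."""
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, gc_percent(seq[start:start + window])


# Toy sequence (hypothetical): a 500-nt low-GC stretch (20% GC)
# followed by a 500-nt high-GC stretch (80% GC).
toy = "ATATATGCAT" * 50 + "GCGCGCATGC" * 50
profile = list(windowed_gc(toy, window=100, step=100))
for start, gc in profile:
    print(start, gc)
```

A sharp change in the profile, as between the two halves of the toy sequence, is the kind of signal that marks a region of atypical GC content within a contig.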

The above data suggest that horizontal gene exchange between different members of the bacterial community and phage integration (e.g., cLDC0376 contains three transposases and one phage integrase) may be highly active in the Carrizo community and, moreover, may play important roles in sulfur and nitrogen cycling. This is consistent with the observation that, in marine sub-saline systems, horizontal gene exchange between different members of the prokaryotic communities is highly active, thus favoring adaptive evolution [35, 36, 75]. Moreover, the above analysis demonstrated that representatives of syntrophic bacteria (e.g., Syntrophus-like) and Epsilonproteobacteria are major contributors to sulfur and nitrogen cycling in Carrizo sediments.

Conclusions

In this study, cultivation-independent metagenomic and 16S rRNA assessments were used to infer correlations between systems performance and the phylogenetic and relevant genomic capacities of the microbial community inhabiting the anoxic sediment of a sub-saline shallow lake (Laguna de Carrizo) that was initially operated as a gypsum mine. Compared with other saline and freshwater ecosystems described to date, Carrizo Lake is characterized by an unusual ionic composition. The information retrieved agrees with the expected assemblage of organisms thriving in anoxic sediments. Our study gives a comprehensive insight into the structure of the bacterial and archaeal community of a shallow anoxic lake, indicating that thiosulfate- and thiosulfonate-reducers, sulfate-reducers, and iron-oxidizing, sulfur-oxidizing, denitrifying, syntrophic, and phototrophic sulfur bacteria are of particular importance in Carrizo sediment as compared with methanogens (predominant in common anoxic freshwater sediments). The genome data provided here suggest that (thio)sulfates and (thio)sulfonates, polysulfides, sulfoxides, and organosulfonates, together with nitro-, nitrile-, and cyanide-substituted compounds, might be major primary sources of biological sulfur and nitrogen in this niche. These metabolic capacities have rarely been observed together in open marine, sub-saline, or freshwater environments. It is likely that microorganisms in Laguna de Carrizo sediments experience episodes of extreme sulfur- and nitrogen-related (including toxic) stress, under which the transfer of complete assimilation pathways (possibly to improve microbial fitness) is an active mechanism. The results suggest that anthropogenic activities around the Carrizo area may have exerted strong selective pressure on the microbial community to adapt to toxic chemicals (major abiotic stressors). 
Since most of the BLAST hits were associated with the Sulfurovum genus, the results suggest that members related to this genus might be highly active within the Carrizo community, thus opening new research opportunities to further investigate their metabolic arsenal. This should be of interest given the limited genomic information described to date for anoxic saline environments [13]. Furthermore, to our knowledge, this is the first report of the Thermoplasmatales CCA47 group in anoxic shallow sediments and in freshwater ecosystems in general, and our data indicate that these members constitute a prevalent component of the Carrizo archaeal community, as compared with what was previously described in similar habitats. The fact that members of this group (together with bacterial candidate divisions TA06 and WS1) have only been found in marine and oxygen-free environments suggests salinity as a major determinant of their presence and/or abundance in Carrizo sediment. Further investigations will be required to ascertain their global metabolic role in the overall community and sediment characteristics. It should be noted that, in addition to salinity, other environmental differences (e.g., carbon supply, sediment redox conditions, sediment depth, and relative proportion of ions) may help to explain the observed archaeal diversity patterns in Carrizo Lake.