Introduction

Their long evolutionary history of ca. 1.4 billion years (Knoll et al. 2004; Berney and Pawlowski 2006) allowed protists to colonize nearly every known habitat on our planet, including the most (poly)extreme environments (Amaral-Zettler et al. 2002, 2011; Aguilera et al. 2006; Alexander et al. 2009; Edgcomb et al. 2009; Stock et al. 2012). A large part of the protistan diversity residing specifically in such hostile environments is still undetected. This is partly because harsh environments have received much less attention than non-extreme habitats (Hauer and Rogerson 2005). Furthermore, the environmental molecular techniques that have revealed a hitherto unseen majority of protistan diversity (López-Garcia et al. 2001; Moon-van der Staay et al. 2001) have been applied to protists only relatively recently (Caron et al. 2012). The phylogenetic analyses of taxonomic marker genes such as the small subunit ribosomal DNA (SSU rDNA) amplified from environmental samples showed that the protistan diversity as seen by traditional microscopy and cultivation efforts represents only a fraction of natural protistan communities [reviewed in Epstein and López-García (2008)]. This is mainly due to their tremendous species richness, which is difficult to observe and to describe using microscopy, due to difficulties with sample fixation, small cell sizes, cryptic species and an ongoing decrease of taxonomic expertise (Sáez and Lozano 2005; Bickford et al. 2007; Caron et al. 2012).

One such habitat type, which still holds a large proportion of undiscovered protistan plankton diversity, is the high-salt environment. As for bacteria (Lozupone and Knight 2007; Logares et al. 2009, 2010), animals and plants (Lee and Bell 1999; Vermeij and Dudley 2000), salt is a strong environmental barrier for protists (Logares et al. 2009; Bråte et al. 2010; Heger et al. 2010; Forster et al. 2012; Dunthorn et al. 2014b), demanding specific adaptation mechanisms (McGenity and Oren 2012). It is therefore reasonable to assume that investigations of high-salt environments are likely to discover new evolutionary lineages in the eukaryotic tree of life, hitherto unseen in other aquatic habitats. To the best of our knowledge, only three gene-based diversity surveys have targeted protists in the high-salt environments of solar salterns (Casamayor et al. 2002, 2013; Triadó-Margarit and Casamayor 2013), even though microscopy studies have provided an idea of the high diversity of protists in these multi-pond high-salt systems (Estrada et al. 2004; Elloumi et al. 2006, 2009a; Lei et al. 2009). All three of these molecular diversity studies used the genetic fingerprinting technique denaturing gradient gel electrophoresis (DGGE) to assess protistan plankton diversity along salinity gradients. Although only a few protistan sequences were obtained from excised DGGE gel bands, these data not only suggested a remarkably high diversity of protists, exceeding that of bacteria and archaea in Spanish salt ponds (Casamayor et al. 2013), but also revealed a notable novelty of the discovered diversity across several taxonomic groups (e.g., Chlorophyta, fungi, Ciliophora, and Stramenopiles). For example, of the 73 SSU rDNA reads obtained from 34 different hypersaline inland and coastal sites, eight showed <90 % similarity to known reads deposited in public databases (Triadó-Margarit and Casamayor 2013). This impressively shows the gaps in our current knowledge of protists thriving in high-salt environments and highlights the need for more detailed investigations.

In this study, we used high-throughput pyrosequencing from which we obtained 80,600 high-quality reads to provide the first in-depth molecular survey of protistan communities in solar saltern ponds. We investigated whole protistan communities from interconnected salt ponds with different salinities (4, 12 and 38 %) in the Ria Formosa solar saltern, Faro, Portugal. The molecular data are accompanied by microscopy data of fluorescence in situ hybridization of fixed plankton samples to compare quantitative amplicon and morphotype data, helping the interpretation of molecular amplicon data. Using a network approach, we additionally identify the degree of novel protistan (sequence) diversity in these salt ponds.

Materials and methods

Sampling and sample preparation

Samples were taken in March 2012 in solar saltern ponds located in the Ria Formosa National Park, Faro, Portugal (37°01′N, −7°96′E). Water samples were collected at several sites in each of three ponds with salinities of 4 % (inlet pond), 12 % (intermediate salinity) and 38 % (crystallizer pond). Salinity was measured on-site using a WTW portable salinity/conductivity sensor (WTW GmbH, Germany). For nucleic acid extraction, ca. 0.5 l of sample water was drawn onto Durapore membranes (47 mm, 0.65 µm, Millipore, Germany) using a peristaltic pump (Ecoline ISM1079, Ismatec, Germany). Filters were preserved in 3 ml RNAlater (Qiagen, Germany) and immediately stored at −20 °C. For each sampled site, three biological replicate filters were prepared. Samples for microscopy (fluorescence in situ hybridization) were fixed with 3.7 % formaldehyde (final concentration) for 5 h at room temperature. Sample volumes of 5–10 ml were drawn onto Isopore membranes (25 mm, 0.8 µm, Millipore, Germany), rinsed with distilled water, air-dried and frozen at −20 °C until further processing.

Total RNA extraction and reverse transcription

Total RNA was extracted and reverse transcribed following Alexander et al. (2009). Briefly, total RNA was extracted using Qiagen's AllPrep DNA/RNA MiniKit (Qiagen, Germany) according to the manufacturer's instructions following a chemomechanical cell disruption by bead-beating (45 s, 30 Hz). Residual DNA was removed by DNase I (Qiagen, Germany) digestion. The complete removal of residual DNA was verified by PCR amplification of the digested sample. The concentration of the purified RNA was determined spectrophotometrically using a Nanodrop ND-1000 UV–Vis spectrometer (Nanodrop Technologies, Wilmington, DE, USA). The integrity of the RNA was examined with an RNA 6000 PicoAssay on an Agilent 2100 Bioanalyzer (Agilent Technologies, Germany). To minimize extraction bias, total RNA was extracted separately from each of the three individual sites per pond.

RNA was reverse transcribed into cDNA using Qiagen's QuantiTect Reverse transcription kit and random primers as supplied by the manufacturer. After transcription of each individual sample, the three replicate transcribed products of each sampling site were pooled.

Amplification and sequencing of V4 region

The hypervariable V4-region of the SSU rDNA was amplified using the primer pair TAReukV4F and TAReukREV (5′-ACTTTCGTTCTTGATYRA-3′, Stoeck et al. (2010)) yielding ca. 500 base-pair (bp) fragments. To distinguish the different samples in computational downstream analyses, the V4 forward primer was tagged with different four-base-pair-long identifiers (MIDs) at the 5′-end. The PCR protocol is described in Stoeck et al. (2010). To minimize PCR bias, three individual reactions per sampling site were prepared. The three individual reactions were pooled during the PCR product purification using Qiagen’s MinElute PCR purification kit and submitted to LGC Genomics (Berlin, Germany) for sequencing. The cDNA amplicon libraries were sequenced on 1/4 of a PicoTiterPlate yielding nearly 50 Mb in total with a Roche FLX GS20 sequencer and the Titanium chemistry.

Sequence data processing and analysis

The pyro-amplicon datasets were denoised with Acacia v1.48 (Bragg et al. 2012). Further data cleaning, including chimera checking, and data analyses were conducted using QIIME v1.7.0 (Caporaso et al. 2010). After quality filtering, only reads with exact barcodes and primers, a quality score >25, unambiguous nucleotides and a minimum length of 300 bp were maintained. The remaining sequences were checked for chimeras and clustered at a sequence similarity of 97 % using the clustering algorithm Uclust (Edgar 2010). The required word length value (short words of fixed length, k-mers) was calculated according to an equation given in (Edgar 2011), resulting in a value of 64 for pyro-amplicons. For taxonomic assignments, one representative sequence (longest) from each OTU was analyzed with JAguc v2.1 (Nebel et al. 2011) and GenBank’s nr nucleotide database release 199. JAguc employs BLASTn searches, with algorithm parameters adjusted for short reads (-m 7 -r 5 -q -4 -G 8 -E 6 -b 50).
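The read-level criteria named above (minimum length, no ambiguous nucleotides, quality threshold) can be illustrated with a minimal Python sketch. This is not the QIIME implementation; the function names are hypothetical, and interpreting the "quality score >25" criterion as a mean Phred score per read is an assumption made only for illustration. Barcode and primer matching are omitted.

```python
from statistics import mean

MIN_LENGTH = 300        # minimum read length in bp (from the text)
MIN_MEAN_QUALITY = 25   # assumption: threshold applies to the mean Phred score
VALID_BASES = set("ACGT")

def passes_quality_filter(sequence, qualities):
    """Return True if a read meets the length, ambiguity and quality criteria."""
    if len(sequence) < MIN_LENGTH:
        return False
    # Reject reads containing ambiguous nucleotides such as N
    if not set(sequence.upper()) <= VALID_BASES:
        return False
    return mean(qualities) > MIN_MEAN_QUALITY

def filter_reads(reads):
    """Keep only (read_id, sequence, qualities) tuples that pass all criteria."""
    return [read for read in reads if passes_quality_filter(read[1], read[2])]
```

Each read is passed as a tuple of identifier, sequence and per-base quality list; reads failing any single criterion are dropped.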

Using a custom script, the output files from QIIME’s OTUpipe (seqs_otus.txt) and JAguc (taxonomic tree for each analysed representative sequence) were merged into a biom-file containing information about OTU IDs, number of sequences per OTU and per sample and taxonomic affiliations. Non-target OTUs (metazoans, embryophytes), as well as singletons and doubletons (unique/double amplicons that occur exclusively in only one sample are most likely erroneous sequencing products (Kunin et al. 2010; Behnke et al. 2011)) were excluded and the resulting file used as a basis for statistical and network analyses.
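The exclusion of singletons and doubletons can be sketched as follows. The custom script used in this study is not shown here; the dictionary layout of the OTU table and the function name are assumptions chosen for illustration. An OTU is dropped if it contains at most two reads in total and these reads occur in only one sample.

```python
def remove_rare_otus(otu_table, max_reads=2):
    """Drop OTUs with at most `max_reads` reads in total, all from one sample.

    otu_table maps an OTU id to a dict of {sample name: read count}.
    """
    kept = {}
    for otu_id, counts in otu_table.items():
        total = sum(counts.values())
        samples_present = sum(1 for c in counts.values() if c > 0)
        if total <= max_reads and samples_present <= 1:
            continue  # singleton or doubleton confined to a single sample
        kept[otu_id] = counts
    return kept
```

Under this reading, an OTU with one read in each of two samples is retained, because it does not occur exclusively in one sample.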

Statistical analyses

Rarefaction profiles, Shannon alpha diversity and incidence-based Jaccard and Sørensen beta diversity were calculated using R (R Development Core Team 2008) and QIIME. For this purpose, data were normalized and resampled 1,000 times to account for uneven sample sizes (Logares et al. 2012). UPGMA-clustering was applied to construct Jaccard and Sørensen distance dendrograms. Bootstrap values were calculated to assess the strength of the relationships in the UPGMA dendrogram. Incidence-based indices were preferred over abundance-based indices to circumvent the issues surrounding SSU rDNA gene copy numbers in eukaryotes (Casamayor et al. 2002; Zhu et al. 2005; Amaral-Zettler et al. 2009; Dunthorn et al. 2014a).
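For orientation, the indices named above can be written out in a few lines of Python. This is a sketch under stated assumptions, not the R/QIIME code used in the study: the Shannon index is computed with the natural logarithm (the log base used for the reported values is not stated), the beta diversity indices are expressed as distances (1 minus similarity), and resampling is done by drawing reads without replacement to a common depth.

```python
import math
import random

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def jaccard_distance(otus_a, otus_b):
    """Incidence-based Jaccard distance between two collections of OTU ids."""
    a, b = set(otus_a), set(otus_b)
    return 1 - len(a & b) / len(a | b)

def sorensen_distance(otus_a, otus_b):
    """Incidence-based Sørensen distance between two collections of OTU ids."""
    a, b = set(otus_a), set(otus_b)
    return 1 - 2 * len(a & b) / (len(a) + len(b))

def rarefy(counts_by_otu, depth, rng):
    """Draw `depth` reads without replacement; return the OTU ids still present."""
    pool = [otu for otu, n in counts_by_otu.items() for _ in range(n)]
    return set(rng.sample(pool, depth))

def mean_resampled_jaccard(sample_a, sample_b, depth, iterations=1000, seed=0):
    """Average Jaccard distance over repeated subsampling to an equal depth."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(iterations):
        total += jaccard_distance(rarefy(sample_a, depth, rng),
                                  rarefy(sample_b, depth, rng))
    return total / iterations
```

Because both beta diversity indices use only presence or absence of an OTU, they are unaffected by how many reads each OTU contributes, which is the reason given above for preferring them.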

Fluorescence in situ hybridization

To count the total number of protists and assess the proportion of ciliates, we applied a fluorescence hybridization protocol following Stock et al. (2012). Hybridization solutions included the eukaryote-specific probe Euk1209 (CY3-5′-GGGCATCACAGACCT-3′, Giovannoni et al. (1988)) and a ciliate-specific probe (CY3-5′-TGGT(A/T)GTCCAATACACTA-3′, derived from Lara et al. (2007)), respectively. Hybridizations of both probes were carried out using 40 % formamide. Enumerations and documentations of cells occurred under epifluorescence using a Zeiss Axiophot II microscope equipped with a cooled QImager CCD Microcam (Intas). Three replicates from each solar saltern pond were counted. Controls were conducted with a hybridization mix without a probe and with a non-sense probe (reverse complement of ciliate-specific probe).

Identification of novel diversity

To identify novel protistan diversity in the different salt ponds sampled, the reads of the three cleaned datasets were all truncated to the same length (300 bp) and dereplicated using a custom script. Unique reads were clustered with SWARM v1.2.5 (Mahé et al. 2014) using d = 1. From each sample site, the 200 biggest swarms (i.e., clusters with the highest abundances) were retained and visualized with network analyses. To build the networks, the “seed reads” (i.e., the most abundant read of each swarm) were culled and subjected to BLAST analyses against Genbank’s nr nucleotide database to infer their taxonomic identity. The seed reads were then aligned with Seaview (Galtier et al. 1996), prior to calculating sequence similarities between each of the seed sequences using the custom script PairAligner (provided by Dr. Markus Nebel, University of Kaiserslautern). The igraph R package (Csárdi and Nepusz 2006) was used to build networks based on the sequence similarity values and the taxonomic information. In all networks two nodes were connected by an edge if they shared a sequence similarity of at least 90 %. The resulting networks were visualized and modified with Gephi v.0.8.2-beta (Bastian et al. 2009).
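The truncation, dereplication and edge-building steps can be sketched in Python as follows. The custom scripts and PairAligner are not reproduced here; in particular, the definition of sequence similarity as the percentage of identical alignment columns (ignoring columns that are gaps in both sequences) is an assumption, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def truncate_and_dereplicate(reads, length=300):
    """Truncate reads to `length` bp and count identical sequences."""
    return Counter(seq[:length] for seq in reads if len(seq) >= length)

def percent_identity(aligned_a, aligned_b):
    """Percent identical columns of two aligned sequences, skipping gap-gap columns."""
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

def build_edges(aligned_seeds, threshold=90.0):
    """Return (id_a, id_b, identity) for each seed pair with identity >= threshold."""
    edges = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(aligned_seeds.items()), 2):
        identity = percent_identity(seq_a, seq_b)
        if identity >= threshold:
            edges.append((id_a, id_b, identity))
    return edges
```

The resulting edge list, together with the taxonomic label of each seed, is the kind of input from which a graph library such as igraph can build the network.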

Results

Pyrosequencing resulted in a total of 104,945 V4 reads, of which 80,600 reads met our quality criteria and could be taxonomically assigned to protists and fungi. These reads are distributed as follows among the three different salt pond samples: 31,961 reads (1,062 OTUs, 4 %-salt pond), 19,406 reads (238 OTUs, 12 %-salt pond), and 29,233 reads (87 OTUs, 38 %-salt pond). Singletons and doubletons accounted for only 916 pyro-reads (ca. 1.1 % of the quality reads), pointing to near-complete sampling of the target communities; this pattern is the opposite of what is normally found in high-throughput sequencing studies, where singletons and doubletons account for most of the reads. Rarefaction analyses of the three pyro-datasets confirm near-saturated sampling profiles for all samples (Fig. 1). Shannon diversity was highest in the 4 %-salt sample (4.23) and decreased with increasing salinity (3.67 in the 12 %-salt pond and 0.4 in the 38 %-salt pond).

Fig. 1

Rarefaction curves for the different salt ponds sampled, based on pyro-reads (only target reads, no singletons/doubletons). OTUs were called with a sequence divergence of 3 %. The profiles of the rarefaction curves indicate a near-saturation for all three ponds sampled

Partitioning of diversity

Analysis of diversity partitioning using the Jaccard index shows a distinctively different protistan community structure in the 4 %-salt pond compared to the two ponds with higher salinities (Fig. 2a). The latter two cluster together with a distance being smaller to each other than to the community from the 4 %-salt pond. The same result was obtained with the Sørensen index (data not shown).

Fig. 2

a UPGMA clustering of Jaccard beta diversity based on OTUs called at 97 % sequence similarity. Bootstrap values indicate the strength of the relationships in the dendrogram. The high-salt samples (12 and 38 % salinity) are more similar to each other than to the 4 %-salt sample regarding protistan (incl. fungi) community composition. b Taxonomic distribution (phylum-based assignment) of protistan and fungal OTUs97 % among the sampled salt ponds (expressed as the relative proportion of OTUs97 % in %). Phyla that were represented by a proportion of >1 % of all unique OTUs97 % in at least one of the three amplicon libraries are shown. The category “others” denotes OTUs97 % that fall into taxonomic entities represented by <1 % of the unique OTUs97 % in all three amplicon libraries. These are Apusozoa, Fornicata, Centroheliozoa, Choanoflagellata, Haptophyta and Fungi. The red dashed line indicates the transition boundary from marine (saline) water to hypersaline water (brine)

The retrieved OTUs are distributed across 14 high-rank taxonomic groups (Ciliophora, Dinozoa, Amoebozoa, Cryptophyta, Cercozoa, Stramenopiles, Chlorophyta, Apusozoa, Fornicata, Centroheliozoa, Choanoflagellata, Chytridiomycota, Zygomycota and Haptophyta) matching 120 eukaryotic families (Figs. 2b, 3 and Online Resource 1). In the 4 %-salt pond, most of the OTUs blast to the Stramenopiles (140 OTUs/42.2 %), of which the Bacillariophyta account for 87 OTUs [26.2 %; mainly belonging to the Naviculaceae (21 OTUs) and Bacillariaceae (18 OTUs)], and the Ciliophora [63 OTUs/19 %; mainly belonging to the Strombidiidae (17 OTUs) and Strobilidiidae (9 OTUs)]. Cryptophyta [40 OTUs/12 %; mainly Geminigeraceae (17 OTUs) and Cryptomonadaceae (14 OTUs)], Chlorophyta [34 OTUs/10.2 %; mainly Mamiellaceae (10 OTUs), Bathycoccaceae and Halosporaceae (4 OTUs, respectively)], Cercozoa [22 OTUs/6.6 %; mainly Cercomonadidae (5 OTUs) and Thaumatomastigidae (4 OTUs)], Dinozoa [14 OTUs/4.2 %; mainly belonging to Gymnodiniaceae (4 OTUs), Amphidomataceae and Oxyrrhinaceae (2 OTUs, respectively)] and Amoebozoa [5 OTUs, belonging to Hartmannellidae (4 OTUs) and Flabellulidae (1 OTU)] are less well represented.

Fig. 3

Taxonomic distribution (class- and family-based assignment) of the four dominant phyla found in the three salt ponds sampled. Only families are considered that contained at least two OTUs in one of the three amplicon libraries

In the 12 %-salt pond, the Ciliophora dominate the taxonomic protistan OTUs [32 OTUs/27.8 %; mainly belonging to Climacostomidae (8 OTUs) and Vorticellidae (4 OTUs)], followed by the Stramenopiles (28 OTUs/24.3 %), of which the Bacillariophyta account for 19 OTUs [16.5 %; mainly Naviculaceae (9 OTUs), Hemiaulaceae and Catenulaceae (2 OTUs, respectively)] and the Chlorophyta [27 OTUs/23.5 %; mainly Dunaliellaceae (17 OTUs) and Chlamydomonadaceae (4 OTUs)]. Dinozoa [7 OTUs/6.1 %; mainly Oxyrrhinaceae and Peridiniaceae (2 OTUs, respectively)], Cryptophyta [5 OTUs/4.3 %; belonging to Pyrenomonadaceae (4 OTUs) and Geminigeraceae (1 OTU)], Cercozoa (2 OTUs/1.7 %; belonging to one Cercomonadidae and one unclassified Chlorarachniophyceae) and Amoebozoa [2 OTUs/1.7 %; belonging to Hartmannellidae and Flabellulidae (one OTU, respectively)] contributed only little to the detected OTU distribution.

In contrast to the 12 %-salt pond sample, most of the OTUs retrieved from the crystallizer pond sample blast to the Chlorophyta [30 OTUs/56.6 %; predominantly belonging to the Dunaliellaceae (24 OTUs) and Chlamydomonadaceae (3 OTUs)]. The Ciliophora are represented by 13 OTUs [24.5 %; mainly Vorticellidae and Loxocephalidae (2 OTUs, respectively)] and the Stramenopiles by 7 OTUs (13.2 %), of which the Bacillariophyta account for 6 OTUs [belonging to Naviculaceae (4 OTUs) and Bacillariaceae (2 OTUs)]. Both the Cryptophyta and the Dinozoa occur with only one OTU, belonging to Pyrenomonadaceae and Oxyrrhinaceae, respectively. Representatives of the Amoebozoa and Cercozoa were not detected in the crystallizer pond.

Comparing sequence abundance and FISH cell counts

The sequencing data show that pyro-reads assigned to the phylum Ciliophora are very abundant in most samples, accounting for 16.5 % of all pyro-reads in the 4 %-salt pond and even 65.7 % in the 12 %-salt pond (Fig. 4). In the crystallizer pond (38 % salinity), however, only 0.2 % of the obtained reads were affiliated with this taxon group. The proportion of ciliates identified via FISH in the same samples is much lower, accounting for 1 and 3.5 % of all protists in the 4- and 12 %-salt ponds, respectively. In the crystallizer pond sample, ciliates were not observed.

Fig. 4

Abundances of ciliates in each of the sampled salt ponds as detected by a FISH-cell counts and b pyrosequencing

Novel diversity

We observed numerous low-identity swarms (i.e., clustered reads) in all datasets, pointing to a high genetic novelty within all three habitats (Fig. 5). Thus, any of the three solar saltern ponds may hold a significant protistan novelty. Between 22.5 % (45 swarms, 4 %-salt pond) and 50.5 % (101 swarms, 12 %-salt pond) of the swarms showed a sequence similarity of less than 97 % to the reference database, and between 5 % (10 swarms, crystallizer pond) and 12 % (24 swarms, 12 %-salt pond) had an identity match <90 %. Interestingly, the highest degree of novel diversity was observed within the 12 %-salt pond (Fig. 5b). Here, the swarms with the lowest-identity (<97 % sequence similarity) were detected within the Ciliophora (59 swarms), the Stramenopiles (16 swarms) and the Chlorophytes (12 swarms). Swarms displaying an identity match of less than 90 % belonged to the Ciliophora and Stramenopiles (8 swarms, respectively), Amoebozoa and Choanoflagellates (3 swarms, respectively), Perkinsea and Fungi (1 swarm, respectively).

Fig. 5

Analysis of novel diversity within the three sampled salt ponds. a 4 %-salt pond, b 12 %-salt pond and c 38 %-salt pond (crystallizer pond). Each dot represents one swarm (i.e., cluster; for more information see the Materials and methods section). The different colors indicate the level of sequence similarity to a deposited reference sequence; the size of a swarm indicates the number of pyro-reads it contains. An edge weight (sequence similarity) of 90 % was chosen to discriminate between the different swarms

Within the crystallizer pond sample, swarms belonging to the Chlorophyta (50 swarms), Ciliophora (5 swarms) and Apicomplexa (1 swarm) showed a sequence similarity of <97 % (Fig. 5c). Eight Chlorophyta swarms, one Ciliophora swarm and one Mesomycetozoa swarm were classified as putatively extremely novel, having an identity match of <90 % to any previously reported sequence.

The lowest degree of genetic novelty was discovered in the 4 %-salt pond (Fig. 5a). Here, we observed the highest novelty within the Stramenopiles with 22 swarms having a sequence similarity of <97 % and the Ciliophora (10 swarms). Swarms affiliating with the Cryptophyta (4 swarms), Chlorophyta (3 swarms) and Dinozoa (1 swarm) displayed an identity match between 90 and 97 %. Putative extremely novel eukaryotic reads were detected within the Stramenopiles (10 swarms), Cercozoa (2 swarms) and Fungi (1 swarm).
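The percentages reported in this section follow from simple threshold counts over the 200 largest swarms per pond. The short Python sketch below illustrates the calculation; the function name is hypothetical, and the individual identity values in the usage example are placeholders chosen only to reproduce the swarm counts reported for the 12 %-salt pond (101 swarms <97 %, of which 24 are <90 %).

```python
def novelty_summary(best_hit_identities, novel_cutoff=97.0, highly_novel_cutoff=90.0):
    """Count seeds below each identity cutoff and give their share of all seeds."""
    n = len(best_hit_identities)
    below_novel = sum(1 for i in best_hit_identities if i < novel_cutoff)
    below_highly = sum(1 for i in best_hit_identities if i < highly_novel_cutoff)
    return {
        "n_swarms": n,
        "below_97": below_novel,
        "below_97_pct": 100.0 * below_novel / n,
        "below_90": below_highly,
        "below_90_pct": 100.0 * below_highly / n,
    }

# Placeholder identities: 24 swarms <90 %, 77 swarms between 90 and 97 %, 99 swarms >=97 %
identities_12_pct_pond = [85.0] * 24 + [95.0] * 77 + [99.0] * 99
summary = novelty_summary(identities_12_pct_pond)
```

With these inputs, the summary gives 50.5 % of swarms below 97 % identity and 12 % below 90 %, matching the values stated for the 12 %-salt pond.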

Discussion

Deep sequencing expands the range of protistan plankton diversity known from solar salterns

With almost saturated sampling profiles, our high-throughput sequencing study is the first to provide a detailed description of the protistan plankton inventory along a salt gradient in solar saltern ponds. Our study revealed a broad repertoire of protists from every major eukaryotic lineage and expands the range of protists known to live in high-salt environments. The nature of the study design, however, does not allow for a stringent quantitative assessment of species abundances, since neither sequence numbers nor artificial OTU classifications based on compromise sequence similarity values reflect real protistan species numbers (Zhu et al. 2005; Lynn 2008; Amend et al. 2010; Medinger et al. 2010). The substantial incongruence between gene amplicon abundances and morphospecies abundances in an environmental sample strongly supports the recommendation to use incidence-based statistics for community structure analyses and diversity partitioning rather than abundance-based statistics (Dunthorn et al. 2014a). There are two reasons for the overestimated ciliate abundances. First, even the smallest ciliates observed in our samples (ca. 10 µm) are still larger than most flagellates in these samples. Because SSU rDNA copy number correlates significantly and positively with cell size (Zhu et al. 2005), the 18S rDNA copy number of one individual ciliate exceeds the rDNA copy numbers of most (if not all) individual flagellates present in a sample. Second, as reviewed in Dunthorn et al. (2014b), ciliates have numerous and highly variable genome copy numbers. Within ciliates, differential processing of macronuclear chromosomes can lead to hundreds of DNA copies in some taxa (e.g., Colpodea, Litostomatea, Prostomatea, and Oligohymenophorea), while leading to thousands of copies in other taxa (e.g., Phyllopharyngea and Spirotrichea). Dunthorn et al. (2014b) make the consequences clear in a simple example: if you sample a freshwater pond morphologically you might find one Halteria grandinella (Spirotrichea) for every ten Coleps hirtus (Prostomatea), but molecular estimates could tilt toward a 1:1 abundance pattern. This example nicely demonstrates that gene amplicon numbers obtained by sequencing an environmental sample can be incongruent with the numbers of organisms in that sample. We provide taxon count data (FISH with a ciliate-specific probe) along with our sequence data to test these previously assumed incongruences between gene data and morphotype data.
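The Halteria/Coleps example can be made concrete with a small calculation. The sketch below assumes that the expected number of amplicon reads scales with cell count times rDNA copies per cell, with no extraction or PCR bias; the copy numbers (5,000 and 500) are hypothetical values chosen only to be consistent with "thousands" versus "hundreds" of copies and to reproduce the shift from 1:10 to 1:1.

```python
def expected_read_ratio(cells_a, copies_a, cells_b, copies_b):
    """Expected read ratio (taxon A : taxon B), assuming reads scale with
    cell count times rDNA copies per cell and no other bias."""
    return (cells_a * copies_a) / (cells_b * copies_b)

# Hypothetical copy numbers, for illustration only
halteria_cells, halteria_copies = 1, 5_000   # Spirotrichea: thousands of copies
coleps_cells, coleps_copies = 10, 500        # Prostomatea: hundreds of copies

cell_ratio = halteria_cells / coleps_cells   # 1:10 by morphological counts
read_ratio = expected_read_ratio(halteria_cells, halteria_copies,
                                 coleps_cells, coleps_copies)  # 1:1 by reads
```

A tenfold difference in copy number per cell is thus sufficient to mask a tenfold difference in cell abundance.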

Overall, ciliates account for the highest proportion of the protist diversity in the salt ponds of the Ria Formosa lagoon (Fig. 3). This success of many ciliates in halo-adaptation is confirmed by several microscopy studies (Elloumi et al. 2006; Cho et al. 2008; Elloumi et al. 2009a; Lei et al. 2009; Shao et al. 2014; Foissner et al. 2014a, b). However, the differences observed in ciliate diversity among individual solar salterns are dramatic: in the Sfax solar saltern ponds, Elloumi et al. (2006) recorded a total of 26 different ciliate morphospecies, lacking stichotrichs and phyllopharyngeans. By contrast, Lei et al. (2009) discovered 98 morphospecies in ponds of a solar saltern works on the Yellow Sea coast. Stichotrichs and hypotrichs accounted for high quantitative proportions in these ponds. In contrast to the Sfax saltern ponds (Elloumi et al. 2006), no species of the classes Colpodea, Karyorelictea and Armophorea were detected, although these taxa are known from low-salt environments (Lynn 2008; Dunthorn et al. 2014a). Triadó-Margarit and Casamayor (2013) found ciliate-affiliated SSU rDNA sequences from five classes (Spirotrichea, Oligohymenophorea, Prostomatea, Heterotrichea and Litostomatea) at 34 different sampling sites. Another molecular study by Heidelberg et al. (2013) detected 14 ciliate 18S rRNA sequences in the hypersaline Lake Tyrrell, Australia, which were related to a heterotrich, a plagiopylean and a stichotrich. We discovered 86 OTUs from seven of the eleven described ciliate classes (Lynn 2008) in the Ria Formosa solar salterns (Fig. 3, Online Resource 1). The highest diversity, with 10 ciliate families, was found in the class Oligohymenophorea (see Fig. 2 and Online Resource 1).

Our FISH observations of samples from the Ria Formosa salt ponds revealed the highest relative abundance of ciliates at 12 % salinity (Fig. 4). Elloumi et al. (2009a) reported similar results, observing the highest proportion of ciliates in a 10 %-salt pond in a solar saltern in Sfax, Tunisia, compared to other salt ponds in the same solar saltern works. This increased relative biomass could result from successful haloadaptation in this lineage combined with a copious nutritional supply, specifically for larger ciliates. Such nutritional supply may come from the small chlorophyte Dunaliella, which provides a high-quality food source for ciliates, including Fabrea salina (Post et al. 1983; Pandey and Yeragi 2004). Indeed, Fabrea (family Climacostomidae) makes up about one-third of the total ciliate sequences in the 12 %-salt sample, and we microscopically observed large numbers of F. salina with ingested Dunaliella-like cells. By contrast, we did not find any Fabrea-like sequences (or cells) in the 4 %-salt sample. The increase in relative ciliate abundance from the 4 %-salt pond towards the 12 %-salt pond is accompanied by a decrease in ciliate diversity (Figs. 3, 4), corroborating previous microscopy studies (Elloumi et al. 2006, 2009a; Lei et al. 2009). The decrease in ciliate diversity may be due to the highly variable salinity tolerances within ciliate groups (see also Lei et al. 2009). While some ciliate families like Didiniidae, Mesodiniidae, Codonellidae and Codonellopsidae disappear at salinities higher than 4 %, other more specialized families like Acinetidae or Frontoniidae first occur at salinities above 12 %. Only a few ciliate groups are able to cope physiologically with elevated salt concentrations, and therefore only the best-adapted groups survive. These ciliates additionally benefit from reduced competition with other ciliate groups for the same food resources (most ciliates are phagotrophs) and can therefore develop larger populations of individual successful species. It cannot be excluded that other, co-varying environmental parameters, such as temperature and pH, also have a significant influence on community composition and structure.

Changes in other protistan taxa along salinity gradients are mainly discussed as a “black box” in the current literature, without detailed taxonomic resolution. In Sfax solar salterns, diatoms account for ca. 30 % of the total phytoplankton biomass in the 4.5 % salt pond, and dinoflagellates for the vast majority of the biomass in ponds with ca. 10 % salt (Elloumi et al. 2009a). A dramatic change occurs in ponds with higher salt concentration in Sfax, where chlorophytes (Dunaliella) exclusively accounted for the eukaryote (phyto)plankton (Elloumi et al. 2009a). Pigment analyses and inverted microscopy conducted in solar salterns in Spain largely agree with these principal community changes of phytoplankton along a salt gradient from 4 to 37 % salt (Estrada et al. 2004). In the same study, cryptophytes were also identified as major primary producers in ponds with 4–5.4 % salt, but they could not be recorded in Sfax solar salterns (Elloumi et al. 2009a, b). Our deep sequencing study confirms that the diversity of these taxon groups peaks in the 4 % salt pond, with a dramatic drop in diversity at higher salinities. However, in the crystallizer pond, we still recorded a dinoflagellate OTU from Oxyrrhinaceae, a cryptophyte OTU from Pyrenomonadaceae and diatom OTUs from Bacillariaceae and Naviculaceae. Even though represented by only three families, the chlorophytes contribute a major part of the protistan plankton community in the crystallizer pond. Differences in protistan community structures in ponds of different salinity also became obvious in the few molecular-based protistan diversity studies (Casamayor et al. 2002, 2013; Triadó-Margarit and Casamayor 2013). Unfortunately, quantitative assessments of these changes for individual evolutionary lineages cannot be inferred from these studies.

Until now, numerous small heterotrophic flagellates have been reported from high-saline ponds by microscopic analyses (Ruinen 1938; Post et al. 1983; Patterson and Simpson 1996; Park et al. 2007; Park and Simpson 2010). Patterson and Simpson (1996) alone identified 17 species from a pond of 6 % salt, declining to only five species in a saturated pond. Interestingly, the majority of the new species described in that study originated from hypersaline sites. Our pyrosequencing study reveals a similarly high diversity of these organisms. Even though it is assumed that most heterotrophic flagellates disappear above a salinity of 25 % (Pedrós-Alió 2004), this and other studies indicate that the diversity of these microorganisms in hypersaline habitats is strongly underestimated (Park et al. 2003; Stock et al. 2012). Our study and previous studies (Casamayor et al. 2013; Triadó-Margarit and Casamayor 2013; Heidelberg et al. 2013) clearly question the traditional view that most extreme environments on Earth harbor a relatively low diversity of eukaryotic microorganisms, mostly restricted to a few evolutionary lineages like some algae, heterotrophic flagellates and fungi (Weber et al. 2007).

The only minor increase in observed diversity compared with the results of microscopy-based or Sanger sequencing studies can be attributed to the following reasons: (1) The choice of PCR primers can introduce a bias due to their selectivity (Stoeck et al. 2006, 2010; Jumpponen 2007; Engelbrektson et al. 2010). (2) The V4 regions contain a large diversity of secondary structure types (Wuyts et al. 2000). Large introns in the V4 region result in complex secondary structures that may resist pyrosequencing. The ciliate Euplotes, for example, has such introns in the V4 region and remained undetected in this dataset (but could be revealed by Sanger sequencing of the same sample, unpublished data). (3) Genetic diversity markers do not always allow OTUs to be translated reliably into taxonomic units (Caron et al. 2009; Nebel et al. 2010). While, for example, 2–3 % sequence divergence in the V4 region of ciliates accounts for intraspecific gene heterogeneity (Nebel et al. 2010; Dunthorn et al. 2012), a dissimilarity of less than 1 % already distinguishes different species or genera in other taxon groups such as dinoflagellates (Dinophysis, Ki 2012) or cryptophytes (personal communication, Dr. Lucie Bittner). (4) Available databases are incomplete, especially when it comes to organisms living in extreme environments. Some organisms, although described morphologically, have no sequence data deposited in the databases and will therefore be missed in sequence data analyses.

Novel diversity in solar salterns

Investigating genetic novelty in molecular datasets is an emerging subject in diversity research (Lynch et al. 2012; Casamayor et al. 2013; Triadó-Margarit and Casamayor 2013). Several approaches exist to address this issue. Casamayor et al. (2013) and Triadó-Margarit and Casamayor (2013) explored the genetic novelty of their DGGE sequences by BLAST identity searches, retaining and relating the closest environmental match and the closest cultured match, whereas Lynch et al. (2012) proposed a strategy suitable for high-throughput data. They conducted network analyses of their bacterial data, combining taxonomic affiliation, sequence similarity to a BLAST hit and pairwise similarity among the sequences themselves, and were able to identify numerous sequence clusters of phylogenetic novelty. In our study, we modified this approach using the novel clustering algorithm Swarm (Mahé et al. 2014). Unlike popular de novo clustering methods, Swarm neither relies on an arbitrary global clustering threshold (since lineages evolve at varying rates, no single sequence similarity value can accommodate the entire tree of life) nor depends on the input order of the amplicons. Instead, it uses a local clustering threshold to assign amplicons to an OTU. OTUs grow iteratively by comparing each generation of assigned amplicons to the remaining amplicons. An OTU is closed when no further amplicon can be integrated into it [for more detailed information see Mahé et al. (2014)]. This strategy produces robust and high-resolution OTUs that allow for accurate, meaningful and interpretable biological results.
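The iterative growth of OTUs described above can be illustrated with a minimal Python sketch. It is not the Swarm implementation of Mahé et al. (2014), only a simplified illustration under stated assumptions: the function names (edit_distance, grow_otus), the toy sequences, the use of a plain Levenshtein distance and the choice of the most abundant remaining amplicon as the seed are all hypothetical and chosen for readability.

```python
from collections import deque


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def grow_otus(amplicons: dict[str, int], d: int = 1) -> list[list[str]]:
    """Group amplicons into OTUs by iterative growth with a local threshold d.

    `amplicons` maps each unique sequence to its read abundance (toy input).
    """
    # Assumption: seed each OTU with the most abundant unassigned amplicon;
    # ties are broken alphabetically so the result does not depend on input order.
    remaining = sorted(amplicons, key=lambda s: (-amplicons[s], s))
    otus = []
    while remaining:
        seed = remaining.pop(0)
        otu = [seed]
        generation = deque([seed])
        # Compare each newly assigned amplicon to the remaining amplicons.
        while generation:
            current = generation.popleft()
            linked = [s for s in remaining if edit_distance(current, s) <= d]
            for s in linked:
                remaining.remove(s)
                otu.append(s)
                generation.append(s)
        # The OTU is closed once no remaining amplicon can be integrated.
        otus.append(otu)
    return otus


# Hypothetical toy data: sequence -> read abundance
example = {
    "ACGTACGT": 50,
    "ACGTACGA": 5,
    "ACGTACCA": 2,
    "TTTTGGGG": 10,
    "TTTTGGGC": 1,
}
otus = grow_otus(example, d=1)
```

In this toy example, ACGTACCA differs from the seed ACGTACGT by two positions, yet it joins the same OTU because it lies within one difference of the intermediate amplicon ACGTACGA. This chaining is what distinguishes a local threshold from a global one, under which the third amplicon would have to lie within a fixed distance of the seed itself.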

Our approach revealed a high degree of genetic novelty of salt-adapted protists at moderately saline (4 %) and hypersaline (12, 38 %) conditions. Previous molecular studies already reported that saline lakes represent important reservoirs of organisms that thus far had not been detected elsewhere (Casamayor et al. 2013; Triadó-Margarit and Casamayor 2013). In their DGGE study, Triadó-Margarit and Casamayor (2013) detected the highest novelty within the Fungi, Choanoflagellida, Cercozoa, Prasinophyceae and Telonemida for less saline ponds (salinity <6.5 %) and within the Choanoflagellida, Bicosoecida, Cercozoa, Fungi, Centroheliozoa and Trebouxiophyceae for hypersaline ponds (salinity >6.5 %). Analyzing only 73 phylotypes, however, they captured just the tip of the iceberg of an as yet undetected complex protistan community. The present survey therefore contributes to revealing the full extent of the yet unseen microbial diversity within one specific solar saltern works. Our deep-sequencing approach uncovered putative novel organisms within almost every major eukaryotic taxonomic group, extending the list provided by Triadó-Margarit and Casamayor (2013) with novel sequences found within the Ciliophora, Dinozoa, Stramenopiles, Cryptophytes and Chlorophytes in the less saline pond (4 % salinity) and within the Ciliophora, Amoebozoa, Stramenopiles, Perkinsea and Mesomycetozoa in the hypersaline ponds (>12 % salinity). The observed degree of novelty was evenly distributed neither among the different sampled salinities nor among the different taxa. It is not surprising to find the highest novelty level at an intermediate salt concentration of 12 %, considering that hypersaline environments only recently gained scientific attention, whereas marine environments have been very well studied for a long time. Consequently, public databases contain a broad accumulation of sequence data derived from marine environments, but currently lack a comparable amount of information from hypersaline habitats. Given the low diversity in the crystallizer pond (compared to the observed diversity in the 4 and 12 % salt ponds), it is also not surprising to find a reduced level of novel diversity in this sample.

We observe such high genetic divergences not only in small swarms (i.e., clusters with <200 reads), which represent low-abundance taxa that often escape microscopic observation (Lynch et al. 2012), but also in large swarms (>200 reads) related to ciliate, chlorophyte and stramenopile species. This finding exposes pronounced gaps in the available reference databases and demonstrates how far we still are from knowing the complete protistan inventory in such extreme environments. Further efforts in sampling, isolation, cultivation and description are therefore necessary to capture the complete protistan repertoire in high-saline environments. Additionally, more effort should be invested in the development of new strategies to target novel and unknown organisms in samples (e.g., design of species-specific probes or primers based on retrieved sequence information to “catch” novel organisms and their full-length 18S rDNA gene, respectively).

Transition boundaries within an extreme salt gradient

Marine-freshwater transition boundaries for protists are well known from the literature [see review of Logares et al. (2009); Forster et al. (2012)]. From our results we can conclude that a transition boundary also exists within a salt gradient, clearly separating the protistan community thriving in 4 % salt from the communities in the 12 and 38 % salt ponds. A salinity of 15 % was reported previously as a threshold for plankton organisms along a salinity gradient of the salt works of Sfax using microscopy analyses (Elloumi et al. 2009a). Using molecular fingerprinting (T-RFLP and DGGE), Casamayor et al. (2002) found two groups of eukaryote diversity, one of which was selected in the marine salinity range (4–5 % salinity) and another group above 8 % salinity. A similar pattern applies to prokaryotes, which were selected at 15 % salt in a Spanish solar saltern (Casamayor et al. 2002).

Salt concentrations of this magnitude (8–15 %) seem to select specific eukaryotic halophiles adapted to high-salt environments. For halophilic bacteria, two fundamentally different processes have been described as adaptations to high-salt environments [reviewed in Oren (2008)]. The “salt-in-strategy” involves the intracellular accumulation of molar concentrations of chloride and potassium. Because proteins need to retain their functional conformation and activity at high intracellular salt concentrations, massive adaptations of the enzymatic machinery are mandatory (Lanyi 1974). The proteomes of halophilic microorganisms with a salt-in-strategy are highly acidic and denature when suspended in low salt. If such a strategy is also common in microbial eukaryotes, this could explain why a range of taxa that are adapted to higher salinities (12 % and above) cannot survive in environments with lower salt concentrations. Thus far, only three different salt-tolerant fungi have been described among microbial eukaryotes that, among other adaptations, have acidic proteomes as an adaptation to higher salt concentrations (Gostinčar et al. 2011; Kis-Papo et al. 2014). The metabolic constraints that prevent many protists from crossing this environmental barrier from lower to higher salt concentrations are unknown.

Further evidence for this hypothesis comes from the observed salt tolerances of several protists, such as flagellates (e.g., Hauer and Rogerson 2005; Park et al. 2007; Park and Simpson 2010), amoebae (e.g., Park et al. 2009; Hauer and Rogerson 2005) or ciliates (e.g., Elloumi et al. 2009a; Lei et al. 2009; Foissner et al. 2014b). The available data from ciliates demonstrate growth either below 8 % salt or above it (e.g., 8–30 %). Only a few taxa, such as the ciliates Balanion, Cladotricha, Fabrea or Condylostoma, cross this boundary (Elloumi et al. 2009a; Lei et al. 2009). While no V4 read data are publicly available for the first two taxa, both Fabrea-like and Condylostoma-like sequences were detected in all three salt ponds. Furthermore, resident and therefore well-adapted protists in high-salt environments may hinder invasive species from settling and establishing influential communities, as reported for multicellular organisms (Vermeij and Dudely 2000).

In conclusion, we infer from this deep-sequencing study that protistan diversity in high-salt ponds is actually much higher than proposed previously by molecular studies, with numerous taxonomic groups also present in crystallizer ponds. The incongruences found among the few protistan diversity studies in solar saltern ponds are an interesting observation. These may partly be due to the different methodologies applied to uncover protistan diversity and to the different units used to measure it, along with the different strengths and shortcomings of these methods and measures [discussed in detail in Stoeck et al. (2014)]. However, true biological differences, such as specific biogeographic patterns, may also contribute to this observation. An in-depth analysis of protistan community structures in different solar saltern works and other hypersaline habitats around the globe will shed light on a possible geographic structuring of protistan plankton communities thriving in high-salt environments. Because microscopy studies are time-consuming and, in ecological studies, tend to record predominantly the described and most abundant species, deep sequencing is probably the method of choice to follow up on this interesting subject. With the present study, we have laid a cornerstone and set a benchmark for subsequent comparative genetic diversity surveys in solar salterns.