Introduction

Warming in the Arctic has been much faster than in the rest of the world (Ono et al. 2022), a phenomenon referred to as Arctic amplification (Serreze and Francis 2006). Among other effects, ongoing climate change can lead to shifts in sea ice age and thickness (Stroeve and Notz 2018). The White Sea, a semi-enclosed southern inlet of the Barents Sea, is facing the same challenges.

Within sea ice, amidst the intricate lattice of ice crystals, eukaryotic microorganisms can find a place where they can thrive. The brine channels within the ice matrix are particularly welcoming to protists. Created during the ice formation process as salt is excluded, these brine channels offer everything necessary for microbial life (Berger et al. 2001; Filatov et al. 2007). Currently in the Arctic, multi-year sea ice is being replaced by first-year ice, which is more active and contains more nutrients. This process may lead to community changes in the sea ice habitats that favor picoeukaryotes, e.g., heterotrophic stramenopiles, a phenomenon demonstrated in phytoplankton community studies (Morán et al. 2010; Acevedo-Trejos et al. 2014; Swart et al. 2015). Thus, picoeukaryotes could prevail over larger protists with the expansion of the first-year ice areas.

There are very few studies focusing on picoeukaryotes in Arctic and subarctic sea ice. A previous investigation of pico-sized autotrophs revealed a diverse algal community, including several taxa not previously observed in ice (Belevich et al. 2018). Picoautotrophs are important primary producers whose abundance has been increasing in recent years, while the abundance of nanophytoplankton has been decreasing (Li et al. 2009). They make a significant contribution to the energy flows and food chains in aquatic ecosystems (Fenchel 1988; Sanders et al. 1989; Foissner 1991). Heterotrophic protists, particularly flagellates, can control bacterial communities in marine ecosystems (Fenchel 1988). Investigations of sea ice protist diversity are relatively limited and largely rely on traditional techniques such as light microscopy, epifluorescence microscopy, flow cytometry, and fluorescence in situ hybridization (FISH). While these methods are valuable tools for studying protist communities, they are limited in their ability to capture the entire diversity of heterotrophic picoeukaryotes. The challenges arise from the inherent complexity and potential inaccuracies associated with these procedures, especially when applied to small organisms. These limitations have, to some extent, constrained the scale of previous studies (Pawlowski et al. 2018, 2022). The metabarcoding approach has since revolutionized our understanding of microbial diversity, with numerous studies employing this method (Kilias et al. 2014; Meshram et al. 2017; Belevich et al. 2018). Metabarcoding has been shown to identify many more planktonic organisms than traditional microscopy (de Vargas et al. 2015).

The White Sea is located in the subarctic zone, but its abiotic conditions share a number of features with the Arctic seas, e.g., low insolation and polar nights in winter, strong riverine discharge, pronounced seasonality of biological processes, and the formation of first-year ice that persists for up to 6 months, while experiencing long days with the midnight sun during summer (Berger et al. 2001). The White Sea serves as a unique and dynamic environment for a diverse range of heterotrophic protists. Its hydrography is influenced by various factors, including inflowing rivers, temperature gradients, and seasonal ice cover. This combination of factors contributes to the formation of distinct ecological niches within the White Sea, making it an interesting subject for studying the microbial communities of heterotrophic protists (Filatov et al. 2007). Here we report the results of a metabarcoding study of heterotrophic picoeukaryote diversity in the sea ice of the Kandalaksha Gulf, the White Sea, Russia, by Illumina high-throughput sequencing of the V4 region of the 18S rRNA gene. Given the inaccessibility of the Arctic seas in winter, which limits the possibility of studying the first-year ice biota, the results obtained for the White Sea can supplement the scarce data available on the diversity of sea ice heterotrophic protists.

Materials and methods

Sampling and study area

The sampling was carried out on 16–19 March 2014 in the Kandalaksha Gulf, the White Sea (Russia), near the White Sea Biological Station (WSBS), Lomonosov Moscow State University (66°33′ N, 33°06′ E). A total of four land-fast sea ice stations located in the Rugozerskaya Inlet-Velikaya Salma Strait system were visited by snowmobile (Fig. 1). The distance between stations did not exceed 10 km. At each station, three to five cores were collected using a manual titanium ice corer (14 cm diameter, TM “Avtomatica,” Russia). The ice temperature was measured immediately after sampling with a Testo 108 probe (Testo, Lenzkirch, Germany). The length of the cores varied from 0.45 to 0.58 m between the stations. All cores (15 in total) were placed in sterile black Whirl-Pak bags and brought to the WSBS laboratory, where one core from every station was melted without addition of seawater and processed for salinity analysis with a conductivity probe (Cond 3150i, WTW). The remaining two to four cores were cut into several parts (upper 10 cm, middle part, which varied from 35 to 45 cm, and lower 10 cm) with a stainless steel saw. Ice melting took place in a dark isothermal container with the addition of 0.22-μm-filtered surface seawater collected at the time of sampling to minimize the osmotic shock effect on microorganisms during melting (Garrison and Buck 1986). The samples were melted at room temperature for approximately 20 h to decrease the effects of biological processes such as growth, predation, and degradation, as recommended by Rintala et al. (2014). The air temperature data from the weather station at the White Sea Biological Station were also recorded (Table 1).

Fig. 1 Sampling sites 1–4 in the Kandalaksha Gulf, the White Sea in the vicinity of the White Sea Biological Station of Lomonosov Moscow State University (WSBS MSU)

Table 1 Characteristics of ice cores at the sampling stations (Tice, Tair, and Twater—ice, air, and under-ice water temperature, °C; Sice and Swater—ice and under-ice water salinity, ‰; depth—depth of water column, m)

Enumeration of heterotrophic picoeukaryotes

Epifluorescence microscopy was used to determine the abundance of protists and differentiate between photo- and heterotrophic eukaryotes based on the presence or absence of chlorophyll (Chl) autofluorescence (Bloem et al. 1986). To separate pico-sized organisms, water samples from each melted section of the ice core were filtered through a 2-μm-pore-size polycarbonate filter using reverse filtration (50 mmHg vacuum). The abundance of heterotrophic picoeukaryotes was determined using the fluorochrome primulin and membrane filters (0.12 μm pore diameter) prestained with Sudan black (Bloem et al. 1986). At WSBS, the cells were counted under a Leica DM2500 epifluorescence microscope equipped with a digital camera and imaging software from Leica Microsystems at a total magnification of 100 × 10 × 1.3. Depending on the concentration of the cells, 30 to 50 fields were examined. To estimate biomass, we used a volume-to-carbon conversion factor of 220 fg C/µm³ (Borsheim and Bratbak 1987). Cell abundance determined from ice samples was corrected for added seawater using a dilution factor ranging from 1.3 to 2.4.
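The dilution correction and carbon conversion described above can be illustrated with a minimal Python sketch. It assumes that the dilution factor is the ratio of total melted volume to ice-melt volume, so that abundance is multiplied by it, and that the added 0.22-μm-filtered seawater contributes no cells. The function names, the example count, and the mean cell biovolume are hypothetical; only the conversion factor of 220 fg C/µm³ and the dilution-factor range come from the text.

```python
# Volume-to-carbon conversion factor quoted in the text (fg C per cubic micrometre).
FG_C_PER_UM3 = 220.0


def correct_for_dilution(cells_per_l_melted: float, dilution_factor: float) -> float:
    """Scale abundance counted in the diluted melt back to undiluted ice melt.

    Assumption: dilution_factor = (ice melt + added seawater) / ice melt.
    """
    if dilution_factor < 1.0:
        raise ValueError("dilution factor must be >= 1")
    return cells_per_l_melted * dilution_factor


def biomass_ug_c_per_l(cells_per_l: float, mean_biovolume_um3: float) -> float:
    """Convert abundance and mean cell biovolume into carbon biomass (ug C per litre)."""
    fg_c_per_l = cells_per_l * mean_biovolume_um3 * FG_C_PER_UM3
    return fg_c_per_l * 1e-9  # 1 ug = 1e9 fg


# Hypothetical example: 80,000 cells per litre counted in a melt diluted 1.5-fold,
# with an assumed mean cell biovolume of 10 cubic micrometres.
abundance = correct_for_dilution(80_000, 1.5)
biomass = biomass_ug_c_per_l(abundance, 10.0)
```

With these illustrative inputs the corrected abundance is 120,000 cells per litre and the biomass is about 0.26 µg C per litre; the inputs were chosen only to fall in the range reported in the Results and are not measured values.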

DNA isolation

Surface, middle, and bottom sections of each melted ice core were combined, and then 3–5 L of water was also filtered through a 2-μm-pore-size polycarbonate filter using reverse filtration. Then, the obtained filtrate was sedimented on 0.2-μm Sterivex units (Millipore Canada Ltd., Mississauga, ON, Canada). Every unit was filled with a buffer (1.8 mL of 50 mM Tris–HCl, 0.75 M sucrose, and 40 mM EDTA; pH 8.3); thereafter, units were frozen and kept at − 80 °C until DNA extraction. Genomic DNA was extracted with the NucleoSpin Plant kit (Macherey–Nagel, Düren, Germany) following the manufacturer’s instructions.

DNA amplification and sequencing

Illumina MiSeq technology was used to analyze the V4 hypervariable region of the SSU ribosomal DNA. This region can describe the cryptic diversity of microeukaryotes even among the smallest protists (López-García et al. 2001; Moon-van der Staay et al. 2001). To amplify the V4 region, the following primer pair was used: forward EuF-V4 (5′-CCAGCASCCGCGGTAATWCC-3′) (Elwood et al. 1985) and reverse picoR2 (5′-AKCCCCYAACTTTCGTTCTTGAT-3′) (Belevich et al. 2018). PCR was performed with the Encyclo Plus PCR kit (Evrogen, Russia) under the following conditions: an initial denaturation step for 3 min at 94 °C, followed by 30 cycles (denaturation at 94 °C for 20 s, annealing at 55, 60, or 65 °C for 20 s, and extension at 72 °C for 40 s) and a final extension at 72 °C for 5 min. PCR products obtained at the three annealing temperatures were combined. Amplicons were cleaned with a Cleanup Mini kit (Evrogen, Russia) and used to prepare the libraries for Illumina tag sequencing.
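Both primers contain degenerate positions written in IUPAC ambiguity codes (S and W in EuF-V4, K and Y in picoR2). As an illustration only, the following Python sketch expands a degenerate primer into a regular expression and counts how many distinct oligonucleotides it encodes. The primer sequences are taken from the text; the function names and the test reads are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes, written as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

EUF_V4 = "CCAGCASCCGCGGTAATWCC"      # forward primer quoted in the text
PICO_R2 = "AKCCCCYAACTTTCGTTCTTGAT"  # reverse primer quoted in the text


def primer_to_regex(primer: str) -> re.Pattern:
    """Compile a degenerate primer into a regex that matches every encoded variant."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def count_variants(primer: str) -> int:
    """Number of distinct non-degenerate oligonucleotides encoded by the primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base].strip("[]"))
    return n


# Hypothetical read fragment carrying one variant of the forward primer (S=G, W=T).
read_start = "CCAGCAGCCGCGGTAATTCC"
has_forward_primer = primer_to_regex(EUF_V4).fullmatch(read_start) is not None
```

Each primer has two twofold-degenerate positions, so each encodes four oligonucleotide variants. Exact matching is used here for simplicity; primer-trimming tools typically also tolerate a small number of mismatches.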

The TruSeq Nano DNA Kit (Illumina, USA) was used to construct paired-end libraries of the amplicons. To test the effective concentration of the libraries, primers I-qPCR-1.1 (5′-AATGATACGGCGACCACCGAGAT-3′) and I-qPCR-2.1 (5′-CAAGCAGAAGACGGCATACGA-3′) (Takahashi et al. 2014) were used with the PhiX Control v3 library (Illumina) as a positive control. Libraries were diluted to 12 pM. Sequencing was performed with the MiSeq Reagent Kit v.2 for 500 cycles on an Illumina MiSeq sequencer (Lomonosov Moscow State University). The resulting 250 bp paired-end reads were sufficient to cover the length of the V4 region.

18S sequence processing

Adapter sequences were trimmed from raw reads using Trimmomatic (Bolger et al. 2014). CutAdapt was used to remove primer sequences and to discard sequences without primers (Martin 2011). Forward and reverse read pairs were merged using USEARCH 10 (Edgar 2010). Quality filtering was applied with the USEARCH 10 command ‘fastq_filter’ with a maximum total expected error of 1 (fastq_maxee 1). Merged reads shorter than 300 bp or longer than 500 bp were removed at that step. The FastQC tool was used for sequence visualization and quality checking at each of the previous steps. The USEARCH 10 command ‘cluster_otus’ with the UPARSE algorithm (Edgar 2013) was used for de novo clustering of operational taxonomic units (OTUs) at the standard 97% identity threshold and to identify possible chimeric sequences. Then, the USEARCH 10 command ‘otutab’ was used to build an OTU table with an identity threshold of 97%. Taxonomy assignment was carried out using the SINTAX algorithm with PR2 4.12.0 as the reference database. Sequences belonging to non-protist phyla were deleted. After taxonomy assignment, all OTUs that remained unclassified at the domain or phylum level were manually classified by BLASTN with standard parameters. All taxa not belonging to heterotrophic or mixotrophic protists according to the literature review were excluded from the analysis. Singleton and doubleton OTUs were also removed to minimize the impact of likely spurious sequences.
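To make the filtering criteria concrete, the following Python sketch shows how a total expected error is derived from Phred quality scores and how the length window and the removal of low-abundance OTUs could be applied. It is an illustration of the criteria named above, not the USEARCH implementation. The function names, the Phred+33 encoding, and the reading of "singletons and doubletons" as OTUs with a total of one or two reads are assumptions.

```python
from typing import Dict


def expected_errors(quality_string: str, phred_offset: int = 33) -> float:
    """Total expected errors: the sum of per-base error probabilities 10^(-Q/10)."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(seq: str, qual: str, max_ee: float = 1.0,
                  min_len: int = 300, max_len: int = 500) -> bool:
    """Keep a merged read only if it is 300-500 bp long and has expected errors <= 1."""
    return min_len <= len(seq) <= max_len and expected_errors(qual) <= max_ee


def drop_rare_otus(otu_sizes: Dict[str, int], min_size: int = 3) -> Dict[str, int]:
    """Remove OTUs supported by fewer than min_size reads (singletons and doubletons)."""
    return {otu: n for otu, n in otu_sizes.items() if n >= min_size}


# Hypothetical reads: 400 bases at Q40 ('I') pass, 400 bases at Q10 ('+') fail.
good = passes_filter("A" * 400, "I" * 400)
bad = passes_filter("A" * 400, "+" * 400)
kept = drop_rare_otus({"OTU1": 120, "OTU2": 2, "OTU3": 1, "OTU4": 3})
```

At Q40 each base contributes an error probability of 0.0001, so a 400 bp read accumulates about 0.04 expected errors and is retained, whereas the same read at Q10 accumulates about 40 and is discarded.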

Visualizations were produced with the R packages ggplot2 (Wickham 2011) and phyloseq (McMurdie and Holmes 2013). Read counts for each station were rarefied to the depth of the station with the fewest reads using the phyloseq function ‘rarefy_even_depth’.
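The idea behind rarefying to an even depth can be sketched in Python as follows. This is not the phyloseq implementation: it subsamples reads without replacement, whereas the text does not state which sampling mode was used, and the function names, the seed, and the toy count table are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List


def rarefy(counts: List[int], depth: int, rng: random.Random) -> List[int]:
    """Draw `depth` reads without replacement from one station's per-OTU read counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads at this station")
    # Expand counts into one entry per read, labelled by OTU index.
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    drawn = Counter(rng.sample(pool, depth))
    return [drawn.get(otu, 0) for otu in range(len(counts))]


def rarefy_even_depth(table: Dict[str, List[int]], seed: int = 1) -> Dict[str, List[int]]:
    """Subsample every station to the read depth of the shallowest station."""
    rng = random.Random(seed)
    depth = min(sum(counts) for counts in table.values())
    return {station: rarefy(counts, depth, rng) for station, counts in table.items()}


# Toy table: reads per OTU at two hypothetical stations.
toy = {"station_a": [50, 30, 20], "station_b": [10, 5, 5]}
rarefied = rarefy_even_depth(toy)
```

After rarefaction both stations contain 20 reads; the shallowest station is returned unchanged, and the deeper one is reduced to a random subsample, so relative abundances can be compared at equal sequencing depth.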

The maximum likelihood (ML) method implemented in IQ-TREE multicore version 1.6.1 was used for phylogenetic tree reconstruction (Nguyen et al. 2015). OTU sequences were aligned with the multiple sequence alignment tool MUSCLE (Edgar 2004). The best substitution model was selected with jModelTest (Guindon and Gascuel 2003; Darriba et al. 2012), which identified GTR + I + G as the best-fit model. The IQ-TREE analysis was performed with 1000 standard bootstrap replicates. Additionally, the PR2 4.12.0 database was used to obtain Cercozoa sequences for analysis of the unknown Cercozoa clade in the data.

Results

Enumeration of heterotrophic picoeukaryotes and results of sequence processing

The total abundance of heterotrophic picoeukaryotes in all ice cores ranged from 66 to 182 × 10³ cells l⁻¹ between the stations (on average 129 ± 48 × 10³ cells l⁻¹), and biomass varied slightly from 0.22 to 0.28 µg C l⁻¹ (on average 0.26 ± 0.03 µg C l⁻¹) (Table 2). The vertical distribution of heterotrophic protists averaged for four stations showed the lowest abundance in the upper part of the cores. Biomass did not differ significantly between the layers (Fig. 2).

Table 2 Abundance (N) and biomass (B) of heterotrophic picoeukaryotes and number of generated reads and OTUs

Fig. 2 Abundance (N × 1000 cells l⁻¹) and biomass (B, µg C l⁻¹) of heterotrophic picoeukaryotes in sea ice (average for four stations). Error bars indicate the standard deviation of the mean value

A total of 137,940 reads were sequenced from 4 sampling sites. Among them, 27,030 reads passed through quality filtering and preprocessing procedures. Filtered reads were clustered into 172 OTUs at 97% similarity. Twenty chimeric sequences were identified and discarded from the dataset (Table 2). Seven OTUs remained unclassified at the domain level after the taxonomy assignment. After reassigning with BlastN, only 3 OTUs remained unclassified at the domain level. However, we can assume that they belong to Cryptista, Rhizaria, and Stramenopiles according to the phylogenetic tree (Fig. 3). In total, 121 OTUs belonging to heterotrophic and mixotrophic protists were revealed in the dataset after the removal of exclusively photosynthetic taxa.

Fig. 3 ML phylogenetic tree of the OTUs of heterotrophic picoeukaryotes in the ice of the White Sea. Branch nodes show standard bootstrap support values. The length of the node marked with the blue star was divided by 10. The letters before the colon show the last assigned taxonomic rank for OTUs (k kingdom, d domain, p phylum, c class)

OTU richness and taxonomic affiliation of the heterotrophic picoeukaryotes

The detected OTUs corresponded to seven major domains of eukaryotes: Stramenopiles, Alveolata, Rhizaria, Cryptista, Haptista, Apusozoa, and Opisthokonta. Based on the relative read abundance (RRA), the dominant groups across all stations were Rhizaria (48% of total RRA), Stramenopiles (19%), and Alveolata (16%) (Fig. 4).
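Relative read abundance is simply each group's share of the total reads, expressed as a percentage. A minimal Python sketch follows; the function name and the read counts are hypothetical and were chosen only so that the resulting percentages mirror those quoted above.

```python
from typing import Dict


def relative_read_abundance(reads_by_taxon: Dict[str, int]) -> Dict[str, float]:
    """Return each taxon's share of the total read count, in percent."""
    total = sum(reads_by_taxon.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {taxon: 100.0 * n / total for taxon, n in reads_by_taxon.items()}


# Hypothetical pooled read counts per domain (not the study's actual counts).
counts = {"Rhizaria": 480, "Stramenopiles": 190, "Alveolata": 160, "Other": 170}
rra = relative_read_abundance(counts)
```

Because RRA is a proportion of sequenced reads, it reflects amplification and gene copy number as well as cell abundance, which is one reason the text reports microscopy counts separately.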

Fig. 4 Relative read abundance at the level of protist domains in the ice of the White Sea. The OTUs that have not been assigned to the domain from the PR2 database are marked as “Eukaryota_unclassified”

Rhizaria was the dominant domain of heterotrophic protists across all stations, with higher RRA at stations 3 and 4 (74% and 57% of RRA, respectively), and was represented only by Cercozoa. From 6 to 34% of the Cercozoa RRA in the samples belonged to cold-water and ice-dwelling Cryomonadida (Fig. 5). Besides Cryomonadida, the proportions of phylotypes identified to the order level were 16%, 7%, 17%, and 12% at stations 1, 2, 3, and 4, respectively. Interestingly, some unclassified Cercozoa formed a distinct clade on the phylogenetic tree (Fig. 3, red square; Fig. S1, which includes a subset of known Cercozoa sequences). This clade made up between 77 and 94% of the reads among all unknown Cercozoa in the samples. Most likely, this survey found a previously unstudied clade within Cercozoa, whose closest NCBI match was an uncultured eukaryote clone (accession no. KT815603.1, Query Cover > 97%, Per Ident > 94%) from Isfjorden, West Spitsbergen (Marquardt et al. 2016).

Fig. 5 Rhizaria relative read abundance in the ice of the White Sea. The OTUs that have not been assigned to order rank from the PR2 database are marked as “unclassified” in the name of the last assigned taxonomic rank

The Stramenopiles domain was mainly represented by Ochrophyta, which consisted almost entirely of Chrysophyceae and unclassified taxa (Fig. 6). Chrysophyceae accounted for more than 80% of all RRA at the class level for stations 1, 2, and 4, and 68% for station 3 (Fig. 6), and also accounted for more than 90% of the total representation of Ochrophyta in the samples. It is worth noting that the parasitic genus Pirsonia was also found in the pico-fraction at stations 1 and 2, accounting for 6% and 0.3% (Fig. 6) of the RRA of Stramenopiles, respectively.

Fig. 6 Stramenopile relative read abundance in the ice of the White Sea. The OTUs that have not been assigned to class rank from the PR2 database are marked as “unclassified” in the name of the last assigned taxonomic rank

Alveolata was also an abundant domain (Fig. 4) and included OTUs of Ciliophora, Dinoflagellata, and Perkinsea (Fig. 7). The proportions of Dinoflagellata (Dinophyceae, Syndiniales) and Ciliophora (Spirotrichea, Oligohymenophorea), as shown by the RRA, were ~ 60% and ~ 30%, respectively, at stations 1, 2, and 4. At station 3, the pattern was reversed: Ciliophora and Dinoflagellata accounted for ~ 60% and ~ 30%, respectively. The most dominant classes were Spirotrichea (Ciliophora) and Dinophyceae (Dinoflagellata), accounting for over 50% of the RRA for each sample (Fig. 7). Perkinsea constituted less than 1.5% of the RRA in each sample and was represented by just one OTU.

Fig. 7 Alveolata relative read abundance in the ice of the White Sea. The OTUs that have not been assigned to class rank from the PR2 database are marked as “unclassified” in the name of the last assigned taxonomic rank

The Cryptista domain accounted for 12%, 8%, 4%, and 6% of RRA at stations 1, 2, 3, and 4, respectively (Table 3). The identified Cryptista OTUs included representatives of Cryptophyta, Picozoa, and Telonemia according to the PR2 database taxonomy.

Table 3 Relative abundance (%) of reads found for different taxonomic groups of heterotrophic picoeukaryotes based on the V4 region of the 18S rRNA gene sequences. Taxonomy is given by PR2 database annotations

Opisthokonta were represented only by bacterivorous filter-feeding Acanthoecida (loricate choanoflagellates) and accounted for 7% of the RRA at station 1 but less than 1% at the rest of the stations (Table 3). Haptista included OTUs of Centroheliozoa and Haptophyta and accounted for less than 1% at all stations, with the exception of station 3 (Table 3). Apusozoa was one of the least represented domains, accounting for 0.06% – 0.5% of the RRA in the samples (Table 3).

Discussion

Taxonomic composition of heterotrophic picoeukaryotes

During the freezing process in marine waters, salt is excluded from the ice crystal structure. The salinity of sea ice in sampling locations in the White Sea ranged from 0.7 to 3.5‰ (Table 1) and was relatively similar to the average salinity of Arctic ice (0–5‰). However, the salinity of the surrounding water varied from 15 to 26‰ in the White Sea, whereas the water in the Arctic is generally saltier (with a salinity of around 30‰), which could influence the initial pool of microorganisms. The main difference in salinity between the sampling locations and overall between the White Sea and the Arctic lies in the fact that the White Sea is semi-enclosed, has limited exchange with the saltier waters of the open ocean, and receives freshwater input from rivers (Berger et al. 2001; Filatov et al. 2007).

The abundance and biomass of pico-sized heterotrophic organisms in the sea ice of the White Sea were estimated here for the first time. Previously, Sazhin (2004) estimated the abundance of ice-dwelling nano-sized heterotrophic protists (< 20 µm). The abundance of heterotrophic picoeukaryotes obtained here was comparable to the numbers of small heterotrophic flagellates found in the ice of Kandalaksha Bay (White Sea) in April 2002 (~ 120 × 10³ cells l⁻¹) (Sazhin 2004). At the same time, the estimated biomass of heterotrophic picoeukaryotes was an order of magnitude lower than the biomass of the cells in the nanofraction (9 µg C l⁻¹) studied by Sazhin (2004). Comparison of our data with the photosynthetic picoeukaryote abundance (Belevich et al. 2018) showed that heterotrophic organisms counted by epifluorescence microscopy made up more than 60% of the total picoeukaryote abundance. This is consistent with the data obtained by Comeau et al. (2013) for Baltic Sea ice, where protist communities were surveyed by high-throughput tag sequencing of the V4 variable region of the 18S rRNA gene. It is important to note that several factors could affect the biomass and abundance estimates. One major concern relates to the time it takes for ice samples to melt. To minimize biological processes such as growth, death, and predation, we melted the samples rapidly at room temperature within 20 h. However, it cannot be ruled out that even during such a short period, the abundance and composition of the community of heterotrophic picoeukaryotes could change, potentially shifting our biomass and abundance estimates.

This 18S rRNA gene-based metabarcoding study showed that the entire Rhizaria diversity in the studied dataset was represented by the phylum Cercozoa. The cercozoans included predominantly cold-water and ice-inhabiting protists of the order Cryomonadida. However, cryomonadids constituted only part of the Cercozoa abundance in the samples (Fig. 5). Most Cercozoa are amoeboid and predominantly benthic organisms and, at first glance, would not be expected in ice in high abundance. Yet sea ice is inhabited mainly by benthic types of ciliates such as Euplotidae (Agatha et al. 1993), and sea ice habitats, with their narrow brine channels and large internal surface areas, are most likely well suited to surface-oriented benthic protists. Nevertheless, it is difficult to explain such dominance of Cercozoa, since a significant part of the Cercozoa OTUs have not been taxonomically assigned and cannot be interpreted. Previous surveys north of Svalbard and in the Central Arctic Ocean also found large numbers of Cercozoa (Hardge et al. 2017; Meshram et al. 2017), but they were not the most dominant group, and the reason for their dominance in our dataset remains unclear. Many Cercozoa are fairly large protists, and it is difficult to explain their predominance in ice samples and in the pico fraction. The proportion of Cercozoa RRA unidentified at the order level was more than 50% for all the samples. Such proportions of unidentified RRA are typical for studies relying exclusively on the metabarcoding approach (Ramond et al. 2019; Kashinskaya et al. 2021). It is clear that there is currently not enough information in the databases to describe the diversity of Cercozoa. Some unidentified Cercozoa form a distinct clade on the phylogenetic tree (Fig. 3, red square, Fig. S1). This clade makes up 77 to 94% of the RRA of all unknown Cercozoa in the samples. This unclassified clade may represent a new group of pico-sized Cercozoa that prefers ice habitats.

The taxonomic diversity of the second major domain revealed, Stramenopiles, consisted mainly of the phylum Ochrophyta, which unites many photosynthetic algae as well as mixotrophic and heterotrophic organisms. Chrysophyceae accounted for more than 60% of reads for each sample at the class level. Unfortunately, chrysophycean OTUs were not assigned to a deeper taxonomic level, so their trophic structure could not be assessed. All of the Chrysophyceae OTUs were retained in the analysis, as they presumably include many mixotrophic protists, as documented earlier (Porter 1988). Generally, environmental sequencing surveys of stramenopiles reveal uncultured marine eukaryotes called Marine Stramenopiles (MAST) (Massana et al. 2004). They are known as numerous and widespread heterotrophic protists (Massana et al. 2004; Seeleuthner et al. 2018) found in many habitats around the world (Massana et al. 2004; Lin et al. 2012). MAST groups in our study were characterized by low RRA in all samples: all detected MASTs within Stramenopiles together constituted less than 0.6% of RRA per sample. We can assume that such a low abundance of organisms considered dominant in marine plankton could be explained by a combination of two factors: the low ice temperature, ranging from − 1.5 to − 2.5 °C, and the pore size of the filters used. It has been shown that low temperature leads to an increase in the cell size of MAST organisms, e.g., some MASTs had cell sizes above 4 μm at temperatures lower than 0 °C (Massana et al. 2006). However, the presence of MAST-3 and other larger protists in the metabarcoding data can be caused by extracellular DNA that aggregates on the filter, as described by Sørensen et al. (2013). Additionally, the fractional filtration method we used significantly reduces the proportion of large organisms in the filtrate but does not completely eliminate them. Nano- and microplanktonic organisms are recorded in filtrates alongside pico-sized organisms, owing to the destruction of delicate forms even under the mildest filtration and the passage of protozoa with flexible cells through the filter pores. Thus, the presence of some taxa can be explained by method artifacts and does not reflect the actual composition of the pico fraction. However, this is probably not the case for Pirsonia found in samples 1 and 2. Pirsonia are parasites of diatoms and often have small and flexible cells, enabling them to pass through filter membranes with a pore size below 3 µm (Kühn et al. 2000). The spring under-ice bloom is characterized by a high abundance of diatoms in the ice bottom. Previous studies have shown that Nitzschia frigida dominates the lower ice layer in Kandalaksha Bay in spring (Sazhin 2004; Rat’kova et al. 2002). A study conducted in the same locations revealed a number of representatives of diatoms, but their contribution was insignificant (Belevich et al. 2018).

The domain Alveolata was also a core component of the heterotrophic picoeukaryote community of the White Sea ice. A noteworthy part of the alveolate diversity was Perkinsea parasites, which were represented by one OTU across two samples. They constituted a small but noticeable part of the Alveolata RRA. Perkinsea include the genus Parvilucifera, which comprises several species that infect dinoflagellates (e.g., Parvilucifera infectans (Norén et al. 1999), P. sinerae (Figueroa et al. 2008), P. prorocentri (Norén et al. 1999), and P. rostrata (Lepelletier et al. 2014)). Thus, the revealed OTU could be a parasite of dinoflagellates (Reñé et al. 2021), which were present in our samples and can be common in newly formed sea ice (Kauko et al. 2018). Perkinsea are also parasites of adult molluscs and tadpoles, whose presence in the ice is very unlikely. Alternatively, this Perkinsea presence could be due to dispersal planktonic stages accidentally caught in ice, given that the studied ice cores are first-year ice. The mechanism by which protists enter ice and form communities was described previously (Garrison et al. 1983; Petrich and Eicken 2016). However, the formation of first-year ice communities is a challenging issue. Multi-year ice exists for years and contains a more or less permanent community. In contrast, first-year ice communities are formed from random pools every season. Protists from the surrounding water are accidentally trapped in ice during the ice-forming process. After trapping, organisms either die or survive the conditions. Unfortunately, the OTU defined as Perkinsea was not annotated deeper than class in the PR2 database. Thus, we could not characterize the precise role of this protist. The closest BLAST match of this OTU found in the NCBI database is “Uncultured Perkinsea” from a freshwater reservoir in the UK (accession no. KP122557.1), obtained by Chambouvet et al. (2015) in a study of Perkinsea infections in tadpoles.

The diversity of the White Sea ice Cryptista was represented by Cryptophyta, Telonemia, and Picozoa. Picozoa are tiny heterotrophs (Seenivasan et al. 2013), approximately 2–3 µm in size, that are nonetheless important from an evolutionary perspective. Recent molecular phylogenetic studies have shown that Picozoa are members of the Archaeplastida supergroup (Schön et al. 2021) and are most likely related to red algae and rhodelphids (Gawryluk et al. 2019). This makes Picozoa important for understanding the origin of plastids. Picozoa are usually widespread but not numerous in aquatic ecosystems (Schön et al. 2021); they were poorly represented in the ice of the White Sea, accounting for less than 1% of the total reads (maximum 143 reads in sample 1). Telonemia, like Picozoa, were poorly represented (Table 3) but are no less evolutionarily significant. These eukaryovorous protists are a sister lineage to the giant supergroup SAR (Stramenopiles, Alveolata, and Rhizaria), which embraces up to half of all eukaryotic species diversity (Strassert et al. 2019). Telonemia are often found in marine metabarcoding surveys and may comprise up to 2% of the total metabarcodes of all unicellular eukaryotes (Tikhonenkov et al. 2022). There are likely dozens of undiscovered genera of Telonemia, which probably occupy the upper levels of microbial trophic networks and play crucial roles in the flow of energy and nutrients (Tikhonenkov et al. 2022).

Regarding the other, less abundant high-ranking taxa revealed in this study, the domain Opisthokonta was represented only by loricate choanoflagellates. They made up a significant part of the community at station 1 but were extremely rare at stations 2–4 (Table 3), although choanoflagellates are a quite common component of sea ice communities (Thomsen et al. 1997). One of the minor domains found in sea ice was Haptista, which included OTUs of Centroheliozoa (Table 3). Centrohelid heliozoans were not abundant and were detected only at station 1. It was not possible to determine taxonomy beyond the phylum for Centroheliozoa. They are relatively large (10–70 μm) heterotrophic protists, but some species are close to 3 μm in size (Zlatogursky 2010) and can probably pass through a 2 μm pre-filter. Alternatively, the presence of centrohelid OTUs could also be explained by extracellular DNA. The domain Apusozoa accounted for less than 1% of the protist abundance and included only members of Apusomonadidae. Apusomonadidae are known as biciliate gliding heterotrophic flagellates that feed on prokaryotes (Karpov and Mylnikov 1989).

Unidentified taxonomic diversity

Only 11% of OTUs (14 of 121) were classified to the genus level, demonstrating the incompleteness of sequence databases for heterotrophic picoeukaryotes in general and for first-year ice communities in particular. Most molecularly characterized protists are organisms of particular biomedical or biotechnological importance (Keeling et al. 2014), and reference databases based largely on them provide a weak benchmark for research on sea ice communities. Figure 8 shows the ratio of assigned to unassigned taxonomy at different taxonomic levels by the number of OTUs. The diagrams show that more than 95% of the OTUs were classified to the domain and phylum levels and 81% to the class level. However, class-level taxa of protists can contain rather different organisms occupying different ecological niches with different functions in the community. For instance, the class Chrysophyceae may include purely photosynthetic species, species that can switch to heterotrophic nutrition, and even pure heterotrophs (e.g., Spumella). From the order level downward, the percentage of assigned OTUs dropped rapidly (Fig. 8). The proportion of OTUs with assigned taxonomy at the order level (39%) was about half that at the class level.
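The rank-by-rank summary described above can be computed directly from per-OTU taxonomy assignments. The following Python sketch assumes that each OTU is stored as a list of names running from the domain down to its deepest assigned rank; the rank list follows the terms used in the text, and the function name, the toy OTUs, and the placeholder names below class level are hypothetical.

```python
from typing import Dict, List, Sequence

RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def assigned_percent(taxonomies: Sequence[List[str]],
                     ranks: Sequence[str] = RANKS) -> Dict[str, float]:
    """Percentage of OTUs whose taxonomy reaches each rank."""
    n = len(taxonomies)
    if n == 0:
        raise ValueError("no OTUs supplied")
    return {
        rank: 100.0 * sum(1 for tax in taxonomies if len(tax) > depth) / n
        for depth, rank in enumerate(ranks)
    }


# Four toy OTUs; names below class level are placeholders, not real assignments.
toy_otus = [
    ["Rhizaria", "Cercozoa"],
    ["Stramenopiles", "Ochrophyta", "Chrysophyceae"],
    ["Alveolata", "Ciliophora", "Spirotrichea", "order_x", "family_x", "genus_x"],
    [],  # unclassified even at the domain level
]
summary = assigned_percent(toy_otus)
```

In this toy example 75% of OTUs are assigned at the domain and phylum levels, 50% at the class level, and 25% from the order level downward, reproducing the kind of stepwise decline reported for the real dataset.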

Fig. 8 The proportion of OTUs with assigned and unassigned taxonomy at different phylogenetic levels

We should emphasize that the metabarcoding approach does not yet allow a full assessment of the microbial community of eukaryotic heterotrophs, especially if the community includes many previously unexplored protists. For instance, metabarcoding can only partly cope with the task of classifying sequences of previously unstudied species. Taxonomic resolution is not only a problem of databases, their development, and curation. Restrictions are also associated with the length of the most commonly used V4 and V9 regions of the 18S rRNA gene, which is usually insufficient to reach deeper taxonomic resolution for heterotrophic protists (Huang et al. 2013; Luo et al. 2011). PacBio long-read sequencing of the full-length 18S rRNA gene or ribosomal operon could partially solve this problem (Jamy et al. 2020). However, the PacBio approach has its own limitations (Tedersoo et al. 2018).

Classical microscopic investigations of protists and their 18S rRNA sequencing, together with proper database curation, are needed to reduce the number of unannotated sequences. By itself, detecting organisms with unassigned taxonomy cannot be of much help in elucidating the taxonomic structure and functions of microbial communities. Our investigation of heterotrophic picoeukaryotes in ice not only revealed poor knowledge of these protists (Keeling et al. 2014) but also highlighted the problem that metabarcoding can be blind without previously developed and curated taxonomy datasets (Hestetun et al. 2020) based on classical microscopy, sequencing of clonal cultures, and single-cell sequencing. Additionally, in typical metabarcoding studies, it is difficult to determine which sequences belong to organisms that have survived and formed stable communities and which come from the DNA of dead cells. To mitigate this problem, an RNA sequencing approach that captures the community composition of living and metabolically active cells can be used, although it also has several limitations, such as RNA stability and incomplete reference databases (Blazewicz et al. 2013).

Conclusion

High-throughput metabarcoding of the first-year ice of the White Sea revealed a diverse community of heterotrophic picoeukaryotes, including representatives of 7 eukaryotic domains and 15 phyla. This targeted study of the heterotrophic pico-fraction (< 2 μm) of first-year ice was conducted for the first time and yielded unexpected findings, such as the dominance of Cercozoa. Most of the relative read abundance of Cercozoa belonged to unclassified OTUs, which formed a separate clade on the phylogenetic tree. In total, only 11% of OTUs were classified to the genus level, which demonstrates the incompleteness of sequence databases for heterotrophic picoeukaryotes and the limitations of the metabarcoding approach based on the short V4 region of the 18S rRNA gene. This work expands the understanding of the structure of communities of ice-dwelling heterotrophic picoeukaryotes and shows the importance of studying such communities in a changing Arctic climate.