Introduction

Cryptic diversity in some earthworm species has long been demonstrated using chromosome and molecular genetic analysis (e.g., Omodeo et al. 1952; Graphodatsky et al. 1982; Vsevolodova-Perel 1997; Heethoff et al. 2004; Harper et al. 2006; Lentzsch and Golldack 2006). King et al. (2008) hypothesized that high diversity is the rule in this group: they found that several of the most common cosmopolitan earthworm species contain two to five highly diverged mitochondrial lineages. This finding was corroborated by many subsequent studies of cosmopolitan earthworm taxa, e.g., the Aporrectodea caliginosa species complex (see Pérez-Losada et al. 2009; Fernández et al. 2011, 2012, 2013; Porco et al. 2013), A. icterica (see Torres-Leguizamon et al. 2014), Allolobophora chlorotica (see Dupont et al. 2011; Porco et al. 2013), and Lumbricus rubellus (see Donnelly et al. 2013, 2014; Kille et al. 2013; Porco et al. 2013). All these results, however, remain disputable, since DNA sequencing of type specimens is required to be sure about the taxonomic identity of the studied species, as argued by Blakemore et al. (2010). At least in some cases, it was shown that no gene flow occurs in sympatry between these genetic lineages.

Although there are fewer studies on endemic earthworm species, they are also known to possess high genetic diversity, e.g., species of the genus Hormogaster (see Novo et al. 2009, 2010, 2011, 2015), Pontoscolex corethrurus (see Cunha et al. 2014), and several Postandrilus species (see Pérez-Losada et al. 2011), among others. Some authors suggest that high genetic diversity may be a promising tool for paleogeographic studies (Fernández et al. 2013; Novo et al. 2015).

Although all aforementioned studies were performed on earthworms inhabiting low and moderate latitudes, we hypothesized that species from northern regions might also have deep phylogenetic patterns. During the last glacial maximum (LGM), as well as some previous glaciations, northwestern Europe and northern parts of North America were covered by solid glaciers. The eastern part of Europe and northern Asia was not glaciated because of low precipitation (Hopkins 1976; Glushkova 1984) and contained only local mountain glaciers, i.e., on the Polar and North Ural mountains, the Putorana Plateau, northeastern Taymyr, the Verkhoyansk mountain range, the Suntar-Khayata, and mountains of the Kamchatka Peninsula (Velichko 2002). Although the climate was harsh, certain species were sufficiently adapted to it and could survive its fluctuations in situ.

The only native earthworm in Northern Asia (in addition to several synanthropic cosmopolites) is Eisenia nordenskioldi subsp. nordenskioldi (Eisen 1879). This subspecies has exceptional cold tolerance (Berman and Meshcheryakova 2013; Meshcheryakova and Berman 2014) and is found on the shores of the Arctic Ocean and on some Arctic islands, sometimes north of the 70th parallel (Perel 1979; Vsevolodova-Perel and Leyrikh 2014). Its distribution extends south to about the 45th parallel; it is also found to the west of the Urals (Blakemore 2016). Within E. n. nordenskioldi, several races with different ploidy, body size, and ecological characteristics are known (Perel 1979; Viktorov 1997; Vsevolodova-Perel and Bulatova 2008). Recent studies demonstrated that this subspecies consists of several highly diverged genetic lineages, which are represented by high-order bootstrap-supported branches on phylogenetic trees (Blakemore 2013; Shekhovtsov et al. 2013, 2014, 2016). These genetic lineages were characterized by a high percentage of nucleotide substitutions both within and among them, and were estimated to have diverged several million years ago (Shekhovtsov et al. 2013, 2017).

Our recent study of several E. n. nordenskioldi populations from Northeastern Eurasia suggested that this region harbors a distinct genetic lineage of this subspecies, which we further refer to as lineage 9 (Shekhovtsov et al. 2015). It is only distantly related to other lineages of the species (Shekhovtsov et al. 2015). Different populations of this lineage from high latitudes diverged hundreds of thousands of years ago, which implies that they survived several glaciation cycles in situ.

Therefore, we suggested that the phylogeography of E. n. nordenskioldi in northern Asia may shed light on its dispersal history. The aim of this study was to investigate genetic variation of E. n. nordenskioldi populations from the tundra and taiga zones of Asia, and to attempt to reconstruct the history of this subspecies. We collected individuals from 37 populations (Fig. 1; Table 1) and sequenced a fragment of the mitochondrial cox1 gene, which is a high-resolution sequence that is most commonly used in earthworm phylogeography. We focus on the 9th lineage as it was the most numerous in this region and had the highest genetic diversity.

Fig. 1

Eisenia nordenskioldi nordenskioldi locations sampled in this study. Circles indicate lineage 9; boxes, lineages 1 and 3. Location numbers refer to Table 1. Dashed lines represent major mountain ranges. The red line indicates the current tundra/taiga boundary

Table 1 Sampled locations of Eisenia nordenskioldi nordenskioldi

Most studies suggest that major rivers serve as phylogeographic borders in northern Asia both for animals and plants, e.g., the Lena, the Kolyma, and the Indigirka (Hope 2011; Eidesen et al. 2013). For earthworms, however, rivers are not natural barriers but dispersal agents. In our sample, we had several population groups confined to river basins or coastal regions (shores of the East Siberian Sea and the Taui bay) (Fig. 1; Table 1). As a null hypothesis, we suggested that populations within a river basin would be genetically more closely related to each other than to those from other basins. Moreover, we expected these differences to be higher when river basins are separated by geographic barriers (mountain ranges). Several mountain ranges are present in the studied region: the Lena basin is separated from the Yana and the Indigirka by the Verkhoyansk ridge and the Suntar-Khayata. The Anadyr river basin (location 32) is separated from the rest of the territory by the ridges of Chukotka, and the western shore of Kamchatka (location 33), by the Eastern and Middle ridges and the Koryak Upland.

We must note that mountain ridges can only be considered relative barriers, as during the ice ages the sea level receded and large areas of shelf were probably habitable for earthworms. However, we hypothesized that mountain ranges should significantly limit gene flow. Moreover, we intended to use genetic data to distinguish “ancient” populations that survived several glaciation cycles from “young” ones.

Materials and methods

Earthworm individuals were collected in 2012–2015 (Table 1; Fig. 1). The studied individuals are stored in the collection of the Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia. Total DNA was extracted from several caudal segments using silica columns (BioSilica, Novosibirsk). Amplification of a fragment of the cox1 gene (658 bp without primers) was performed using primers HCO2198 (5′-TAAACTTCAGGGTGACCAAAAAATCA-3′; Folmer et al. 1994) and LCO1490m (5′-TACTCAACAAATCACAAAGATATTGG-3′; Folmer et al. 1994).

A portion of the studied sample was published earlier in Shekhovtsov et al. (2015); those sequences were not initially submitted to GenBank, and were 7 bp shorter as they were amplified using other primers. For this study, we resequenced that sample. Sequences were deposited in GenBank under the accessions KX601279–KX601656. We also used a single sequence (JX531501) from our previous study (Shekhovtsov et al. 2013).

Molecular diversity indices, pairwise nucleotide and amino acid distances and Fst values were calculated using ARLEQUIN v.3.1 (Excoffier et al. 2005). We also used ARLEQUIN to perform mismatch distribution analysis; the obtained tau parameter of demographic expansion was used to estimate divergence time of the studied sample. Precise generation time for E. n. nordenskioldi is unknown. However, it is certain that it takes longer than a year for an earthworm to reach maturity and leave offspring in cold northern soils, so we adopted the minimal estimate of two years.

Bayesian analysis was performed using MRBAYES v.3.2.0 (Ronquist and Huelsenbeck 2003). The GTR + I + G model was suggested by MRMODELTEST (Nylander 2004). The estimate of the cox1 mutation rate of 3.5% per million years was taken from Chang and Chen (2005). Two simultaneous independent analyses were run from different random starting trees using four Metropolis-coupled Markov chain Monte Carlo chains for 20,000,000 generations, sampling a tree every 1000 generations; the first 25% of generations were discarded as burn-in. Molecular clock analysis using fossil calibrations was based on the protocol of Marchán et al. (2017). All sequences were translated to amino acids; identical sequences were discarded from the sample; outgroup sequences were taken from GenBank.

Fossil calibrations were set as lognormal distributions so that 95% distribution density fell within the range from the minimum estimate to minimum estimate ×2. Relaxed molecular clock (ucld.mean) was set as a uniform distribution: initial value, 0.035; upper, 0.1; lower, 0.001. MCMC simulations were performed as described above.

Results

Total genetic diversity

We obtained a total of 379 cox1 sequences of E. n. nordenskioldi individuals from 34 locations. Within this sample, 331 belonged to lineage 9 of this subspecies; 41, to lineage 1; and 7, to lineage 3 (Table 1; Fig. 1). One cox1 haplotype (JX531501) was taken from Shekhovtsov et al. (2013). All sequences were 658 bp in length (without primers) and contained no indels. All three detected genetic lineages of E. n. nordenskioldi form strongly diverged, well-supported clades on the cox1 tree (Fig. 2).

Fig. 2

Bayesian phylogenetic tree constructed using the cox1 gene. Three Eisenia nordenskioldi nordenskioldi lineages found in our sample (of the total known nine) are shown. While all genetic lineages are by definition monophyletic, relationships between them have not been resolved yet. The 9th lineage represents the majority of our sample and is shown in detail. The most important clades are shown in color (branch names and colors refer to Fig. 1; if not specified, a branch belongs to the Yakutia group). Although detailed relationships within the 9th lineage could not be reconstructed, one can see that certain haplotype groups from the south of Yakutia are located at the base of the tree, and the rest are scattered throughout the tree, while haplotypes from other geographical regions form compact groups and are monophyletic. The tree is rooted with Eisenia uralensis, a congeneric endemic from the Urals. Numbers near branches indicate Bayesian posterior probabilities

Genetic variation was high both within and among lineages. Within lineage 9 alone, we found 186 haplotypes and 187 nucleotide substitutions, 167 of which were parsimony-informative; no mutations leading to stop codons or indels were detected. Haplotypes within this sample differed by up to 7.6%, with an average distance of 4%. Given this high level of variation within a single lineage, it is reasonable to ask whether these haplotypes represent nonfunctional nuclear pseudogenes of mitochondrial origin (NUMTs). However, we can reject this hypothesis because the number of amino acid substitutions was low: 11 substitutions at 8 polymorphic positions, four of which led to changes in amino acid class and none of which resulted in gene truncation, which is far fewer than one would expect under loss of function. Of the eight polymorphic positions, half were singletons and the other half were parsimony-informative. Most individuals differed by only one or two amino acid substitutions. Thus, we conclude that, despite the very high nucleotide substitution rate, the amino acid sequence of the cox1 gene is under strong purifying selection, and the haplotypes we obtained are therefore functional mitochondrial sequences.
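The NUMT check described above (no stop codons, few amino acid changes despite many nucleotide changes) can be illustrated with a minimal sketch. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5) and a known reading frame; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from itertools import product

# NCBI translation table 5 (invertebrate mitochondrial); codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a DNA sequence codon by codon; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def has_stop_codon(seq: str, frame: int = 0) -> bool:
    """True if any codon in the chosen frame is a stop codon."""
    return "*" in translate(seq, frame)


def amino_acid_differences(seq_a: str, seq_b: str, frame: int = 0) -> int:
    """Count amino acid positions that differ between two aligned sequences."""
    return sum(a != b for a, b in zip(translate(seq_a, frame), translate(seq_b, frame)))


# Toy aligned fragments (hypothetical, for illustration only).
intact = "ATAGGAACTTGA"      # M G T W under table 5 (TGA codes for Trp)
synonymous = "ATGGGTACCTGA"  # three nucleotide changes, same protein
truncated = "ATAGGATAATGA"   # TAA introduces a premature stop

print(translate(intact), amino_acid_differences(intact, synonymous))
print(has_stop_codon(intact), has_stop_codon(truncated))
```

A sequence set in which many nucleotide differences collapse to very few amino acid differences, with no stop codons, is the pattern expected for functional mitochondrial copies under purifying selection.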

Lineage 1 was detected in seven locations (Table 1) and was represented by three haplotype groups. Individuals from locations 34 and 35 had cox1 sequences very closely related to haplotypes found in various populations from the south of West Siberia (Shekhovtsov et al. 2013). The haplotype from location 34 was identical to one of the individuals from the Tomsk oblast and that from location 35 differed by only five nucleotide substitutions from another one from the same region.

The most numerous group of lineage 1 haplotypes contained individuals from locations 10, 11, and 36 and all specimens except one from location 8; their sequences were closely related or identical to those of the populations from the Zhuya river (to the east of Lake Baikal). Another group, which included one specimen from location 8, a single one from location 9, and all from location 37, was not closely related to the rest of the lineage 1 haplotypes, differing from them by about 7% of nucleotide substitutions.

E. n. nordenskioldi lineage 3 was found only in location 37, along with lineage 1 (Table 1). There were only three haplotypes and two polymorphic sites. It was most closely related to haplotypes of this lineage from the southern Far East (Primorye krai, Amur oblast and Jewish autonomous oblast; our sequences, unpublished), differing from them by about 6% or 40 substitutions.

The 9th lineage

The southernmost location from Yakutia (location 17) was invariably recovered at the base of the 9th lineage (Fig. 2). Other basal branches were formed by haplotypes from locations 11, 12, 14, and 15, also from southern Yakutia. The rest of the tree lacked phylogenetic resolution; however, several groups formed by haplotypes from certain geographic regions were detected. About a quarter of the total sample of the 9th lineage formed a well-supported clade (shown as the Y1 subgroup in Figs. 1 and 2) that included specimens from many locations from the Anabar, Lena, Yana, and Indigirka basins. Haplotypes from the Taui group (locations 29–31) were rather closely related and formed a separate branch on the tree. The same holds true for specimens from West Siberia (locations 1–3). Another well-supported branch included populations from the Kolyma basin and the shore of the East Siberian Sea, except for haplotypes from location 28, which were closely related to some individuals from location 20 from the lower reaches of the Indigirka (further referred to as the Pevek-Indigirka group). Haplotypes from the only studied population from the Anadyr river basin were not closely related to any other haplotypes and formed three branches on the tree. Two specimens from Kamchatka (location 33) also formed a separate branch.

We calculated pairwise Fst values among the geographic regions given in Table 2. One can see that most of them correspond to high levels of differentiation (over 0.25). Moderate Fst values were detected between the Lena basin and other rivers within the Yakutia group. In addition, the populations from the Anadyr river basin and Kamchatka also had the lowest Fst distance from the Lena river basin. These results could partly be explained by the differences in sample size. However, the Kolyma river basin sample was close in size to the Lena river basin sample but had high Fst distances in comparison to other groups (Table 2).
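To make the meaning of a pairwise Fst value concrete, the following is a minimal sketch of one simple sequence-based estimator, Fst = 1 - Hw/Hb (Hudson et al. 1992), where Hw is the mean number of pairwise differences within populations and Hb is the mean between populations. This is not the AMOVA-based procedure implemented in ARLEQUIN, so the numbers it produces are not expected to match Table 2; the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product
from statistics import mean


def differences(a: str, b: str) -> int:
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def hudson_fst(pop_a: list[str], pop_b: list[str]) -> float:
    """Fst = 1 - Hw/Hb; each population needs at least two sequences."""
    # Mean pairwise differences within each population, averaged over both.
    h_within = mean(
        mean(differences(x, y) for x, y in combinations(pop, 2))
        for pop in (pop_a, pop_b)
    )
    # Mean pairwise differences between sequences from different populations.
    h_between = mean(differences(x, y) for x, y in product(pop_a, pop_b))
    return 1 - h_within / h_between


# Toy populations (hypothetical): little variation within, much between.
basin_1 = ["AAAA", "AAAT"]
basin_2 = ["TTTT", "TTTA"]
print(round(hudson_fst(basin_1, basin_2), 3))
```

Values above about 0.25, as for most comparisons in this study, arise when sequences from different regions differ far more from each other than sequences from the same region do.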

Table 2 Pairwise Fst values for samples of Eisenia nordenskioldi nordenskioldi from various geographic regions (detailed in Table 1). Values below 0.25 are in bold

Mismatch distribution analysis suggested that the 9th lineage experienced recent demographic expansion (Fig. 3a), and the tau parameter value was high: 26.53 (95% CI 20.31–33.18). Using the tau parameter, we can estimate divergence time for lineage 9 given assumptions about mutation rates and generation times. Under such assumptions, divergence of all haplotypes of the 9th lineage occurred 1,075,000 years ago (95% CI 823,000–1,345,000 years). The analysis performed using the 3.5% divergence rate yielded a similarly old age of this lineage: 1,128,000 years (95% CI 788,000–1,486,000 years). The outgroup fossil calibration method (Marchán et al. 2017) gave a much higher estimate: 4,039,000 years (95% CI 1,933,000–6,624,000 years). These results may be affected by the magnitude of the difference between the ages of the calibration points and the age of the estimated groups (hundreds of millions of years vs. hundreds of thousands). In any case, all estimates suggest that the divergence of the 9th lineage is much older than the LGM.
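The conversion from tau to time can be sketched as follows, using the standard relation tau = 2ut, where u is the mutation rate per sequence per generation and t is the time in generations. The inputs below are assumptions made for illustration: the 3.5% per million years figure is interpreted as a pairwise divergence rate (so the per-lineage rate is half of it), the sequence length is taken as 658 bp, and the function name is hypothetical. The exact inputs behind the tau-based estimate reported here are not stated, so this sketch gives a value of the same order (roughly one million years) rather than reproducing it.

```python
def expansion_time_years(
    tau: float,
    seq_length: int,
    divergence_rate_per_myr: float,
    generation_years: float,
) -> float:
    """Time since expansion in years from tau = 2 * u * t (t in generations)."""
    # Assumption: the rate is a pairwise divergence rate, so halve it per lineage.
    per_site_per_year = divergence_rate_per_myr / 2 / 1e6
    # u: mutations per sequence per generation.
    u = per_site_per_year * generation_years * seq_length
    generations = tau / (2 * u)
    return generations * generation_years


point = expansion_time_years(26.53, 658, 0.035, 2)
lower = expansion_time_years(20.31, 658, 0.035, 2)
upper = expansion_time_years(33.18, 658, 0.035, 2)
print(f"{point:,.0f} years ({lower:,.0f} to {upper:,.0f})")
```

In this formulation the generation time cancels out because the rate is given per year; it affects the result only when the mutation rate is specified per generation.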

Fig. 3

Mismatch distribution profiles for (a) the total sample of the 9th lineage of Eisenia nordenskioldi nordenskioldi; (b) location 25; (c) location 6; (d) location 32. Observed mismatch distributions are shown by black lines; model distributions, by gray lines

In addition, we performed mismatch distribution analysis for locations that include over 15 individuals. Some of those are shown in Fig. 3b–d. Some locations exhibited the pattern characteristic of recent population expansion (Fig. 3b). Others experienced recent bottleneck events (Fig. 3c), while some contained a mixture of several not very closely related founder haplotype groups (Fig. 3d).
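For readers unfamiliar with the method, an observed mismatch distribution is simply the frequency distribution of the number of nucleotide differences over all pairs of sequences in a sample. A minimal sketch follows; the function name and toy sequences are hypothetical, and fitting the expansion model (as done in ARLEQUIN) is not attempted here.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs: list[str]) -> dict[int, float]:
    """Relative frequency of each pairwise difference count (0..max observed)."""
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    total = sum(counts.values())
    # Counter returns 0 for difference classes that were never observed.
    return {k: counts[k] / total for k in range(max(counts) + 1)}


# Toy aligned sample (hypothetical).
sample = ["AAAA", "AAAT", "AATT", "TTTT"]
for k, freq in mismatch_distribution(sample).items():
    print(k, round(freq, 3))
```

A single smooth peak is the pattern usually associated with a past expansion, whereas several distinct peaks point to a mixture of diverged haplotype groups, as described for the individual locations above.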

Discussion

High genetic diversity within lineage 9

Molecular genetic studies (Heethoff et al. 2004; King et al. 2008; and numerous subsequent papers) suggest that very high intraspecific genetic diversity is the rule for earthworms. While certain results (Blakemore 2010) need to be checked by detailed taxonomic investigations, including studies of taxonomic types and restoration of valid synonyms, intraspecific genetic diversity in earthworms is still high in most cases (see Blakemore 2013). Within a species, there are usually several genetic lineages that can be defined as reciprocally monophyletic clusters differing by multiple nucleotide substitutions (approximately 10–20% for mtDNA), with significantly lower diversity within lineages. This situation also holds true for E. nordenskioldi (Shekhovtsov et al. 2013, 2015, 2016, 2017), which is unsurprising considering its huge distribution. According to molecular clock estimates, genetic lineages of this species diverged in the Pliocene (Shekhovtsov et al. 2013).

In this study, we focused on the genetic diversity of one E. n. nordenskioldi lineage, the 9th lineage, which is found in Northeastern Eurasia. We demonstrated that one more layer of diversity exists within this lineage, with several of its branches sometimes found in sympatry. The observed pattern of nucleotide substitutions suggests that all this diversity is represented by functional mitochondrial sequences rather than NUMTs (see Results).

Two factors contribute to this high diversity. The first is the huge distribution of the 9th lineage. The second is that this distribution is patchy. Most soils in Northeast Eurasia are unsuitable for E. nordenskioldi because of acidic pH (3.5–4) caused by low carbonate content in parent rock and low temperature that promotes the accumulation of acid peat. Only certain habitats (aspen groves on southern slopes, edges of steppe patches, mountain tundra meadows, river valleys) have suitable pH values (over 4.2). Most of these patches, however, do not exceed 10 ha and are separated from other patches by long stretches of unsuitable soil. Thus, most local populations of the 9th lineage are small and experience a pronounced impact of genetic drift and founder effect.

Although calibration of a molecular clock for earthworms is problematic, we can still use the obtained values as a rough estimate. Our results suggest that the divergence of all sampled haplotypes of the 9th lineage of E. n. nordenskioldi took place about one million years ago. This estimate and the strong genetic structure in all parts of its distribution suggest that many local populations of lineage 9 could survive several glacial cycles.

We can suggest that southern Yakutia is the center of genetic diversity of the 9th lineage and most probably its ancestral area. First, populations from southern Yakutia occupy the basal position on the trees (Fig. 2), with the southernmost location (location 17) being the most basal. Second, genetic diversity of other haplotypes from the Lena basin was much higher compared to those from other regions (we should note that this could be partly explained by the higher number of sampled individuals and populations). Southern Yakutia presumably had more stable climatic conditions than areas further to the north, so we could hypothesize that local populations were less affected by climatic perturbations. However, more detailed sampling of this area is needed to draw firm conclusions.

Populations from other regions formed several branches on phylogenetic trees. Unsurprisingly, three populations of the 9th lineage (locations 1–3) that were found over 2000 km from other locations formed a separate branch on the tree (the Ural branch in Figs. 1 and 2). In the easternmost part of the distribution, distinct groups were isolated from the rest of the sample by mountain ridges: specimens from the coast of the Taui bay (locations 29–31) stand separate on the tree, as well as those from Kamchatka (location 33). Earthworms from the single population from the Anadyr river basin were quite diverse and formed as many as three separate branches that were not closely related to haplotypes from other regions. In all these cases, geographic isolation could explain genetic differences.

In contrast, for Yakutia and western Chukotka the observed genetic patterns do not correspond to geographic features. Populations from the Anabar, Olenek, Lena, Yana, and Indigirka river basins were genetically close to each other, although not forming a monophyletic clade. Certain haplotypes from, e.g., the basins of the Anabar (locations 4 and 5) and Indigirka (location 19) differed only by a few substitutions, suggesting very close genetic relationship. There are no geographic barriers between the Lena valley and the rivers of the North Siberian Lowland (Olenek and Anabar), so genetic similarity between E. n. nordenskioldi populations is not unexpected. However, the Yana and the Indigirka are separated from the Lena by the Verkhoyansk and the Suntar-Khayata ridges that are formidable barriers to earthworm dispersal. Contacts between these river systems could have been possible during glaciations along the Laptev Sea coast when ocean level was lower; however, we would still expect drastically reduced gene flow.

At the same time, populations from the Indigirka and Kolyma basins were not closely related. The latter (locations 22–23) formed a clade with populations from the coast of the East Siberian Sea (locations 24–27). This is somewhat unexpected, as there are no geographic barriers between the Kolyma and the Indigirka, and these rivers are geographically closer to each other than the Indigirka and the Yana.

We should note that in this region one more group of haplotypes was detected (Fig. 1) that included several specimens from location 20 in upper reaches of Indigirka and location 28 from the Chaun bay that are located over 500 km apart. This group was not closely related to any other haplotypes from that region. On the whole, such phylogeographic patterns that are not totally dependent on geography may reflect complex history of E. n. nordenskioldi and certain dispersal or vicariance events that occurred throughout several glaciation cycles.

It is noteworthy that genetic differences appear to be independent of community type: Fig. 1 shows no genetic differences between populations from tundra and taiga. This indicates that, while soil physicochemical properties are important for E. n. nordenskioldi, it can adapt to different community types.

Lineages 1 and 3

We found three distinct genetic lineages of E. n. nordenskioldi in our sample. While lineage 9 was widespread throughout the whole region studied by us, lineages 1 and 3 had smaller distributions. In the north of West Siberia, lineage 1 was found alongside lineage 9 (Fig. 1). The fact that local haplotypes are closely related or identical to those found in the south of West Siberia clearly indicates that the northern populations of lineage 1 are recent invaders there. The most probable dispersal route is the Ob river, as rivers are pathways of earthworm migrations.

Lineage 1 was also found in six locations in the Lena basin, in one instance alongside lineage 3 (Fig. 1; Table 1). It is noteworthy that lineage 1 is evidently widespread in the south, in upper reaches of Lena and Vilyui, but to the north it was detected only in location 8 (Fig. 1; Table 1). While populations of lineage 1 from the north of West Siberia originated in the south of that region, populations from Yakutia have Far Eastern affinities; denser sampling is needed to clarify this issue.

History of Eisenia n. nordenskioldi in northern biomes

During Pleistocene glaciations, northern parts of Europe and North America were repeatedly covered by ice sheets, which led to complete eradication of the local biota (Hewitt 2000). While some animals could retreat southwards, this was obviously not the case for earthworms. At the same time, only local mountain glaciers existed in northern Asia due to low precipitation. Climatic oscillations still had a high impact on flora and fauna, resulting in transformation of communities from tundra and boreal forest to tundra steppe.

The distribution of lineage 9 once encompassed the whole north of Northern Asia. However, glaciations caused its contraction in West Siberia. When glaciers retreated, earthworm populations began moving northwards from refugia. This is clearly reflected in genetic patterns of locations 1–3: haplotypes of the Ural group are closely related, which is in stark contrast to patterns observed in the eastern part of the distribution. Along with lineage 9, this region was also colonized by lineage 1 of E. n. nordenskioldi (see above).

Deep phylogeographic differences observed among the populations in the eastern part of the distribution suggest that climatic oscillations and the associated community changes were not catastrophic for earthworms. In the northeast of Northern Asia, the 9th lineage seems to be the only genetic lineage of E. n. nordenskioldi. Only in the upper and middle reaches of the Lena and its tributaries did we observe lineages 1 and 3 of E. n. nordenskioldi. We also failed to find Dendrobaena octaedra, which is widespread in the north of Europe.

As stated above, ancestral distribution of the 9th lineage was most probably located in southern Yakutia. Colonization of West Siberia and various regions of Northeastern Siberia occurred independently, which is indicated by the lack of close genetic relations among these groups, especially between the Kolyma and the Taui bay and populations from Anadyr and Kamchatka, which we would expect to be closely related if dispersal proceeded stepwise as could be anticipated based on geographic distribution (Fig. 1). Current differences between the Kolyma and Indigirka groups, as well as the presence of an additional group in that region are most probably the result of ancient allopatry.

Ecological characteristics of the 9th lineage of E. n. nordenskioldi imply low population size and, consequently, frequent local extinction and recolonization events. Mismatch distribution analysis of individual populations (Fig. 3b–d) suggests that different populations are indeed at various stages of these processes. Some, e.g., location 25 (Fig. 3b) and location 31 (not shown), have a pattern characteristic of a recent expansion of an initially small and genetically uniform population. Others, e.g., location 6 (Fig. 3c) and location 22 (not shown), have strong deviations from the model mismatch distribution, indicating bottleneck events. Some populations have several peaks on mismatch distribution profiles [e.g., locations 32 (Fig. 3d) and 14 (not shown)], implying several diverged founder groups.

Conclusions

Earthworms seem rather delicate organisms, vulnerable to cold and drought. However, genetic evidence indicates that the representatives of the 9th lineage of E. n. nordenskioldi managed to survive several climatic oscillations and the associated turnover of communities in the northern latitudes of Asia. Although local populations appear to be at different stages of evolution, this lineage as a whole appears to be very viable in this extreme and constantly changing environment. Genetic diversity of this lineage is very high and can be used in paleogeographic reconstructions in particular northern regions. It is also reasonable to suggest that other representatives of soil fauna may possess similarly deep phylogeographic patterns and can thus be promising objects for molecular genetic studies.