Introduction

Oxytropis (Fabaceae) is a taxonomically complex genus that comprises more than 300 species (Zhu et al. 2010), of which 55 occur in the Russian Far East (Pavlova 1989). Among these, there are many rare and endemic species with restricted ranges. Oxytropis chankaensis Jurtz. (synonym O. hailarensis subsp. chankaensis (Jurtzev) Kitag.) is an endangered perennial herb with a narrow range that is restricted, in Russia, to the west shore of Khanka Lake (Kharkevich and Kachura 1981; Pavlova 1989; Yurtsev 1964), the largest lake in Northeast Asia (Fig. 1a). On the Chinese side of Khanka Lake, the only known occurrence of O. chankaensis is on a natural sandy spit lying between Khanka Lake and the small Xiaoxingkai Lake (I.V. Maslova, researcher at the Khankaisky Nature Reserve, pers. comm.). O. chankaensis is listed as “vulnerable” in the Rare Plant Species Book of the Far East of Russia (Kharkevich and Kachura 1981) and in the Red Data Book of the Primorsky Kray (2008). This legume was originally found on the shoreline of Khanka Lake and was first described as a distinct endemic species, Oxytropis chankaensis, based on definite morphological differences from its congeners (Yurtsev 1964). Later, the status of this species was reduced to a subspecies of O. hailarensis Kitag. (Kitagawa 1979), a species that occurs in China and Mongolia (Bisby et al. 2009; Czerepanov 1995). In the Flora of China, O. hailarensis is regarded as O. oxyphylla (Pall.) DC. (Zhu et al. 2010). However, the species from the west shore of Khanka Lake is clearly distinguishable from the related species O. oxyphylla, not only by morphological features (Yurtsev 1964), but also by chromosome number: the latter species is diploid (2n = 16, Agapova et al. 1990; Zhu et al. 2010), while O. chankaensis is tetraploid (2n = 32, Gurzenkov and Pavlova 1984; Probatova et al. 2008). When we surveyed all of the present-day localities of O. chankaensis, we did not find any diploid or triploid plants. Hence, following Russian botanists who previously studied native plants of this species in the wild (Barkalov and Kharkevich 1996; Pavlova 1989; Probatova et al. 2008; Red Data Book Primorsky Kray 2008), we regard the Khanka Lake populations as a separate species, Oxytropis chankaensis Jurtz., belonging to subsect. Oxyphylliformes Jurtz., sect. Baicalia Steller ex Bunge, subgenus Oxytropis.

Fig. 1

a Map of the southern Russian Far East showing the location of Khanka Lake. b Geographic location of the known localities of Oxytropis chankaensis on the west Russian shore of Khanka Lake (according to Kharkevich and Kachura 1981) and map of the frequencies of the seven chloroplast DNA (cpDNA) haplotypes observed in five populations of O. chankaensis. Filled circles with a code denote the sampling localities (for population codes, see Table 1). Open circles denote localities where O. chankaensis plants were found previously but are not found at present. Pie charts represent the proportion of haplotypes in each population. c Single most parsimonious tree from analysis of the chloroplast noncoding regions. The numbers above branches indicate bootstrap values based on 1,000 replicates. d The statistical parsimony (95%) network of cpDNA haplotypes constructed using the TCS program. The size of the circles corresponds to the frequency of each haplotype. The small open circle represents the inferred intermediate haplotype not detected in this study. Each line between haplotypes represents a mutational step. The H7 haplotype was identified by TCS as the ancestral haplotype

O. chankaensis is an outcrossing species that is pollinated by bumblebees, with pollen potentially being dispersed over long distances. Plants have high fecundity, with fruits containing up to twenty seeds, and an individual plant produces approximately four thousand seeds (Kholina et al. 2003). Mature spherical pods can be dispersed by wind and water over long distances beyond the limits of local populations, while some seeds from the dehiscent pods are gravity-dispersed only a short distance from the maternal plant and form the soil seed bank. O. chankaensis is a long-lived species with overlapping generations. Its plants occur only in sandy habitats on a narrow strip along the west coast of Khanka Lake, forming separate populations of approximately 80 to 500 individuals (Kholina and Kholin 2006). The shoreline of Khanka Lake has always varied: the water level not only fluctuated significantly during transgressions and regressions of previous epochs (Korotkii et al. 2007), but also oscillates regularly at present. Such oscillations in the water level and increasing anthropogenic effects may be the cause of the severe fluctuations that have been observed in the sizes of individual O. chankaensis populations. In recent field surveys (2001–2005), we failed to find this species in some of the localities where its presence was previously documented (Fig. 1b) and, in the localities where it is currently found, population sizes may vary from year to year from approximately twenty to several hundred plants (Kholina et al. 2009).

Knowledge of the population genetic structure of narrowly occurring endemic plants is of great importance for the conservation of the existing populations (Ellstrand and Elam 1993). Such information also provides an opportunity to gain a better understanding of how evolutionary processes such as speciation can act over small spatial scales (Prentice et al. 2003). Various factors (geographical range, taxonomic status, life form, breeding system, and dispersal capabilities) can influence the degree and distribution of genetic variation and gene flow within the range of a species (Hamrick and Godt 1996; Nybom 2004). Different historical processes have also influenced the spatial genetic structure of plant species. Assessing the contribution of each of these mechanisms to current population structure can prove difficult. The analysis of variation in plant genomes with different modes of inheritance can, however, provide valuable information on the genetic structure of current populations and on the evolutionary forces that have shaped the current distribution of genetic variation (Pleines et al. 2009 and references therein).

In earlier studies, we analysed intraspecific variation in O. chankaensis based on morphological features and on allozyme and random amplified polymorphic DNA (RAPD) markers that are assumed to represent the nuclear genome. Based on the results of canonical, discriminant and cluster analyses, distinct morphological differences between the populations were found (Kholina and Kholin 2008). High levels of genetic polymorphism and statistically significant differences among populations were also revealed by using both RAPD and allozyme markers (Artyukova et al. 2004; Kholina et al. 2007, 2009). In addition, allozyme analysis allowed us to identify southern and northern groups of populations that differed in the levels of allozyme diversity (Kholina et al. 2009). The tetrasomic segregation of the allozymes that we observed in O. chankaensis implies that this species is an autotetraploid (Kholina et al. 2004). The high levels of genetic and genotypic diversity that we found in this species based on allozyme data implied that polyploidy might have resulted from recurrent crosses between genetically different plants. Polyploidy is believed to be a common mechanism in the evolution of plants, and multiple origins have been shown for both auto- and allopolyploid plants, including Oxytropis species (Jorgensen et al. 2003; Segraves et al. 1999; Soltis and Soltis 2000; Soltis et al. 2003, 2007; Tremetsberger et al. 2009). The polytopic and recurrent origin of many polyploids may account for the overall complexities that are observed in such taxa (Gauthier et al. 1997; Parisod et al. 2010; Soltis and Soltis 2009).

The combined use of nuclear and chloroplast DNA (cpDNA) data allows for the determination of the origin and evolutionary history of polyploids (Guo et al. 2006; Kao 2008; Soltis et al. 2003). CpDNA is predominantly transmitted through the seeds in most angiosperms including the Fabaceae (Doyle et al. 2004; Gauthier et al. 1997) and usually exhibits geographically structured variation (Korpelainen 2004; Petit et al. 2005). In recent decades, cpDNA markers have been widely used in investigations of genetic structure, phylogeography, and the reconstruction of the evolutionary history of endemic and endangered species (e.g., Artyukova et al. 2009; Ayele et al. 2009; Ikeda et al. 2008; Prentice et al. 2003; Wang et al. 2009). In the present study, the genetic structure of O. chankaensis was determined using sequence polymorphisms in four noncoding cpDNA regions. We addressed the following questions: (1) what level of plastid genome variability exists in this narrow-range endemic species; (2) how is genetic variation distributed within and among populations; and (3) what forces were involved in shaping the population structure.

Materials and methods

Plant materials

We studied individuals of O. chankaensis in the wild at localities where it had been reported in published data (Kharkevich and Kachura 1981; Pavlova 1989) and sampled the entire natural range of the species in Russia. Plants of O. chankaensis were found in only five of the eight previously documented localities (Fig. 1b). Three localities are located near the villages Turii Rog, Novokachalinsk, and Troitskoe, and two localities are situated in the territory of the Khankaiskii Nature Reserve. Specimens that are representative of the different ecotypes of O. chankaensis are held at the Herbarium of the Institute of Biology and Soil Science, Vladivostok (VLA). At the time of sampling, the estimated population sizes varied from 77 to 512 with an average of 49.8 adult generative plants per population. In each population, leaves from randomly selected adult generative plants approximately 200 m apart were collected in a way that was not damaging to the plants sampled. Sample size, population code, and the geographic coordinates for each population are given in Table 1.

Table 1 Sampling site locations, codes, sample sizes and distribution of haplotypes in Oxytropis chankaensis populations

DNA amplification and sequencing

Total genomic DNA was extracted as described in Artyukova et al. (2004). To investigate cpDNA variation, we tested six noncoding regions of the chloroplast genome: the trnD(GUC)–trnT(GGU), the trnH(GUG)–psbA, the petG–trnP, and the rpoB–trnC intergenic spacers, the trnL intron together with the trnL–trnF intergenic spacer (trnLF), and the trnS(GCU)–trnG(UUC) intergenic spacer (trnSG), all of which have been found to be polymorphic in some plant species (Desplanque et al. 2000; Huang et al. 2002; Shaw et al. 2005). To amplify these regions, we used previously published primer pairs (Table 2) and thermocycling conditions (Shaw et al. 2005). We failed to amplify the trnD(GUC)–trnT(GGU) region, though all of the other cpDNA fragments were successfully amplified for the 63 plants tested. Sequencing of the PCR-amplified products was carried out in both directions under the sequencing conditions described by Shaw et al. (2005) using a BigDye terminator v. 3.1 sequencing standard kit (Applied Biosystems) with the same primer pairs that were used for amplification. In addition, internal primers were used for sequencing the trnSG and the trnLF regions (Table 2). Although amplification of the rpoB–trnC region was successful for all individuals tested, sequencing reactions with the primers that were used for amplification failed repeatedly; this region was therefore excluded from subsequent analysis. Sequences were analysed on an ABI 3130 genetic analyzer (Applied Biosystems, USA). Forward and reverse sequences were assembled using the Staden Package v. 1.4 (Bonfield et al. 1995) and aligned manually with the SeaView program (Galtier et al. 1996). DNA fragments that contained substitutions and/or microsatellite variants were retested (reamplified and resequenced) to verify that our results were repeatable.

Table 2 Primers, fragment sizes and GenBank accession numbers for sequences of the four chloroplast regions investigated in this study

Data analysis

Because cpDNA does not recombine and is, therefore, equivalent to a single locus, sequences for the four fragments that we investigated were combined to derive the haplotype of each individual. Two repeats (a dinucleotide AT-motif within the trnL(UAA) intron and a mononucleotide poly-T motif within the petG–trnP intergenic spacer) varied in length. These repeats were included in the data set because repeatability tests allowed us to exclude PCR errors. These repeat variations were treated as point mutations, interpreting each increase or decrease of a single repeated unit as a single mutational event (Simmons and Ochoterena 2000). Maximum parsimony analysis (with gaps coded as a fifth base or as characters in a separate presence/absence matrix) was performed in PAUP* v. 4.0b10 (Swofford 2003). To represent all possible alternative pathways between haplotypes within a single figure, we carried out a statistical parsimony analysis with a 95% confidence limit for parsimony using the TCS program and coding indels as a fifth state (Clement et al. 2000).
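The step of combining the four fragments into one haplotype per individual can be illustrated with a minimal sketch. It is not the workflow used in the study: the individual IDs, the toy sequences, and the function names `combine_regions` and `assign_haplotypes` are all hypothetical, and the resulting labels are arbitrary rather than the H1 to H7 labels of Table 3. Gap characters are compared like any other character, which mirrors the fifth-state coding of indels.

```python
# Hypothetical toy alignments: region name -> {individual ID -> aligned sequence}.
# The region names follow the text; every sequence here is invented.
toy_regions = {
    "trnH-psbA": {"ind1": "ACGT", "ind2": "ACGT", "ind3": "ACGT", "ind4": "ACGT"},
    "petG-trnP": {"ind1": "TTTTT-", "ind2": "TTTTTT", "ind3": "TTTTT-", "ind4": "TTTTT-"},
    "trnSG": {"ind1": "GGA", "ind2": "GGA", "ind3": "GGC", "ind4": "GGA"},
    "trnLF": {"ind1": "ATAT--", "ind2": "ATAT--", "ind3": "ATAT--", "ind4": "ATAT--"},
}
REGION_ORDER = ["trnH-psbA", "petG-trnP", "trnSG", "trnLF"]


def combine_regions(regions, order):
    """Concatenate the aligned regions of each individual in a fixed order."""
    individuals = sorted(regions[order[0]])
    return {
        ind: "".join(regions[name][ind] for name in order)
        for ind in individuals
    }


def assign_haplotypes(combined):
    """Give every distinct combined sequence its own arbitrary label.

    A gap ('-') counts as a character state, so two sequences that differ
    only in repeat length end up as different haplotypes.
    """
    labels = {}
    assignment = {}
    for ind in sorted(combined):
        seq = combined[ind]
        if seq not in labels:
            labels[seq] = f"hap{len(labels) + 1}"
        assignment[ind] = labels[seq]
    return assignment


haplotypes = assign_haplotypes(combine_regions(toy_regions, REGION_ORDER))
for individual, label in haplotypes.items():
    print(individual, label)
```

In this toy data set, ind1 and ind4 share a haplotype, ind2 differs by one repeat unit in the poly-T stretch, and ind3 differs by one substitution.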

Most population genetic analyses were performed using Arlequin v. 3.11 (Excoffier et al. 2005) and DnaSP v. 4.5 (Rozas et al. 2003) software. We calculated numbers of haplotypes (nH) and values of haplotype diversity (h) and nucleotide diversity (π). To detect departures from the standard neutral model of evolution, we performed Tajima’s D (Tajima 1989) and Fu’s F_S (Fu 1997) tests using Arlequin and Fu and Li’s (1993) D* and F* tests using DnaSP. These tests show different degrees of sensitivity to deviation from neutrality caused by demography or selection. Significance levels of these tests were assessed by generating 10,000 random samples and using model-based simulations (Excoffier et al. 2005). Positive values indicate that haplotype classes are evenly represented, whereas negative values result from the presence of rare haplotypes. Significant positive values of Tajima’s D and Fu and Li’s D* and F* tests, combined with insignificant values of Fu’s F_S test, would suggest that either balancing selection or a reduction in population size had occurred, while significant values for only Fu’s F_S suggest that population growth is occurring (Fu 1997). Nonsignificant test results can also be informative when the values of several tests are compared to see whether, together, they are uniformly positive or negative (Fu 1997). The D and F_S statistics tend to be negative if there is an excess of rare variants, which is an indication of genetic hitchhiking/selective sweep or population growth. An excess of common variants is expected to be caused by population subdivision, population size reduction, or balancing selection, such that the resulting values of these tests tend to be positive (Fu 1997; Fu and Li 1993; Tajima 1989; Wright and Gaut 2005). To test for population demographic changes, a pairwise mismatch distribution analysis (MDA) that linked the number of differences between haplotypes and haplotype frequency was also performed using Arlequin.
We assessed the fit of the observed mismatch distributions to a model by coalescent simulation of 10,000 samples using the sum of squared deviations (SSD) between the observed and expected mismatch distributions and the raggedness index (r) as test statistics. The 95% confidence intervals of the demographic parameters were estimated with 10,000 replicates. Unimodal patterns with low and insignificant values of SSD and r are typical for expanding populations, while the distribution is multimodal in populations of constant size (Rogers and Harpending 1992). Population stability was also inferred if the 95% confidence intervals for two parameters, the scaled mutation rates before (θ_0) and after (θ_1) growth, overlapped, even if the P value of the SSD was not significant (Schneider and Excoffier 1999).
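To make the mismatch distribution and the raggedness index more concrete, the following sketch computes both for a toy sample. It assumes the usual definition of raggedness, r = Σ (x_i − x_(i−1))², summed over i = 1 … d + 1, where x_i is the relative frequency of sequence pairs that differ at i sites and d is the largest observed number of differences. The sequences and function names are hypothetical, and the sketch does not reproduce Arlequin's simulation-based significance tests.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs with 0, 1, 2, ... differences."""
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    total_pairs = sum(counts.values())
    max_diff = max(counts)
    return [counts.get(i, 0) / total_pairs for i in range(max_diff + 1)]


def raggedness(freqs):
    """Sum of squared differences between neighbouring mismatch classes.

    A trailing zero is appended so that the drop after the last observed
    class is included (i runs from 1 to d + 1).
    """
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Invented sample: two equally common haplotypes that differ at two sites.
toy_sample = ["AAAA"] * 3 + ["AATT"] * 3
freqs = mismatch_distribution(toy_sample)
print(freqs)
print(raggedness(freqs))
```

The toy sample gives a bimodal distribution (peaks at zero and two differences, none at one), the kind of shape that two frequent haplotypes produce, and a correspondingly high raggedness value.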

An analysis of molecular variance (AMOVA; implemented in Arlequin) was performed to estimate the distribution of genetic variation within and between populations and the values of pairwise genetic distances (F_ST) between populations. The significance of the variance components was determined with a permutation test (10,000 replicates). Thresholds of significance for pairwise F_ST values were estimated using the Bonferroni correction for multiple tests. To calculate and compare the two differentiation indices N_ST and G_ST, we used Permut v. 2.0 software (available from http://www.pierroton.inra.fr/genetics/labo/Software) with a permutation test (10,000 permutations). G_ST makes use of only haplotype frequencies, whereas N_ST takes into account the similarities of haplotypes. A higher value for N_ST than G_ST would be indicative of phylogeographic structure (Petit et al. 2005; Pons and Petit 1996). We also calculated genetic differentiation standardised to the maximum level that could be obtained for the observed amount of genetic variation (G′_ST), as proposed by Hedrick (2005). To test for a correlation between geographic and genetic distances, we performed a Mantel test using Arlequin software with the matrices of genetic differentiation defined as F_ST or as the ratio F_ST/(1 − F_ST) (Rousset 1997) and testing for significance with a permutation procedure (1,000 replicates). The ratio of gene flow via pollen and seeds was estimated following Ennos (1994) as r = {[(1/G_STb − 1) × (1 + F_IS)] − 2(1/G_STm − 1)}/(1/G_STm − 1), where G_STm and G_STb are the estimates of subdivision at maternally inherited markers and at nuclear (allozyme) markers, respectively, and F_IS is the heterozygote deficit based on the previous allozyme data (Kholina et al. 2009).
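The two closed-form expressions in this paragraph, the linearised differentiation F_ST/(1 − F_ST) and the pollen-to-seed gene flow ratio of Ennos (1994), can be written out as a short sketch. The function names are hypothetical, and the input values in the example are purely illustrative; they are not the estimates used in this study.

```python
def _check_proportion(name, value):
    """Differentiation estimates must lie strictly between 0 and 1 here."""
    if not 0.0 < value < 1.0:
        raise ValueError(f"{name} must be between 0 and 1 (exclusive), got {value}")


def linearised_fst(f_st):
    """Linearised differentiation, F_ST / (1 - F_ST), as used in the Mantel test."""
    _check_proportion("f_st", f_st)
    return f_st / (1.0 - f_st)


def pollen_seed_ratio(g_st_biparental, g_st_maternal, f_is):
    """Pollen-to-seed gene flow ratio following the formula given in the text.

    g_st_biparental: subdivision at nuclear markers (G_STb)
    g_st_maternal:   subdivision at maternally inherited markers (G_STm)
    f_is:            heterozygote deficit at nuclear markers (F_IS)
    """
    _check_proportion("g_st_biparental", g_st_biparental)
    _check_proportion("g_st_maternal", g_st_maternal)
    nuclear_term = (1.0 / g_st_biparental - 1.0) * (1.0 + f_is)
    maternal_term = 1.0 / g_st_maternal - 1.0
    return (nuclear_term - 2.0 * maternal_term) / maternal_term


# Illustrative inputs only: G_STb = 0.2, G_STm = 0.5, F_IS = 0.
example_ratio = pollen_seed_ratio(0.2, 0.5, 0.0)
print(example_ratio)
print(linearised_fst(0.5))
```

With these illustrative inputs, the nuclear term is 4 and the maternal term is 1, so the ratio is (4 − 2)/1 = 2: pollen would contribute twice as much gene flow as seeds.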

Results

PCR products of the trnH–psbA, petG–trnP, trnSG, and trnLF regions were successfully amplified and sequenced for all O. chankaensis individuals tested. A total of 2,798 bp of aligned chloroplast sequences were obtained, comprising 420, 524, 1,158, and 696 bp for the trnH–psbA, petG–trnP, trnSG, and trnLF regions, respectively. A low number of polymorphic sites was detected: all individuals had identical sequences for the trnH–psbA region, a single individual differed in the number of repeat units of an AT-motif within the trnLF region, and four variable characters were found within the two other regions: one site in the trnSG region and three sites in the petG–trnP region (Table 3). Among the individuals we studied, seven haplotypes were recognised as combinations of the variable sites, of which three were informative for parsimony. Sequences of the four cpDNA regions for each haplotype were deposited in EMBL/GenBank (see Table 2 for accession numbers). Maximum parsimony analysis yielded a single MP tree whose topology was independent of the indel coding mode used and which was characterised by three clades with low bootstrap values (Fig. 1c). Low levels of diversity at the population level usually lead to a lack of phylogenetic resolution, and networks are considered the most appropriate way to represent all possible relationships within a species (Schaal et al. 2003). The network that we produced by the statistical parsimony method (Fig. 1d) was topologically congruent with the MP tree and represented haplotype relationships with a single change in one base pair between any two haplotypes, except for H6, which differed from H4 in one repeat unit of a dinucleotide repeat. Four haplotypes were frequently sampled (>5%) and the most frequent haplotype, H2, was found in 41.3% of the individuals in our samples, while the other three haplotypes (H1, H3, and H4) were found in 30.2, 15.9, and 7.9% of our specimens, respectively.
Three haplotypes (H5, H6, and H7) were found only in single individuals, and H7 represented an intermediate haplotype between the two most frequent haplotypes, H2 and H1. The H2 haplotype was found in individuals from all localities, and the other frequent haplotypes also occurred in more than one population: haplotypes H1 and H3 were found in four populations, and the H4 haplotype was restricted to the populations TR and NK (Table 1; Fig. 1b).

Table 3 Description of the chloroplast haplotypes detected in Oxytropis chankaensis populations

Genetic diversity estimates at population and species levels are summarised in Table 4. Within all populations, the levels of nucleotide diversity were similarly low, and haplotype diversity was high. At the species level, the values of π and h were 0.00052 and 0.7179, respectively. Among populations, genetic differentiation was low (G_ST = 0.037), and standardisation based on the maximum level that could be obtained for the observed amount of genetic variation increased the estimate of population differentiation only to a very small degree (G′_ST = 0.146). The index of population structure, N_ST, which additionally considers the similarities between haplotype sequences, was 0.099, and the difference between G_ST and N_ST was not significant (P > 0.05), indicating a lack of phylogeographic structure (Petit et al. 2005; Pons and Petit 1996). AMOVA demonstrated that the majority of molecular variation was found within populations, and less than 10% of the total genetic variance was among the populations (Φ_ST = 0.09, P = 0.029; Table 5). When the samples were divided into two groups according to their location in the southern (PS, SI, and KR) or northern (TR and NK) regions of the study area, the majority of molecular variance was still observed among individuals within populations (87.23%, P = 0.027; Table 5). The small amounts of genetic variation that were attributable to differences between regions (9.81% of the total) and among populations within groups (2.95% of the total) were not significant (P = 0.10). Most values of F_ST between pairs of populations were not significantly different from zero (P > 0.1; Table 6), and only the population NK was differentiated from all other populations (P < 0.05). However, after Bonferroni correction for multiple testing, the F_ST value between NK and TR was no longer significant, and a comparison between NK and KR became marginally significant (P = 0.006).
The Mantel tests showed no significant effect of isolation by distance: the matrix of geographic distances was not correlated with the matrix of F_ST or of linearised F_ST (r = 0.166, P = 0.225 and r = 0.178, P = 0.216, respectively). Only 2.75% of the genetic distance was explained by geographical distances between the populations. The ratio of gene flow via pollen and seeds was calculated to be 4.56, which indicates that seed dispersal appears to be a major component of the total gene flow in this species.
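The species-level haplotype diversity reported above can be checked from the haplotype frequencies given in the Results. The sketch below assumes the commonly used unbiased estimator h = n/(n − 1) × (1 − Σ p_i²). The integer counts are not stated in the text: they are inferred here from the reported percentages and the 63 plants analysed (for example, 41.3% of 63 is 26), so they should be read as an assumption.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype (gene) diversity from a list of haplotype counts."""
    counts = list(counts)
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    sum_squared_freqs = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_squared_freqs)


# Counts inferred (assumption) from the reported frequencies with n = 63:
# H2 41.3%, H1 30.2%, H3 15.9%, H4 7.9%, and three singletons (H5, H6, H7).
inferred_counts = {"H1": 19, "H2": 26, "H3": 10, "H4": 5, "H5": 1, "H6": 1, "H7": 1}

h = haplotype_diversity(inferred_counts.values())
print(round(h, 4))  # 0.7179
```

The result agrees with the reported value of h = 0.7179, which supports the inferred counts.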

Table 4 Estimates of genetic diversity in Oxytropis chankaensis populations based on cpDNA sequence data
Table 5 Results from the analyses of molecular variance (AMOVA) of cpDNA sequence data for Oxytropis chankaensis populations
Table 6 Pairwise F_ST values between Oxytropis chankaensis populations

Putative mechanisms causing the observed pattern of cpDNA polymorphism could be inferred from the results of neutrality tests and MDA, though these inferences should be regarded with caution because different processes can produce the same patterns (Fu 1997; Rogers and Harpending 1992). Most tests of neutrality that were calculated for the entire data set or for the populations were statistically insignificant (P > 0.10; Table 7), indicating that populations are in mutation–drift equilibrium. However, the values of all neutrality tests except those for population NK tended to be positive, suggesting the presence of haplotypes with intermediate frequencies. In contrast, all of the tests for population NK had negative values, which resulted from the presence of rare haplotypes in this population. The shapes of the mismatch distributions were bimodal for most populations and for the entire data set, which indicated the presence of two frequent haplotypes (Fig. 2). A bimodal distribution has been shown to be consistent with the pattern expected in populations whose size has either stayed constant or contracted (Harpending et al. 1998; Slatkin and Hudson 1991). However, the SSD and r-index values were insignificant in all cases (Table 7), which did not allow us to reject an expansion model completely. It should be noted that, for populations KR and SI, the 95% confidence intervals for θ_0 and θ_1 overlapped, which was inconsistent with the hypothesis of population growth (Schneider and Excoffier 1999). For other populations, and for the entire data set, the 95% confidence intervals for θ_0 and θ_1 did not overlap, implying that these populations could have undergone an expansion in the past. Only for population TR (Fig. 2) was the shape of the mismatch distribution unimodal, with the SSD and r-index values fitting an expansion model.

Table 7 Test statistics and mismatch distribution results based on cpDNA sequence data for Oxytropis chankaensis populations
Fig. 2

Mismatch distribution of pairwise nucleotide differences in Oxytropis chankaensis. The lines with filled squares show the observed distributions of pairwise nucleotide differences between haplotypes both within populations and for the entire data set. Lines with small dots represent the expected distributions fitted to the data under a model of population expansion. For population codes, see Table 1

Discussion

O. chankaensis is an endemic species with a very narrow geographic range and high habitat specificity. Small, isolated populations of such endemic species are prone to genetic drift and inbreeding, which can erode genetic diversity, reduce fitness, and threaten the long-term survival of populations, even in the absence of habitat destruction (Ellstrand and Elam 1993). Previously, we found higher levels of genetic diversity based on allozyme (He = 0.301, Kholina et al. 2009) and RAPD (He = 0.290, unpublished data) markers in O. chankaensis, compared with the average values for endemic plants (He(allozyme) = 0.076, Godt et al. 1996; He(RAPD) = 0.20, Nybom 2004). In the present study, based on cpDNA, we found low nucleotide diversity. Our finding of few polymorphisms within a total of 2,798 bp of cpDNA sequence is consistent with the low mutation rate in the chloroplast genome that has been estimated for the genus Oxytropis, 8.9 × 10^−10 substitutions per site per year (Wojciechowski 2005). Likewise, low nucleotide diversity in cpDNA has been found in other endemic species (e.g., Petunia exserta, π = 0.0007, Lorenz-Lemke et al. 2006; Hymenaea stigonocarpa, π = 0–0.0027, Ramos et al. 2007; Aconitum gymnandrum, π = 0–0.0050, Wang et al. 2009), and there is no cpDNA variation at all in nine populations of Heptacodium miconioides (Lu et al. 2006).

The most striking feature of O. chankaensis that was revealed in our investigation is its unexpectedly high cpDNA haplotype diversity (h = 0.7179) for a species that is confined to a very restricted zone (less than 0.5 km²). High levels of total haplotype diversity have been observed in some widespread species (e.g., Acacia acuminata, h = 0.9196, Byrne et al. 2002) and in endemic species with wider areas of distribution (e.g., Phyllodoce nipponica, h = 0.852, Ikeda and Setoguchi 2007; Petunia exserta, h = 0.657, Lorenz-Lemke et al. 2006; Hymenaea stigonocarpa, h = 0.804, Ramos et al. 2007; Aconitum gymnandrum, h = 0.739, Wang et al. 2009). In most angiosperms, populations often are fixed for single cpDNA haplotypes, and polymorphic populations possessing different haplotypes occur in potential contact zones of the different maternal lines or at sites of long-term persistence. In contrast, no populations of O. chankaensis that we sampled were fixed for a single haplotype, and all populations displayed high levels of haplotype diversity, sharing two or three common haplotypes (Fig. 1b). The uniform levels of diversity that we detected across the range of O. chankaensis might be due to these populations originating from a once continuous ancestral population, and the occurrence of several cpDNA haplotypes in populations could be explained by polymorphisms that were present in the putative ancestor. According to coalescent theory (Posada and Crandall 2001), the H7 haplotype may be a more ancient haplotype because it is found in a central position in the network (Fig. 1d). However, the most frequent haplotype, H2, could also be considered an old haplotype because it exhibits three connections to other haplotypes and is found at a high frequency in our study populations.
The occurrence of both these haplotypes in population PS agrees with our previous results based on allozyme data, suggesting that this population could be, putatively, the centre of the species formation (Kholina et al. 2009).

The presence of different cpDNA haplotypes within the extremely narrow geographic range of O. chankaensis could also be explained by the recurrent polyploidy events in the evolutionary history of this species. If tetraploid O. chankaensis originated once, then we would expect little if any cpDNA diversity. Multiple origins of polyploidy could increase amounts of variation in a species, also by adding different maternal lineages (Parisod et al. 2010; Soltis and Soltis 2009), and the number of different cpDNA haplotypes in a species indicates the maximum possible number of origins of polyploidy (Segraves et al. 1999). Our results allowed us to assume that the current populations of O. chankaensis have originated from at least three polyploidization events. During the cold and dry periods of the Holocene, when forest-steppe occupied the southern part of the Russian Far East, the steppe vegetation, which could have included a putative diploid progenitor of O. chankaensis, was more widespread throughout the Khanka plain than now (Bazarova et al. 2008; Mokhova et al. 2009). The new tetraploid species could have arisen by several polyploidy events within a genetically diverse diploid parent population of steppe species. The ancestral forms might have been lost to extinction during subsequent climatic changes and recurrent transgressions and regressions of Khanka Lake, while newly formed polyploids survived in a narrow coastal zone due to greater genetic flexibility of polyploids (Rausch and Morgan 2005) and higher productivity. In some groups of plants, including Oxytropis species (Jorgensen et al. 2003), the presence of mixed cytotypes within populations or different cytotypes at adjacent territories is relatively common. However, no diploid or triploid plants of relative congeners that could be progenitors of tetraploid O. chankaensis occur in close proximity to the current range of this species. Some diploid (2n = 16) species of sect. 
Baicalia occur in China, Mongolia and Korea (Zhu et al. 2010), e.g., O. oxyphylla, O. lanata, O. myriophylla, O. pumila, and O. ochrantha. In the view of Yurtsev (1964), O. koreana (which is now regarded as a synonym of O. racemosa, Bisby et al. 2009; Zhu et al. 2010) is the closest extant relative of O. chankaensis. However, it is difficult to determine whether one of these species could be a putative ancestor of O. chankaensis or whether these closely related species of sect. Baicalia descended from a common progenitor.

Fluctuations in population size due to lake-level oscillations can result in a loss of genetic diversity due to population bottlenecks. However, severe or long-lasting population bottlenecks are considered likely to result in the extinction of some haplotypes, whereas all of the haplotypes we found in O. chankaensis are contiguous to one another across the entire genetic network (Fig. 1d). The overall pattern of the genetic variation in noncoding regions of cpDNA is in agreement with the expectations of the neutral equilibrium model of evolution, though selection cannot be excluded entirely. However, it is unlikely that selection would strongly affect noncoding regions of cpDNA. The results of neutrality tests and MDA for population NK are consistent with a loss of intermediate haplotypes and a recent increase in a small number of surviving haplotypes (Fig. 2; Table 7). For most other populations, these results are consistent with a recent contraction or population stability, though expansion cannot be ruled out completely because the signs of past expansions could have been erased by repeated fluctuations in habitat size. The smallest MDA expansion age parameter (τ, Table 7) for population NK and its distance from population TR imply that population NK could have been recolonised from population TR, which is the only population containing both haplotypes that are frequent in NK and where the traces of recent population expansion were detected using MDA (Fig. 2). The occurrence of unique haplotypes in population NK might have resulted from new mutations or from rare seed dispersal events from the Chinese part of the species range, where some O. chankaensis plants have been observed on a natural sandy spit lying between Khanka Lake and Xiaoxingkai Lake (I.V. Maslova, researcher at the Khankaisky Nature Reserve, pers. comm.).
The long genesis of Khanka Lake (presumably from the middle Miocene) is thought to have involved a series of transgressions and regressions, including nearly complete desiccation and draining of the surrounding plains (Korotkii et al. 2007). Even at present, Khanka Lake has no constant shoreline, and the demography of O. chankaensis is characterised by potential instability through periodic habitat reduction and expansion events. Additionally, populations of O. chankaensis have become more fragmented due to anthropogenic influences in the region, which have increased in recent decades.

The variation in haplotype frequency among populations and the low, mostly insignificant, pairwise FST values between them (Table 6) might be a consequence of genetic drift occurring after the splitting of a formerly more continuous population. Spatial genetic patterning in cpDNA was very weak, and the standardised differentiation estimate for the plastid genome (GST = 0.146) is much lower than the average value found for angiosperms (GST = 0.637, Petit et al. 2005). In the majority of angiosperms, including endemic and polyploid species, plastid DNA is generally more highly structured than the nuclear genome (Kao 2008; Korpelainen 2004; Petit et al. 2005). In contrast, the low level of cpDNA subdivision across O. chankaensis populations (GST = 0.037; ΦST = 0.090) is comparable with the low levels of differentiation for the nuclear genome that were found previously from allozyme and RAPD markers (ΦST-RAPD = 0.13, Kholina et al. 2007; GST-alloz = 0.028, Kholina et al. 2009). Mantel test results for the plastid genome point to a lack of isolation by distance, in agreement with the results based on allozyme and RAPD data (unpublished data). Low population partitioning (a cohesive genetic system) and the lack of phylogeographic patterning (as judged from the NST/GST comparison) may be attributed to both recent fragmentation and extensive gene flow. The pollen-to-seed gene flow ratio that we estimated (4.56) is low compared with the median value of 17 determined across 93 species (Petit et al. 2005). This ratio is consistent with the hypothesis that interpopulation gene flow is predominantly due to seed dispersal, which, in O. chankaensis, occurs mainly through dispersal of mature pods by wind and water, though random dispersal of seeds between populations by humans also cannot be ruled out.
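
For readers who wish to see how a pollen-to-seed gene flow ratio of this kind can be derived from differentiation estimates, the following is a minimal sketch of the estimator of Ennos (1994), which contrasts differentiation at biparentally inherited (nuclear) markers with that at maternally inherited (plastid) markers. The text above does not state which estimator or which input values produced the reported ratio, so the use of this formula here is an assumption. The function name, its arguments, and the input values are hypothetical and purely illustrative. The estimator also assumes a diploid species at drift-migration equilibrium under an island model, which may not hold for a tetraploid such as O. chankaensis.

```python
def pollen_to_seed_ratio(gst_biparental, gst_maternal, fis=0.0):
    """Pollen/seed migration ratio following Ennos (1994).

    ratio = [A * (1 + FIS) - 2 * C] / C
    where A = 1/GST(biparental) - 1 and C = 1/GST(maternal) - 1.

    gst_biparental: differentiation at nuclear (biparentally inherited) markers
    gst_maternal:   differentiation at maternally inherited (e.g. cpDNA) markers
    fis:            inbreeding coefficient (0 under random mating)
    """
    for name, value in (("gst_biparental", gst_biparental),
                        ("gst_maternal", gst_maternal)):
        # GST of exactly 0 or 1 makes the estimator undefined or degenerate.
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    a = 1.0 / gst_biparental - 1.0  # nuclear term
    c = 1.0 / gst_maternal - 1.0    # maternal (plastid) term
    return (a * (1.0 + fis) - 2.0 * c) / c


if __name__ == "__main__":
    # Illustrative values only, not taken from the study.
    print(pollen_to_seed_ratio(gst_biparental=0.1, gst_maternal=0.5))
```

Under this formula, the ratio falls as plastid differentiation approaches nuclear differentiation, which is the qualitative pattern described above for O. chankaensis.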

Thus, the genetic structure that we observed in O. chankaensis can apparently be explained by both biological traits and historical factors. Although our data do not allow us to draw a confident conclusion about the origin of O. chankaensis, independent lines of evidence (data from noncoding region sequences of cpDNA, RAPDs, and allozymes; Artyukova et al. 2004; Kholina et al. 2004, 2007, 2009) suggest that the most likely hypothesis is that this species originated from a putative diploid progenitor by recurrent polyploidy events. The life history of O. chankaensis includes frequent extinction, recolonisation, and expansion due to repeated fluctuations in the Khanka Lake water level. In spite of the capacity for rapid population growth through vigorous seed production (Kholina et al. 2003), the extinction of at least three local populations (Fig. 1b) over the past 30–50 years is an obvious indication of the ongoing decline of O. chankaensis populations in both number and size. The ultimate goals of conservation are to ensure the continued survival of populations and to maintain their evolutionary potential. Our data show that the full complement of cpDNA haplotypes can be detected in three populations (PS, NK, and TR). However, the allozyme diversity data (Kholina et al. 2009) indicate that ensuring the presence of the full complement of alleles requires the inclusion of all populations. Given the potential demographic instability of this endemic species and its limited habitat preferences, we suggest conserving all five of the remnant populations; the most suitable strategy for the conservation of O. chankaensis is the protection of its natural habitats.