Introduction

The green turtle, Chelonia mydas, is a marine reptile present in all tropical and subtropical waters across the globe. The species exhibits strong philopatry and natal homing: individuals return to the same reproductive ground over years and females tend to nest on the beach from which they originate [1]. This leads to a complex network of migrations between feeding, reproductive, and nesting grounds, and to a complex genetic structure worldwide with populations breeding separately but occurring at the same feeding grounds [2].

Due to unregulated harvest over the past centuries, green turtle populations are now threatened with extinction according to the International Union for Conservation of Nature (IUCN) Red List assessment, with the exception of the Hawaiian and South Atlantic populations, which have benefited from conservation actions [3,4,5]. Despite international and local regulations, the species is still exposed to numerous threats, including unsustainable harvest, poaching, coastal degradation, and bycatch [3].

It is therefore important to understand and fully characterize the population structure as well as the reproduction dynamics of the species, to facilitate adequate management policies. Mitochondrial DNA has been used to reveal a female-based genetic structure, from oceanic to regional scales [6, 7]. However, this female-inherited marker is unable to provide any insights concerning the gene flow and connectivity driven by the male component of populations or about contemporary genetic exchanges. As such, nuclear markers, such as microsatellites, have been used to understand the population structure as a whole. Microsatellites are short repeated DNA motifs of typically 1 to 6 nucleotides that are widely distributed throughout the genome of eukaryotes, and whose number of repeats can vary from one individual to another [8]. They are highly polymorphic and exhibit a high mutation rate (10⁻⁴ to 10⁻³ mutations per locus per generation). They are co-dominant markers, making them suitable for a wide range of genetic analyses [9].
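As an illustration of the definition above, the following minimal Python sketch scans a DNA string for perfect tandem repeats of 1 to 6 nucleotide motifs. It is not the pipeline used in this study: the function name, the default threshold of 6 repeats, and the toy sequence are assumptions chosen for the example only.

```python
import re

def find_microsatellites(seq, min_repeats=6, max_motif_len=6):
    """Return (start, motif, n_repeats) for each perfect tandem repeat found.

    min_repeats and max_motif_len are illustrative defaults, not values
    prescribed by any particular library-preparation pipeline.
    """
    # The lazy quantifier makes the regex try the shortest motif first,
    # so "CACACA..." is reported as (CA)n rather than (CACA)n.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Toy sequence: a (CA)8 dinucleotide repeat between two short flanks.
example = "TTG" + "CA" * 8 + "GGT"
print(find_microsatellites(example))
```

Only perfect repeats are detected here; interrupted or compound repeats would need a more tolerant search.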

Discrepancies between the degrees of connectivity inferred from mtDNA and microsatellite markers in green sea turtles suggested male-mediated connectivity [10,11,12], or male philopatry [13,14,15]. In addition to determining population structure, microsatellite markers are useful for parentage analysis [9]. In the case of sea turtles, while individual behaviors are still largely unknown, parentage analyses conducted on samples collected from females and hatchlings can reveal information concerning male reproductive behavior and migration. This provides access to important conservation metrics such as the operational sex ratio (OSR). However, more than 15 loci are needed for accurate parentage inference or population assignments [16, 17]. It is therefore important to have a variety of markers available, with different sizes, repeat motifs, and annealing temperatures, to facilitate PCR multiplexing and allele scoring in such analyses. While the development of a species-specific bank of microsatellites is essential for parentage analysis, creating such a bank is extremely costly and time-consuming [9]. This is why testing the transferability of microsatellites to related species is vitally important. The cross-species transferability of microsatellite markers is variable among taxa, but is usually successful among sea turtles [18,19,20].

To date, 36 microsatellite markers have been developed specifically for C. mydas [21,22,23], and another 13 which were developed for the loggerhead, the hawksbill, the olive ridley and the Kemp’s ridley turtles, have been used on C. mydas in population structure and multiple paternity analyses [10, 11, 14, 24,25,26,27]. These studies typically used between 2 and 13 markers. For the loggerhead turtle, Caretta caretta, 42 markers are available [18, 19, 21]; and 39 have been developed for the hawksbill turtle, Eretmochelys imbricata [20, 21, 28, 29]. The aim of the present study was to develop a new set of polymorphic microsatellite markers specific for C. mydas, and to test their amplification on two additional species: the hawksbill turtle, E. imbricata, and the loggerhead turtle, C. caretta.

Materials and methods

Sample collection

Biopsies were performed on 107 adult specimens of Chelonia mydas. Approximately 0.5 cm³ of skin and muscle tissue was collected from the posterior fin of nesting females on Tetiaroa Atoll, French Polynesia, between 2010 and 2021. Samples were stored in 90% ethanol and kept at 4 °C or −20 °C until processing. All the samples were collected by the local NGO Te mana o te moana based in Moorea, French Polynesia, with authorizations from the Direction of Environment of French Polynesia (DIREN). For cross-species amplification, hawksbill and loggerhead turtles were tested for transferability of microsatellite markers. Seventeen individual hawksbill turtles (E. imbricata) from French Polynesia, originating from strandings and poaching seizures, were provided by DIREN, and 16 individual loggerhead turtles (C. caretta) were provided by the Réseau Tortues Marines de Méditerranée Française (RTMMF) stranding network. Loggerhead samples correspond to injured individuals rescued by the RTMMF and strandings from the Mediterranean Sea.

DNA extraction and microsatellite marker design

Total genomic DNA was extracted using the QIAamp 96 DNA QIAcube HT Kit and the QIAcube HT DNA extraction robot (QIAGEN GmbH, Hilden, Germany) following the manufacturer’s protocol. The first step was modified as follows: 1 mm² of tissue was digested in 200 µL of digestion buffer with 20% Proteinase K (QIAGEN), and left at 56 °C overnight. Total genomic DNA was quantified using an Epoch BioTek spectrophotometer (Agilent, Santa Clara, US) and an equimolar pool of 8 samples of C. mydas (total quantity 3 µg) was sent to GenoScreen (Lille, France) for microsatellite library preparation and sequencing. Samples were sequenced on an Illumina MiSeq platform using a Nano v2 2 × 250 cycles chip. A total of 3361 primer pairs were designed. Among these, 50 pairs were selected and tested based on their repeat number (≥ 6), motif, and PCR product size (> 100 bp). The selected pairs included 23 dinucleotide (DRM), 15 trinucleotide (TRM), and 12 tetranucleotide (TeRM) repeat motifs. For each motif, various ranges of product size were selected in order to minimize overlapping size ranges and facilitate fragment analysis while multiplexing.

Molecular analyses

The 50 selected primer pairs were tested at four annealing temperatures (53°C, 57°C, 60°C, and 63°C). PCR amplifications were performed on 4 DNA samples of C. mydas for each primer pair and temperature, using the Type-it Microsatellite PCR kit (Qiagen) in 11 µL total volume reactions containing 4 µL Type-it Multiplex PCR Master Mix 2X (contains HotStarTaq® Plus DNA Polymerase, Type-it Microsatellite PCR Buffer with 6 mM MgCl2 and dNTPs), 5 µL RNase-free water, 1 µL primers (2 µM forward and reverse primers diluted in 1xTE pH 8 buffer) and 1 µL of DNA template at 10–20 ng/µL. PCR cycles consisted of: 5 min at 95°C, followed by 45 cycles of 30 s at 95°C, 1 min 30 s at the annealing temperature, 30 s at 72°C, and a final extension step of 30 min at 60°C. Amplification success was checked on 1.5% agarose gel and visualized with ethidium bromide. Out of these 50 loci, 39 were successfully amplified for at least one temperature, including 21 DRM, 12 TRM, and 6 TeRM, and 25 were multiplexed for further characterization on all samples from each of the three species. For multiplexing, forward primers were labeled with a fluorescent dye on the 5’ end (ATTO565, ATTO550, FAM, YAKYE; Eurofins Genomics, Ebersberg, Germany) (Table 1). PCR products were sent to GenoScreen and allele sizes were assessed using an Applied Biosystems 3730 Sequencer. For accurate sizing, an internal size ladder (GeneScan 500 LIZ, Applied Biosystems) was used. The 25 selected markers were then tested at optimal annealing temperature on 17 E. imbricata individuals and 16 C. caretta individuals for cross-species amplification.

Table 1 Characterization of the 24 microsatellites developed for Chelonia mydas in this study. Microsatellite markers were included in five multiplexes corresponding to the annealing temperature Ta. N, number of samples; Na, number of alleles; Ho, observed heterozygosity; He, expected heterozygosity; HWE, deviation from Hardy-Weinberg equilibrium (p-values shown); FIS, fixation index; LD, linkage disequilibrium; PA, private alleles; Null alleles, Yes/No.

Data analysis

Allele sizes were visually assessed using GENEMAPPER software v.5 (Applied Biosystems) on the 107 C. mydas individuals. Allele size call consistency over all samples was checked twice, and approximately 5% of the total dataset was read by a second person to compare size calls. All ambiguous peak profiles were considered missing data. MICROCHECKER v.2.2.3 [30] was used to identify null alleles, stuttering errors, and large allele dropout. For each locus, allele frequencies, total number of alleles (Na), private allele number (PA), observed and expected heterozygosities (Ho and He), and divergence from Hardy-Weinberg equilibrium were calculated with GenAlEx v.6.503 [31]. The inbreeding coefficient (FIS) and linkage disequilibrium (LD) were calculated using GENETIX v.4.05.2 [32]. LD was calculated between pairs of loci, and the percentage of LD per locus was defined as the percentage of combinations showing significant LD for each locus. To detect potential siblings among the 107 C. mydas individuals from Tetiaroa, French Polynesia, the software COLONY v.2.0.6.6 [33] was run 3 times with different starting seeds with a model that allowed for inbreeding and polygamy for both sexes. A full-likelihood analysis method was used with high precision on long runs and no sibship prior. Dyads were considered full-siblings or half-siblings if their probability was greater than 0.95 for the 3 runs.
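The per-locus summary statistics named above can be sketched in a few lines of Python. This is an illustrative simplification, not a reimplementation of GenAlEx or GENETIX: He is taken as 1 − Σp², and FIS is approximated as 1 − Ho/He, whereas dedicated software may apply sample-size corrections or other estimators. The function name and the toy genotypes are assumptions.

```python
from collections import Counter

def locus_stats(genotypes):
    """Return (n, Na, Ho, He, Fis) for one locus.

    genotypes: list of (allele1, allele2) tuples; None marks missing data.
    Fis is the simple 1 - Ho/He approximation (an assumption of this sketch).
    """
    scored = [g for g in genotypes if g is not None]
    n = len(scored)
    counts = Counter(allele for g in scored for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    ho = sum(a != b for a, b in scored) / n   # observed heterozygosity
    he = 1.0 - sum(p * p for p in freqs)      # expected heterozygosity
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return n, len(counts), ho, he, fis

# Hypothetical allele sizes (bp) at one dinucleotide locus; one sample is missing.
toy_locus = [(100, 100), (100, 104), (104, 104), (100, 104), None]
print(locus_stats(toy_locus))
```

A positive FIS indicates fewer heterozygotes than expected under Hardy-Weinberg equilibrium, and a negative value indicates an excess.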

For E. imbricata and C. caretta samples, allele sizes were assessed with GENEMAPPER software v.5, and the total number of alleles (Na) and number of private alleles (PA) were calculated with GenAlEx v.6.503. To explore whether this set of markers was able to detect genetic variance among species, a Principal Coordinates Analysis (PCoA) was computed and the Nei unbiased genetic distance was calculated between the three species, both with GenAlEx v.6.503. Samples that had more than 16% missing data were removed from the PCoA and the Nei distance calculation.
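For readers unfamiliar with the distance used here, the sketch below computes Nei's (1978) unbiased genetic distance between two groups from multilocus genotypes, D = −ln(Jxy / √(Jx·Jy)), with the sample-size correction applied to Jx and Jy. It is a minimal illustration under stated assumptions (hypothetical function names and data layout, loci with at least one scored genotype in each group) and may differ in detail from the GenAlEx implementation.

```python
import math
from collections import Counter

def allele_freqs(genotypes):
    """Return ({allele: frequency}, n_scored) for one locus; None = missing."""
    scored = [g for g in genotypes if g is not None]
    counts = Counter(allele for g in scored for allele in g)
    total = 2 * len(scored)
    return {a: c / total for a, c in counts.items()}, len(scored)

def nei_unbiased_distance(pop_x, pop_y):
    """pop_x, pop_y: dict mapping locus name -> list of genotype tuples.

    Assumes every shared locus has at least one scored genotype per group.
    """
    jx = jy = jxy = 0.0
    loci = sorted(set(pop_x) & set(pop_y))
    for locus in loci:
        fx, nx = allele_freqs(pop_x[locus])
        fy, ny = allele_freqs(pop_y[locus])
        # Unbiased within-group identities (Nei 1978 sample-size correction).
        jx += (2 * nx * sum(p * p for p in fx.values()) - 1) / (2 * nx - 1)
        jy += (2 * ny * sum(p * p for p in fy.values()) - 1) / (2 * ny - 1)
        # Between-group identity needs no correction.
        jxy += sum(p * fy.get(a, 0.0) for a, p in fx.items())
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    k = len(loci)
    return -math.log((jxy / k) / math.sqrt((jx / k) * (jy / k)))

# Hypothetical two-group, one-locus example.
group_a = {"locus1": [(100, 100), (100, 100)]}
group_b = {"locus1": [(100, 104), (100, 104)]}
print(nei_unbiased_distance(group_a, group_b))
```

Because of the sample-size correction, the estimate can be slightly negative when two small samples have nearly identical allele frequencies.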

Results

Genetic diversity

A panel of 25 polymorphic microsatellite loci was successfully developed and showed clear amplification profiles. However, one locus (CMY12) presented stuttering errors and 19% null alleles and was therefore removed from further analyses. The 24 remaining loci were all polymorphic, with a number of alleles per locus ranging from 2 to 17, an average of 8 alleles per locus, and a total of 191 alleles (Table 1). Sixteen loci exhibited a dinucleotide repeat motif (DRM), 7 showed a trinucleotide repeat motif (TRM), and 1 contained a tetranucleotide repeat motif (TeRM). DRM loci showed 4 to 17 alleles (mean: 9, total: 146), while TRM loci displayed 2 to 10 alleles (mean: 4, total: 31), and the TeRM locus had 14 alleles.

Table 2 Fullsib and Halfsib dyads of C. mydas samples from Tetiaroa. FS: full-siblings; HS: half-siblings. The probability is calculated as the mean probability of the three runs with Colony software

MICROCHECKER analyses only revealed the likely occurrence of null alleles for locus CMY07 (6.57%), and no evidence of stuttering errors or large allele dropout was detected on any of the 24 loci. Expected heterozygosity (He) ranged from 0.174 to 0.886 (mean: 0.649 ± 0.039), while observed heterozygosity (Ho) was overall slightly lower, ranging from 0.187 to 0.860 (mean: 0.631 ± 0.037) (Table 1). The inbreeding coefficient Fis ranged from −0.088 to 0.196 and was significantly divergent from zero for 5 loci (CMY07, CMY08, CMY16, CMY19, CMY25) (Table 1). Total Fis was also significant (Fis = 0.034, p-value < 0.001). Ten loci (CMY07, CMY09, CMY10, CMY11, CMY14, CMY15, CMY18, CMY19, CMY26, CMY33) deviated significantly from Hardy-Weinberg equilibrium (p-value < 0.05). The p-value was lowest (< 0.001) for CMY07, CMY18, CMY26, and CMY33. Significant linkage disequilibrium was also identified, as 8.3% of the pairwise loci combinations showed significant disequilibrium after sequential Bonferroni correction (Online Resource 1). Eight loci (CMY15, CMY18, CMY20, CMY22, CMY25, CMY27, CMY29, CMY32) were not linked with any of the others. The rest of the loci showed a percentage of linkage disequilibrium ranging from 4% (CMY21) to 22% (CMY17). Using all of the loci, Colony revealed 10 pairs of full-siblings and 2 pairs of half-siblings with a probability greater than 0.95 in the C. mydas samples from Tetiaroa (Table 2), corresponding to 20 different individuals out of the 107.
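The sequential Bonferroni correction applied to the pairwise linkage tests can be sketched as follows. With 24 loci there are 24 × 23 / 2 = 276 pairwise combinations, so each raw p-value is compared against a progressively relaxed threshold. This is a generic step-down (Holm-type) procedure written for illustration; the function name, the alpha level of 0.05, and the toy p-values are assumptions, and the software used in the study may differ in detail.

```python
from math import comb

def sequential_bonferroni(pvalues, alpha=0.05):
    """Return a list of booleans (input order): True = still significant.

    P-values are ranked from smallest to largest; the i-th smallest
    (0-based rank) is compared with alpha / (m - rank), and testing
    stops at the first non-significant result.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        if pvalues[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # all larger p-values are non-significant as well
    return significant

# Number of locus pairs tested for 24 loci.
n_pairs = comb(24, 2)

# Hypothetical raw p-values for four locus pairs.
toy_p = [0.001, 0.04, 0.03, 0.012]
print(n_pairs, sequential_bonferroni(toy_p))
```

In the toy example, two of the four tests remain significant, although three of them have raw p-values below 0.05.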

Cross-species amplification

The 25 loci were tested for amplification on E. imbricata and C. caretta samples (Table 3). CMY12 was included as it showed a clear peak profile despite the detection of null alleles and stuttering errors on C. mydas. All loci were successfully amplified on both species, but the success rate depended on both locus and species. For C. caretta, amplification success ranged from 50% (CMY09) to 100% of samples (CMY04, CMY15, CMY17, CMY19, CMY25, CMY35), with an average of 87%. For E. imbricata, amplification success was lower on average (72%), and ranged from 35% (CMY18) to 100% (CMY26). Allele polymorphism varied across species and loci, ranging from 1 to 10 alleles per locus (Table 3). It was higher in E. imbricata, with 110 alleles in total, compared to 92 for C. caretta. Chelonia mydas presented a large number of private alleles (89), with an average of 3.71 per locus. Eretmochelys imbricata and C. caretta revealed 23 and 25 private alleles, respectively, corresponding to averages of 0.92 and 1 private allele per locus. Five loci were monomorphic for C. caretta, one of which was also monomorphic for E. imbricata (CMY11) (Table 3).
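Private alleles, as counted above, are alleles observed at a locus in one species and in none of the others. A minimal Python sketch of that bookkeeping is given below; the function name, the data layout, and the allele sizes are hypothetical and serve only to make the definition concrete.

```python
def private_alleles(alleles_by_species):
    """Return {species: {locus: set of alleles seen only in that species}}.

    alleles_by_species: dict species -> dict locus -> set of allele sizes.
    """
    result = {}
    for species, loci in alleles_by_species.items():
        result[species] = {}
        for locus, alleles in loci.items():
            # Pool the alleles observed at this locus in every other species.
            others = set()
            for other_species, other_loci in alleles_by_species.items():
                if other_species != species:
                    others |= set(other_loci.get(locus, ()))
            result[species][locus] = set(alleles) - others
    return result

# Hypothetical allele sizes (bp) at a single locus.
toy = {
    "C. mydas": {"locusA": {180, 182, 184, 190}},
    "C. caretta": {"locusA": {182, 186}},
    "E. imbricata": {"locusA": {182, 184}},
}
for species, loci in private_alleles(toy).items():
    print(species, {locus: sorted(a) for locus, a in loci.items()})
```

As the counts depend on which alleles happen to be sampled, small sample sizes can inflate the number of apparently private alleles.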

Table 3 Cross-species amplification of the 25 developed loci for two species of marine turtles (Caretta caretta and Eretmochelys imbricata). N/Ntot: number of individuals successfully amplified/number of individuals tested; Na: number of alleles, PA: number of private alleles

For the PCoA and Nei distance calculation, 107 samples of C. mydas, 11 samples of C. caretta, and 9 samples of E. imbricata were retained as they showed less than 16% missing data. The discarded samples had between 20% and 88% missing data. The PCoA clearly discriminated between the three species, and C. mydas appeared distant from the two others on the first axis, which covered the majority of the variance (Fig. 1). Nei unbiased genetic distance was greatest between C. mydas and C. caretta (1.575), followed by C. mydas and E. imbricata (1.221). The distance between C. caretta and E. imbricata was only 0.646 (Table 4).

Fig. 1 Principal coordinates analysis on 3 marine turtle species analyzed with 24 newly developed microsatellite markers. Variance explained by each axis is shown in brackets

Table 4 Nei unbiased genetic distance between the three species C. mydas, C. caretta, and E. imbricata

Discussion

This study successfully developed 24 microsatellite markers specific to the green sea turtle, Chelonia mydas. A total of 191 alleles were retrieved in the dataset. This set is composed of 16 di- (DRM), 7 tri- (TRM), and 1 tetranucleotide (TeRM) repeat motifs. These markers add to the 36 markers previously developed for C. mydas that included 5 DRM, 4 TRM, and 27 TeRM [21,22,23]. TRM and TeRM are usually sought after due to the larger number of base pair differences between alleles, which decreases the risk of stuttering errors [9]. TeRM are usually less polymorphic than TRM and DRM [9]. However, in this study and in accordance with Dutton & Frey [22], TeRM showed the highest level of polymorphism (14 alleles per locus), followed by DRM (9 alleles per locus) and TRM (4 alleles per locus). With an average of 8 alleles per locus, the level of polymorphism of these loci is robust enough for structure and parentage analyses of C. mydas populations [16, 34]. It is comparable to that reported by Dutton & Frey [22] (8.33 alleles per locus), and lower than those found by FitzSimmons et al. [21] with 4 DRM (18.5 alleles per locus) and Shamblin et al. [23] with 20 TeRM (12.5 alleles per locus). For other sea turtle species, the level of polymorphism ranges from 5.25 alleles per locus in hawksbill turtles [20] to 11.18 alleles per locus in loggerhead turtles [18].

Levels of observed and expected heterozygosity (Ho and He) fall in the lower range of heterozygosity levels published for C. mydas populations from other regions [10, 11, 14, 22, 24]. They were closer to those found in the Mediterranean populations (Ho: 0.652–0.671 / He: 0.645–0.671) [14]. They were also lower than levels reported in populations of other sea turtle species, such as the loggerhead turtle [18, 35].

Ten out of 24 loci showed a significant departure from the Hardy-Weinberg equilibrium, although the pattern of departure was variable across loci, with 6 loci showing a deficit in heterozygosity and 4 loci exhibiting an excess of heterozygosity (Table 1). Heterozygosity deficiency can be due to selection, population substructure leading to a Wahlund effect, null alleles, or inbreeding [36]. A spatial Wahlund effect is unlikely because all of the samples were collected on females nesting on the same island of French Polynesia, Tetiaroa. As green turtles exhibit strong natal homing, we can reasonably assume that all of these females are from the same population. Samples were pooled across several nesting seasons between 2010 and 2021, and thus a temporal Wahlund effect is plausible. However, a Wahlund effect would affect all loci [37]. Null alleles were detected with MICROCHECKER at only one locus (CMY07). As this locus also showed a departure from the Hardy-Weinberg equilibrium and a significant Fis, it was removed from further analyses on C. mydas populations. Null alleles were not detected on any other locus; however, CMY19 also presented a significant Fis and a departure from the Hardy-Weinberg equilibrium. This can be a sign of a genotyping artefact, such as null alleles [38], which is why we recommend using this locus with caution and performing the analyses with and without it to rule out any bias. Finally, inbreeding is a possible cause of heterozygosity deficiency in this population, as the total Fis differs significantly from zero (0.034, p-value < 0.001). The population size is estimated at around 1000 breeding females annually in French Polynesia [39]. Although more recent assessments are needed, the present estimate is consistent with that from annual surveys of Tetiaroa Atoll’s nesting population, where 20 to 940 nests were recorded annually between 2008 and 2019 [40, 41]. This small estimated population size, coupled with the philopatric behavior of green turtles on reproductive and nesting sites, can lead to inbreeding [42]. The sibship analysis among our samples revealed that about 19% of the samples (20 of 107 specimens) have at least one full- or half-sibling in the dataset, suggesting that the significant Fis values may, in part, be explained by inbreeding and family structure in this French Polynesian population.

On the other hand, heterozygosity excess is generally associated with missing data or genotyping errors [43]. It can also be due to associative overdominance, if the neutral microsatellite locus is linked with a locus under selection favoring heterozygosity [44]. Missing data were present in CMY11 (6%) and CMY18 (14%). CMY33 and CMY26 might present genotyping errors, and similar to CMY19, we recommend using these loci with caution.

Levels of linkage disequilibrium (LD) are variable across species and populations, and are the result of many forces such as selection, genetic drift, mutation, gene conversion, epistasis, and recombination [45]. This population of C. mydas had 8.3% of the pairwise loci combinations showing significant LD, which is moderate compared to other species [45, 46]. Other sea turtle species, however, showed no LD with microsatellite markers in many of their populations [18, 19].

Sibship analysis revealed that most adult female full-siblings were sampled either with a 4-year gap or in the same year, and one dyad was sampled 8 years apart (Table 2). This provides a first indication of the reproduction frequency of green turtle females on Tetiaroa, which seems to be around 4 years. This is consistent with general knowledge on the green sea turtle, which places the reproductive frequency between 2 and 4 years [47]. Although this conclusion is preliminary, a more in-depth parentage analysis on a larger sample size that includes hatchlings will help reveal the reproductive behavior of both males and females. This suggests that this set of markers is promising for parentage analysis, which, coupled with field data, will give valuable insights into individual behavior.

Cross-species amplification was successful on the two species that were tested, C. caretta and E. imbricata, with 87% and 72% amplification success, respectively. Cross-species amplification is commonly used for microsatellite analysis in all the sea turtle species with a high rate of success [18,19,20,21,22]. Other reptilian taxa are also known to show a high rate of cross-species amplification success [48], while other marine taxa such as fishes and bivalves show a low rate of cross-species amplification success [38]. The success of cross-species amplification indicates that the flanking regions of microsatellite loci are conserved across sea turtle species, in line with the findings of FitzSimmons et al. [21]. This new set of loci could thus be useful for further studies on other sea turtle species, such as C. caretta and E. imbricata, but is also likely to amplify successfully on the closely related olive ridley (Lepidochelys olivacea) and Kemp’s ridley (L. kempii) turtles.

Furthermore, all loci revealed one or several private alleles for at least one of the species. The largest number of private alleles was found for C. mydas, due to the specific development of this microsatellite marker set for this species. The occurrence of private alleles, although it can be artificially increased by a low sample size, shows that this set of markers can be used to distinguish between the three sea turtle species.

Additionally, the PCoA showed a clear genetic differentiation between the three species. The green turtle appears to be more distant from the two other species, which is confirmed by the Nei unbiased genetic distance. This is in accordance with the phylogenetic distance between the species [49]. The hawksbill and the loggerhead turtles are more closely related and belong to the Carettini tribe, with a split between the species about 29 million years ago. In contrast, their common ancestor with C. mydas, which belongs to the Chelonini tribe, dates back about 63 million years.

In conclusion, 23 of the microsatellite loci developed here can be used to assess the genetic variability of Chelonia mydas populations and two other species of sea turtle. Most importantly, this marker set can help to unravel the behavior of both males and females by providing a high number of loci and alleles which are required for robust parentage analyses [17, 34]. This aspect of sea turtle biology, which to date is largely unknown, is critical for the conservation of populations, as increasing global temperatures are already driving a massive feminization within the world’s largest green turtle rookery [50].