Introduction

Lentil (Lens culinaris Medikus) is one of the oldest domesticated grain legumes. It is an annual, self-pollinated, diploid (2n = 14) cool-season legume crop with a haploid genome size of 4063 Mbp (Arumuganathan and Earle 1991). Lentil is an important source of dietary protein (22–35%) in both human nutrition and animal feed. It also provides rotational benefits for the management of weeds, diseases, and pests, and in many cases offers farmers a profitable, high-value crop option (Hamwieh et al. 2005; Phan et al. 2007). Lentil (Lens culinaris Medik. ssp. culinaris) is cultivated throughout Europe, Western Asia, the Middle East, North Africa, the Indian subcontinent, North America, and Australia. Archeological records place lentil domestication from its wild progenitor Lens culinaris ssp. orientalis in Syria and Turkey at approximately 8500 BC (Cubero 1981). Worldwide lentil production in 2016 was 6.3 million metric tons from an area of 5.48 million ha, the top producers being Canada, India, Turkey, and the United States of America (FAO 2013).

Molecular markers have been used by lentil breeders and geneticists in the genetic analysis of lentil (Kumar et al. 2014). Genetic diversity in lentil has been assessed using AFLP, RFLP, RAPD, ISSR, and SSR markers (Havey and Muehlbauer 1989; Sharma et al. 1996; Ferguson et al. 1998; Sonnante and Pignone 2001; Toklu et al. 2009; El-Nahas et al. 2011; Alghamdi et al. 2014; Dikshit et al. 2015; Idrissi et al. 2015; Tsanakas et al. 2018). Lentil genetic maps have also provided a better understanding of the lentil genome (Eujayl et al. 1998; Kahraman et al. 2014; Tanyolac et al. 2010; Saha et al. 2013; Ates et al. 2018). However, the lack of available molecular markers limits genetic and genomic analysis of lentil compared with other legumes, and the limited availability of molecular tools also hinders breeding programs aimed at improving lentil cultivars. Therefore, to enable breeders to produce varieties with high yield and better quality, efficient molecular tools such as markers should be developed and used in lentil breeding programs.

Among the different types of DNA markers, SSR markers are considered an important tool for studying genetic diversity, population structure, and phylogenetic relationships, as well as for construction of framework linkage maps, QTL interval mapping, map-based cloning of genes, and marker-assisted selection (MAS), thereby aiding the genetic improvement of crop plants (Hendre et al. 2007). SSRs have several genetic advantages, such as a high degree of polymorphism, multi-allelic nature, reproducibility, co-dominant inheritance, locus specificity, relative abundance, and good genome coverage (Powell et al. 1996).

The first genomic library for developing SSR markers in lentil was constructed from the cultivar ILL5588 using the SauIII restriction enzyme (Hamwieh et al. 2009). It was reported that 371 (0.18%) of the 200,000 clones screened with GT, GA, GC, GAA, TA, and TAA repeats contained microsatellites and that 243 (65.4%) of these were sequenceable. Of these 243 sequenced clones, 173 (71.2%) contained SSR motifs. Verma et al. (2014) used the lentil cultivar Precoz, sequenced 514 clones from genomic libraries enriched with GA/CT repeats, and reported that 375 (72.9%) of them contained three or more SSR motifs. Andeden et al. (2015) worked with the lentil genotype Karacadag, examined 432 clones from genomic libraries enriched with CA, GA, AAC, and ATG repeats, and found SSR motifs in 360 (83.3%) clones.

So far, only 244 polymorphic genomic SSR markers are available for lentil (Hamwieh et al. 2005, 2009; Verma et al. 2014; Andeden et al. 2015). Therefore, the objective of the present study was to develop a new set of SSR markers from a microsatellite-enriched genomic library of lentil and to determine the polymorphism rate of these markers for the analysis of genetic diversity in Turkish lentil genotypes. This new genomic resource of SSR markers should contribute significantly to the molecular breeding of lentil.

Material and Methods

Plant Material and DNA Extraction

A total of 23 lentil cultivars (Firat-87, Tigris, Seyran-96, Cagil, Altıntoprak, Yerli Kırmızı, Kafkas, Ankara Yesili, Bozok, Ceren, Gumrah, Karagul, Yusufhan, Ali Dayı, Ciftci, Meyveci-2001, Emre-20, Sultani-1, Sazak-91, Kayi-91, Ozbek, Malazgirt-89, Erzurum-89) developed in different research centers of Turkey, together with the black lentil cultivar Beluga, were used to amplify the developed SSR markers and to determine polymorphism ratios. Genomic DNA was isolated from fresh, young leaves of all accessions according to the protocol described by Lefort et al. (1998) with minor modifications. The quality and quantity of the extracted DNA were determined with a NanoDrop® ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and by agarose gel electrophoresis (0.8%).

Construction of a Genomic Library Enriched for the AG and AC Microsatellites

A genomic library of L. culinaris cv. Kafkas enriched for the AG and AC motifs was constructed using a modified protocol of Techen et al. (2010). For this purpose, biotinylated (AG)12 and (AC)12 oligoprobes and streptavidin-coated magnetic beads were used following the hybridization-based capture technique. Briefly, nuclear DNA of lentil (cv. Kafkas) was restricted with a combination of RsaI+AluI+HaeIII (NEB) in the same reaction. Genomic DNA fragments were A-tailed and ligated to specific adaptors [blunt end primers SSRLIBF3 (5′-CGGGAGAGCAAGGAAGGAGT-3′) and SSRLIBR3 (5′-Phos CTCCTTCCTTGCTCTCCCGAAAA-3′)]. The adaptor-specific primer SSRLIBF3 was used to amplify the adaptor-ligated DNA fragments. The amplified products were hybridized with the biotinylated AG and AC microsatellite oligos in the same reaction at 50 °C (depending on the Tm of the oligo) for 4 h. Streptavidin-coated magnetic beads (Invitrogen, Dynabead M-280) were used to capture the DNA fragments hybridized with the biotinylated AG and AC oligos according to the manufacturer’s instructions. After binding, the beads were washed first with 2XSSC and then with 0.5XSSC, both at room temperature, and finally with 0.5XSSC at 50 °C for 5 min. Single-stranded DNA was eluted from the biotinylated oligos twice with 60 µl of MQ water at 96 °C for 10 min and then amplified with the SSRLIBF3 primer. The PCR products were cloned into the T–A vector TOPO4 (Invitrogen, Carlsbad, CA, USA) and transformed into TOP10 cells (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s instructions. The recombinant colonies were picked and prepared as glycerol stocks. Colony PCR with M13 Forward and Reverse primers and the AG and AC microsatellite oligo primers was performed to identify colonies containing microsatellites (Bloor et al. 2001). The PCR products were analyzed by 2% agarose gel electrophoresis. PCR products with two or more bands indicated that the plasmid contained a microsatellite-containing insert.
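The triple digest described above can be illustrated in silico. The following is a minimal sketch, not part of the published protocol: it assumes the standard 4-bp recognition sites of the three enzymes (RsaI GT^AC, AluI AG^CT, HaeIII GG^CC, all leaving blunt ends) and uses a made-up test sequence. The function name `digest` is hypothetical.

```python
import re

# Recognition sites of the three blunt-cutting enzymes named in the text.
# Each enzyme cuts in the middle of its 4-bp site.
SITES = {"RsaI": "GTAC", "AluI": "AGCT", "HaeIII": "GGCC"}

def digest(seq):
    """Return the fragments produced by cutting seq at every site midpoint."""
    seq = seq.upper()
    cuts = set()
    for site in SITES.values():
        # The lookahead also finds overlapping occurrences of a site.
        for match in re.finditer("(?=" + site + ")", seq):
            cuts.add(match.start() + 2)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Made-up sequence: a (GA)6 repeat flanked by one site for each enzyme.
example = "TTGTAC" + "AAA" + "GA" * 6 + "AGCT" + "TT" + "GGCC" + "AA"
fragments = digest(example)
print(fragments)
```

In this toy case the repeat-containing fragment is the one that would be captured by a biotinylated probe after adaptor ligation; real genomic digests would of course yield a broad size distribution.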

Analysis of Microsatellite-Containing Sequences and Primer Design

Positive plasmids were amplified using the TempliPhi Amplification kit (GE Healthcare, Wauwatosa, WI, USA) and sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit on the Applied Biosystems Prism 3500 Genetic Analyzer System (Applied Biosystems, Foster City, CA, USA). The vector sequence was removed using the Vecscreen (https://www.ncbi.nlm.nih.gov/tools/vecscreen/) program. The CAP3 program (https://doua.prabi.fr/software/cap3) (Huang and Madan 1999) was used to remove redundancy, and the location of microsatellite repeats was determined with the SSRIT (https://archive.gramene.org/db/markers/ssrtool) program (Temnykh et al. 2001). Duplicated sequences were identified with the BioEdit (Hall 1999) program and removed. At least 5 primers were designed for dinucleotide repeats, at least 3 primers were designed for trinucleotide repeats, and more primers were designed for sequences containing a greater number of repeats, using the Primer3 (https://bioinfo.ut.ee/primer3-0.4.0/) (Koressaar and Remm 2007; Untergasser et al. 2012) and BatchPrimer3 (https://probes.pw.usda.gov/batchprimer3/) (You et al. 2008) programs.
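To make the repeat-location step concrete, the sketch below scans a sequence for perfect di- and trinucleotide repeats with a regular expression. It is an illustration only and not the SSRIT implementation; the function name `find_ssrs` and the minimum repeat numbers (five for dinucleotides, three for trinucleotides) are assumptions chosen for the example, and the test sequence is made up.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """List perfect repeats as (motif, repeat_count, start, end) tuples."""
    if min_repeats is None:
        # Assumed thresholds: unit length -> minimum number of repeats.
        min_repeats = {2: 5, 3: 3}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One repeat unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip mononucleotide runs such as AAAAAA
            repeats = len(match.group(1)) // unit_len
            hits.append((motif, repeats, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])

# Made-up clone insert with an (AG)8 and a (CTT)4 repeat.
insert = "CCTT" + "AG" * 8 + "TCGA" + "CTT" * 4 + "GGA"
ssrs = find_ssrs(insert)
print(ssrs)
```

Coordinates are 0-based with an exclusive end, so flanking regions for primer design can be sliced directly from the sequence. Imperfect and compound repeats would need additional logic that is not shown here.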

PCR Amplification of Microsatellites and Genetic Diversity Analysis

To validate the markers developed in this study, PCR amplifications (15 µl) were performed with 90 ng genomic DNA, 10 µM of each primer, 0.2 mM of each dNTP, 1X DreamTaq Green Buffer (includes MgCl2 at a concentration of 2 mM) (Thermo Scientific, Waltham, MA, USA), and 0.5U DreamTaq DNA Polymerase (Thermo Scientific, Waltham, MA, USA). All PCR reactions were performed in a Bio-Rad thermocycler. The amplification program consisted of an initial step of 3 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 50–66 °C, and 2 min at 72 °C, and a final extension at 72 °C for 10 min. The amplified products were analyzed by 3% agarose gel electrophoresis.

Genetic diversity analysis was performed in 24 lentil genotypes using M13-tailed primers according to the method described by Schuelke (2000). A tail (M13 universal sequence (− 21), TGTAAAACGACGGCCAGT) was added to the 5′ end of each forward primer. PCR amplifications were performed in a 15 µl reaction mixture containing 90 ng genomic DNA, 0.1 µM of each SSR primer, 0.1 µM labeled M13 (− 21) universal primer, 0.2 mM of each dNTP, 1X DreamTaq Green Buffer (includes MgCl2 at a concentration of 2 mM) (Thermo Scientific, Waltham, MA, USA), and 0.5U DreamTaq DNA Polymerase (Thermo Scientific, Waltham, MA, USA). The amplification program consisted of an initial step of 3 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 50–66 °C, and 2 min at 72 °C, then 8 cycles of 1 min at 94 °C, 1 min at 53 °C, and 2 min at 72 °C, and a final extension at 72 °C for 10 min. The M13 (− 21) primer was 5′-fluorescently tagged with HEX, 6-FAM, or ROX to facilitate multiplexing. A set of three PCR products (0.5 µl each) was mixed with 0.5 µl GeneScan-600 LIZ size standard (Applied Biosystems, Foster City, CA, USA) and 9.5 µl Hi-Di™ formamide (Applied Biosystems, Foster City, CA, USA), denatured at 95 °C for 5 min, chilled on ice, and electrophoresed on the Applied Biosystems Prism 3500 Genetic Analyzer System (Applied Biosystems, Foster City, CA, USA). GENEMAPPER software v5.0 (Applied Biosystems, Foster City, CA, USA) was used to determine fragment size.
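Because the M13 (− 21) tail becomes part of the amplicon, fragment sizes read from the genetic analyzer are 18 bp longer than the corresponding untailed product. The short sketch below shows the tailing and the size adjustment; the helper names and the example primer are hypothetical and are not taken from Table 1.

```python
# M13(-21) universal sequence as given in the text (18 nt).
M13_TAIL = "TGTAAAACGACGGCCAGT"

def add_m13_tail(forward_primer):
    """Prepend the M13(-21) tail to the 5' end of a forward primer."""
    return M13_TAIL + forward_primer.upper()

def untailed_size(observed_size):
    """Convert an observed fragment size (bp) to the size without the tail."""
    return observed_size - len(M13_TAIL)

# Hypothetical forward primer, for illustration only.
tailed = add_m13_tail("acgttgcaagctagct")
print(tailed)              # tail followed by the primer sequence
print(untailed_size(218))  # 200
```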

Data Analysis

Microsatellite diversity analyses were carried out at the locus level. For each locus, the expected heterozygosity (He), observed heterozygosity (Ho), and polymorphism information content (PIC) (Nei 1973) were calculated with PowerMarker V3.025 software (Liu and Muse 2005). The UPGMA (unweighted pair-group method using arithmetic average) was used to construct and draw a dendrogram from the genetic similarity matrix using the MEGA6 (Tamura et al. 2007) and PowerMarker software programs. Bootstrap analyses with 100 replicates were performed and a consensus tree was obtained to measure the confidence levels for the clusters.
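The per-locus statistics can be sketched from their usual definitions: He = 1 − Σ p_i², Ho as the proportion of heterozygous samples, and PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. The code below is a minimal illustration under those assumptions; it uses the uncorrected formulas, so values may differ slightly from PowerMarker output, and the function name and genotype data are made up.

```python
from collections import Counter
from itertools import combinations

def locus_stats(genotypes):
    """genotypes: list of (allele_a, allele_b) pairs, one per diploid sample."""
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    freqs = [count / len(alleles) for count in counts.values()]
    # Expected heterozygosity (gene diversity), without sample-size correction.
    he = 1.0 - sum(p * p for p in freqs)
    # Observed heterozygosity: share of samples carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # PIC subtracts the cross term over all unordered allele pairs from He.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"n_alleles": len(counts), "He": he, "Ho": ho, "PIC": pic}

# Made-up fragment sizes (bp) for four samples at one locus.
stats = locus_stats([(200, 200), (200, 204), (204, 204), (200, 208)])
print(stats)
```

For this toy locus the allele frequencies are 0.5, 0.375, and 0.125, which gives He = 0.59375, Ho = 0.5, and PIC ≈ 0.5112.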

Results

Isolation and Characterization of Microsatellites

A total of 350 clones from the Kafkas lentil cultivar libraries enriched with the AG and AC repeat motifs in the same reaction were examined. These clones were screened by colony PCR with the AG and AC repeat motifs, and 68 of them contained repeat regions. When these 68 clones were sequenced, 53 of them were found to be suitable for primer design. In the clones for which no primers were designed, the dinucleotide motifs had four or fewer repeats. The 53 sequences used for primer design contained a total of 134 SSR motifs (Table 1). Among the identified SSR motifs, GA/CT was the most frequent (62.6%). The other motifs were AG/TC, GT/CA, AC/TG, CTT/GAA, and AGA/TCT (23.8%, 6.8%, 4.5%, 1.6%, and 0.7%, respectively). The microsatellites were mostly dinucleotide repeats, with a small proportion of trinucleotide repeats, and were generally located within imperfect repeats. No tetranucleotide repeats suitable for primer design were encountered. Of the 53 SSR primer pairs developed, 71.6% (38 pairs) contained imperfect repeats, 26.4% (14 pairs) contained perfect dinucleotide repeats, and 1.8% (1 pair) contained compound repeats. The number of repeat motifs at the perfect AG/AC loci ranged from a minimum of 5 (Lc_MCu9, Lc_MCu20, Lc_MCu27) to a maximum of 24 (Lc_MCu33) (Table 1). The duplicated sequences (5 of them) were removed.

Table 1 Newly developed lentil SSR primers and information about these primers

Microsatellite Polymorphisms and Genetic Diversity Analyses

All of the developed SSR markers were initially PCR-tested and optimized in the lentil cultivars Fırat-87, Tigris, and Seyran-96. In these PCR reactions, the primers Lc_MCu8, Lc_MCu13a, Lc_MCu16b, Lc_MCu29a, Lc_MCu30, Lc_MCu41a, Lc_MCu43, and Lc_MCu44 yielded non-specific bands, and no amplification was achieved with the Lc_MCu37 primer. The remaining 44 SSR primers were analyzed on the Applied Biosystems Prism 3500 Genetic Analyzer System (Applied Biosystems, Foster City, CA, USA); 31 SSR markers (70.4%) were polymorphic and 13 SSR markers (29.6%) were monomorphic in the tested cultivars.

The 31 polymorphic SSR markers yielded 144 alleles in the 24 cultivars, and the number of alleles per locus varied between 2 and 15 with an average of 4.64 (Table 2). The Lc_MCu33 primer had the greatest number of alleles (15 alleles), followed by Lc_MCu19, Lc_MCu24, and Lc_MCu47 with 10, 9, and 9 alleles, respectively. Expected heterozygosity of the 31 polymorphic SSR markers varied between 0.218 (Lc_MCu31) and 0.903 (Lc_MCu33) with an average of 0.588, and observed heterozygosity varied between 0.000 (Lc_MCu50) and 1.000 (Lc_MCu3, Lc_MCu4, Lc_MCu7, Lc_MCu28, Lc_MCu42) with an average of 0.506. Polymorphism information content (PIC) values varied between 0.194 (Lc_MCu31) and 0.895 (Lc_MCu33) with an average of 0.520.

Table 2 Genetic parameters for SSR primers, number of alleles (n), expected heterozygosity (He) and observed heterozygosity (Ho), PIC (polymorphism information content)

The dendrogram constructed with the 31 polymorphic SSR markers in the 24 cultivars had two main groups (Fig. 1). The first group was composed of the cultivars Altıntoprak, Emre-20, Tigris, Fırat-87, Cagil, and Seyran-96, and the remaining cultivars constituted the second group, which was divided into sub-groups. The greatest genetic similarity (91%) was observed between the cultivars Emre-20 and Tigris. Other cultivar pairs with high genetic similarity were Seyran-96 and Cagil (89%) and Bozok and Karagul (88%).

Fig. 1 The UPGMA-based genetic relationship dendrogram for registered Turkish lentil cultivars constructed with the SSR primers developed in this study
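The clustering behind such a dendrogram can be sketched as follows. The text does not state which similarity coefficient was used, so this example assumes a simple proportion-of-shared-alleles similarity, converts it to a distance (1 − similarity), and applies a plain UPGMA in which the distance between two clusters is the mean of all pairwise distances between their members. Cultivar labels, genotypes, and function names are made up, and bootstrapping is not shown.

```python
def shared_allele_similarity(g1, g2):
    """Mean over loci of the fraction of alleles two diploid genotypes share."""
    total = 0.0
    for (a1, a2), (b1, b2) in zip(g1, g2):
        pool = [b1, b2]
        shared = 0
        for allele in (a1, a2):
            if allele in pool:
                pool.remove(allele)  # each allele copy can be matched once
                shared += 1
        total += shared / 2
    return total / len(g1)

def upgma(names, dist):
    """dist[i][j] is the distance between names[i] and names[j].

    Returns the tree as nested tuples of names.
    """
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def average(c1, c2):
        pairs = [(i, j) for i in c1[1] for j in c2[1]]
        return sum(dist[i][j] for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the closest pair of clusters; ties are broken by position.
        _, x, y = min(
            (average(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]

# Made-up genotypes at two loci for three hypothetical cultivars.
genos = {
    "cv_A": [(200, 200), (150, 150)],
    "cv_B": [(200, 200), (150, 154)],
    "cv_C": [(204, 204), (154, 154)],
}
names = list(genos)
dist = [[1 - shared_allele_similarity(genos[p], genos[q]) for q in names]
        for p in names]
tree = upgma(names, dist)
print(tree)
```

Here cv_A and cv_B share 75% of their alleles and are joined first, after which cv_C joins the pair at the average of its distances to both.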

Discussion

In the present study, about 350 clones from genomic libraries of the lentil cultivar Kafkas enriched with AG and AC repeats were examined, and 68 (19.4%) of these PCR-screened clones contained SSR motifs. These 68 clones were sequenced, and 53 (79.1%) of the sequences, containing 134 SSR motifs, were suitable for primer design. Of the developed SSR markers, 31 (58.4%) were polymorphic. The percentage of SSR motif-containing clones in the present study was 106 times greater than the value reported by Hamwieh et al. (2009), but 3.8 and 4.3 times lower than the values reported by Verma et al. (2014) and Andeden et al. (2015), respectively, who also used enrichment methods. The percentage of polymorphic markers was lower than the 84% (122 pairs) reported by Verma et al. (2014), but greater than the 32% (56 pairs) reported by Hamwieh et al. (2009) and the 23.5% (71 pairs) reported by Andeden et al. (2015). Such differences can be attributed to the methodological approach used to create the libraries, the choice of restriction enzymes, the types of SSR motifs selected for enrichment, or the rarity of the selected SSR motifs in the relevant plant genome (Cuc et al. 2008).

The genetic relationship dendrogram constructed with the 24 registered lentil cultivars used to determine the polymorphism ratios of the primers showed that the developed primers were able to separate all lentil cultivars efficiently. Andeden et al. (2015) also used 8 of the present cultivars to test their primers and reported a similar distribution of these 8 cultivars within the genetic relationship dendrogram.

More dinucleotide than trinucleotide repeats have been reported in the lentil genome (Hamwieh et al. 2009). Similarly, in the present study, almost all of the SSR primers were composed of dinucleotide repeats, and the trinucleotide repeats were mostly located in the imperfect primer group. This could be related to the method used: with the method developed by Techen et al. (2010), mostly dinucleotide repeats are isolated.

Hamwieh et al. (2005) tested SSR primers in Lens culinaris sub-species (L. culinaris subsp. culinaris, L. culinaris subsp. orientalis, L. culinaris subsp. tomentosus, L. culinaris subsp. odemensis) and reported a total of 182 alleles with 13 alleles per locus. The total number of alleles for L. culinaris subsp. culinaris was 128, and the number of alleles per locus varied between 2 and 16 with an average of 9.14. Verma et al. (2014) tested 33 primer pairs in 46 genotypes (Lens culinaris sub-species and 8 different legumes) and reported a total of 123 alleles, with 2 to 5 alleles per locus and an average of 3.73. PIC values ranged from 0.13 to 0.99 with an average of 0.66. Andeden et al. (2015) tested 78 polymorphic markers in 15 genotypes and reported a total of 400 alleles, with 2 to 11 alleles per locus and an average of 5.1. PIC values ranged from 0.07 to 0.89 with an average of 0.58. The average number of alleles per locus in the present study was greater than that of Verma et al. (2014), similar to that of Andeden et al. (2015), and lower than that of Hamwieh et al. (2005). Such differences mostly result from differences in the number and diversity of the genotypes tested. The PIC values of the previous studies and the present study were close to each other.

Up to now, 244 SSR markers have been developed for lentil using genomic libraries (Hamwieh et al. 2005, 2009; Verma et al. 2014; Andeden et al. 2015). With this study, 31 new polymorphic SSR markers were developed, raising the number of available SSR markers to 275. These newly developed SSR markers will be useful tools for molecular breeding, mapping, and assessment of genetic diversity and population structure in lentil.