Introduction

Germplasm fingerprinting provides knowledge of diversity and a basis for the efficient protection and conservation of genetic resources (Jarvis et al. 2000). Simple sequence repeat (SSR) markers are highly polymorphic, reproducible and distributed throughout the genome (Joshi and Behera 2006). They vary widely in the number of repeats of short nucleotide motifs and are co-dominantly inherited (Johansson et al. 1992). They are also flexible, suiting uses from simple gel analysis to large automated detection systems. SSRs are therefore used in genetic analyses of self-pollinating crops such as legumes (Jin et al. 2008; Udupa et al. 1999).

Lentil (Lens culinaris Medik.) is a self-pollinating, diploid (2n = 14), annual cool-season legume crop that is produced throughout the world and highly valued as a source of food protein. Different types of molecular markers have been used to determine the allelic diversity of lentil collections (Datta et al. 2007). The pioneering works of Abo-elwafa et al. (1995) and Ford et al. (1997) gave important information on germplasm diversity, but also showed that non-specific PCR-based markers could not provide repeatable results in differentiating lentil genotypes. Hamwieh et al. (2005) developed ca. 80 sequence-specific SSR markers in lentil, ca. 30 of which have already been placed on the genetic map. These mapped SSRs provide a standard tool for fingerprinting lentil germplasm and markers for genetic analysis and breeding.

The objective of the present study was to estimate the genetic diversity within a Central Asian and Caucasian (CAC) lentil germplasm collection that had not previously been evaluated with an appropriate DNA fingerprinting system. The relationships among the germplasm from this area are discussed using the similarity data from the marker analysis.

Materials and methods

Plant material

A total of 39 lentil accessions of CAC origin, obtained from the genebank of the International Center for Agricultural Research in the Dry Areas (ICARDA), were analyzed. Two of these were improved cultivars and 35 were landraces, whereas the category of two accessions was unknown. According to the classification of Barulina (1930), the seed type of 34 accessions was microsperma (L. culinaris Medik. ssp. microsperma (Baumg.) Barulina; smaller seed) and that of five was macrosperma (L. culinaris Medik. ssp. macrosperma (Baumg.) Barulina; larger seed). The list of lentil accessions and their passport information is presented in Table 1.

Table 1 Accession number, country of origin, collection site, category and seed types of lentil accessions used in this study

Marker analysis

A 500 mg leaf sample from each accession was ground in liquid nitrogen, and DNA was extracted by the CTAB method (Torres et al. 1993) and prepared for PCR amplification. Five SSR markers designed by Hamwieh et al. (2005) were used. Of these, SSR 167, SSR 33, SSR 323 and SSR 156 map to linkage group (LG)_1, LG_3, LG_5 and LG_7, respectively. The primer sequences were: SSR 33 (F: CAAGCATGACGCCTATGAAG and R: CTTTCACTCACTCAACTCTC); SSR 156 (F: GTACATTGAACAGCATCATC and R: CAAATGGGCATGAAAGGAG); SSR 167 (F: CACATATGAAGATTGGTCAC and R: CATTTATGTCTCACACACAC); SSR 199 (F: GTGTGCATGGTGTGTG and R: CCATCCCCCTCTATC) and SSR 323 (F: AGTGACAACAAAATGTGAGT and R: GTACCTAGTTTCATCATTG). PCR amplifications were performed in a 2720 thermal cycler (Applied Biosystems) with 3 min at 94°C, followed by 35 cycles of 15 s at 94°C, 15 s at the annealing temperature specific to the primer pair and 30 s at 72°C. PCR products were separated on 8% polyacrylamide gels and their fragment sizes were compared with 50-bp and 100-bp markers.

Data analysis

Fragments amplified by each primer set were scored for presence or absence, coded as 1 or 0, respectively. The binary matrix was then transformed into a genetic similarity (GS) matrix using the Dice similarity coefficient (Dice 1945; Nei and Li 1979). GS between pairs of accessions was measured as:

$$ {\text{GS}} = \frac{2a}{{\left( {2a + b + c} \right)}} $$

where a is the number of fragments shared by lines A and B, b is the number of fragments present only in line A, and c is the number of fragments present only in line B, so that the denominator equals the total number of fragments scored in the two lines. Cluster analysis was performed by the unweighted pair group method with arithmetic mean (UPGMA) using the SAHN module of the software package NTSYS-pc (Rohlf 2000). Gene diversity (heterozygosity) was calculated according to Weir (1990):
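As an illustration of the GS formula, the following minimal Python sketch computes Dice similarities from 0/1 band scores. It assumes that b and c count bands found in only one of the two lines, which is the reading under which the denominator 2a + b + c is the total band count of the pair. The accession names and band profiles are invented for the example and are not data from this study; the analysis itself was run in NTSYS-pc.

```python
def dice_similarity(profile_a, profile_b):
    """Dice (Nei and Li) similarity between two 0/1 band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    paired = list(zip(profile_a, profile_b))
    a = sum(1 for x, y in paired if x == 1 and y == 1)  # bands in both lines
    b = sum(1 for x, y in paired if x == 1 and y == 0)  # bands only in line A
    c = sum(1 for x, y in paired if x == 0 and y == 1)  # bands only in line B
    if 2 * a + b + c == 0:
        return 0.0  # neither line has a band; 0.0 is a convention assumed here
    return 2 * a / (2 * a + b + c)


def similarity_matrix(profiles):
    """Pairwise GS for a dict mapping accession name -> band profile."""
    names = list(profiles)
    return {
        (p, q): dice_similarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical band scores (1 = present, 0 = absent) for three accessions.
profiles = {
    "acc1": [1, 0, 1, 0, 1, 0],
    "acc2": [1, 0, 0, 1, 1, 0],
    "acc3": [0, 1, 0, 1, 0, 1],
}
gs = similarity_matrix(profiles)
print(round(gs[("acc1", "acc2")], 3))
```

For acc1 and acc2 the sketch finds a = 2, b = 1 and c = 1, giving GS = 4/6.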

$$ {\text{Gene diversity}} = 1 - \sum\limits_{j = 1}^{n} {p_{ij}^{2} } $$

where p_ij is the frequency of the jth pattern for SSR marker i, and the sum runs over the n patterns observed for that marker.
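The gene diversity calculation for a single marker can be sketched in a few lines of Python. The function name and the allele sizes below are hypothetical, chosen only to show the arithmetic; they are not results from this study.

```python
from collections import Counter


def gene_diversity(patterns):
    """Return 1 minus the sum of squared pattern frequencies at one locus."""
    if not patterns:
        raise ValueError("need at least one scored accession")
    counts = Counter(patterns)  # pattern -> number of accessions carrying it
    total = len(patterns)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical patterns (fragment sizes in bp) scored in ten accessions.
locus = [180, 180, 180, 184, 184, 190, 190, 190, 190, 196]
gd = gene_diversity(locus)
print(round(gd, 2))
```

Here the four patterns have frequencies 0.3, 0.2, 0.4 and 0.1, so the sum of squares is 0.30 and the gene diversity is 0.70. A locus with a single pattern gives 0.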

Results and discussion

Among the 39 accessions from seven countries representing CAC, a total of 33 bands were identified with the five SSR markers. The number of alleles per primer set ranged from 3 (SSR 199) to 8 (SSR 156 and SSR 323). These numbers were high compared with those (3–5 in 10 accessions) reported by Inder et al. (2008), indicating that the present lentil germplasm from the CAC region had broader diversity.

The observed allelic frequencies ranged from 0.01 to 0.87, with an average of 0.147. Seventeen alleles (52%) appeared at frequencies of 0.10 or lower, while only one allele was present in most of the lentil accessions, with a frequency higher than 0.8. Gene diversity (GD; Weir 1990) per locus ranged from 0.24 (SSR 199) to 0.89 (SSR 33), with an average of 0.66. Analysis of genetic diversity within geographical regions found a higher diversity index in accessions from Azerbaijan (0.60) than in those from Tajikistan (0.56), Uzbekistan (0.52) and Armenia (0.51).

Pairwise GS among the 39 accessions ranged from 0.30 to 1.0. To visualize the genetic relationships among the 39 lentil accessions, a dendrogram was constructed based on the UPGMA cluster analysis (Fig. 1). Six major groups were generated at a similarity coefficient of 0.5. One landrace from Azerbaijan (R30: cluster IV) and two landraces from Tajikistan (R15: cluster V and R17: cluster VI) were clearly differentiated from the other accessions. Accession-specific alleles of R30 and of R15 at SSR 156 support their unique genotypes.
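How groups arise from cutting a UPGMA dendrogram at a similarity threshold can be illustrated with a small pure-Python sketch. It is not the NTSYS-pc SAHN procedure used here, only an average-linkage agglomeration written for illustration; the function name, the four accession labels and their pairwise similarities are all hypothetical.

```python
from itertools import combinations


def upgma_groups(names, sim, threshold):
    """Merge clusters by average linkage until no pair reaches the threshold.

    sim maps frozenset({x, y}) -> similarity for every pair of names.
    """
    clusters = [(name,) for name in names]

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean over all between-cluster accession pairs.
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of current clusters.
        s, i, j = max(
            (avg_sim(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if s < threshold:
            break  # remaining clusters join only below the cut level
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical pairwise similarities among four accessions.
pairs = {
    ("A", "B"): 0.9, ("A", "C"): 0.4, ("A", "D"): 0.3,
    ("B", "C"): 0.5, ("B", "D"): 0.2, ("C", "D"): 0.8,
}
sim = {frozenset(k): v for k, v in pairs.items()}
groups = upgma_groups(["A", "B", "C", "D"], sim, threshold=0.5)
print(groups)
```

In this toy example A and B join at 0.9 and C and D at 0.8, while the two pairs join each other only at an average similarity of 0.35, so a cut at 0.5 leaves two groups.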

Fig. 1 A dendrogram showing the genetic similarity of 39 lentil genotypes of Central Asian and Caucasian origin, based on the unweighted pair group method with arithmetic mean (UPGMA) using the NTSYS software package

The dendrogram analysis (Fig. 1) revealed that no cluster (except sub-cluster IB) could be ascribed exclusively to any group defined a priori on the basis of geographical origin. However, several tendencies were apparent.

Two main sub-groups (A and B) were formed in cluster I. Sub-cluster IA contained five accessions of very diverse origins and without a clear tendency: two Tajik accessions, one Uzbek accession, one Azeri accession and the only accession studied from Kazakhstan. The diversity within this sub-cluster is possibly due to germplasm exchange among breeding centers, which contributes to the clustering of accessions from different geographic origins that share common parents. It is also interesting to note that the two Tajik accessions were the closest pair inside sub-cluster IA, with GS = 0.95.

The larger sub-cluster (IB) constituted a very homogeneous group of 15 accessions: 14 from Tajikistan and one from Azerbaijan. Of these, one pair (R8 and R18) and one set of four accessions (R20, R24, R35 and R36) each showed identical genotypes at the present SSR markers. Some of the accessions in sub-cluster IB were collected in 1956 from the same region, Kulyab (E 69°46′, N 37°54′), which is consistent with their similar genetic backgrounds in the dendrogram. Because information on the collection sites of the other accessions from Tajikistan is lacking, no further conclusions could be reached regarding the groupings in the dendrogram; nevertheless, their grouping might indicate that they share a common origin. The accession from Azerbaijan (R29, macrosperma seed type) had the same SSR 323 pattern as its nearest neighbor in sub-cluster IB (R23), which led to its grouping with the Tajik accessions. Such unclear relationships could be due to the small number of loci used.

Cluster II, divided into two sub-groups (IIA and IIB), consisted of accessions representing almost all countries of origin in this study. Genotypes in sub-cluster IIA showed a higher degree of similarity (0.66–0.95). All genotypes from Uzbekistan except R5, and all genotypes from Armenia except R4, were in this sub-cluster. Both of these Uzbek accessions came from Bukhara (E 64°25′, N 39°46′), whereas the collection site of R5 was unknown. Sub-cluster IIB, on the other hand, was formed by only two accessions, from Turkmenistan (R13) and Armenia (R4). Another accession from Turkmenistan (R3) formed an isolated group in cluster III, indicating that accessions from Turkmenistan might have different pedigrees from germplasm in other countries. The rest of cluster III contained accessions from Azerbaijan (R6) and Tajikistan (R9). Considering the exotic germplasm in clusters IV, V and VI, there is some unique germplasm in Azerbaijan and Tajikistan, even though many of the accessions from Tajikistan showed high similarities.

In summary, three important results of the present study may be underlined. First, a strong association between the genetic relationships among accessions and their geographical regions of origin was found for accessions from Tajikistan, Uzbekistan and Armenia. Sub-cluster IB was composed of accessions from Tajikistan, with one exception. Two out of three landraces from Uzbekistan fell into sub-cluster IIA, whereas all Armenian genotypes were grouped into cluster II: two into IIA and one into IIB.

Second, the dendrogram revealed isolated, genetically distinct accessions from two countries: the Azeri accession in cluster IV and the Tajik accessions in clusters V and VI.

Third, the germplasm from Azerbaijan proved to be a group of high diversity, as revealed by the low mean pairwise genetic similarities. Accessions from this country were scattered over four clusters. The high variability among Azeri accessions is possibly the result of the different morphological groups represented (two macrosperma and three microsperma) as well as the diverse provinces covered (between 38°45′ and 40°38′ N, and between 47°02′ and 48°49′ E) (see Table 1).

Differentiation of the accessions from Kazakhstan and Georgia could not be studied because of the small sample sizes from these countries. Other authors (Lelley et al. 2000) have encountered similar problems in the study of genetic distance.

Our results confirmed that microsatellites can be powerful tools for studying between-cultivar diversity in lentil. The resolution of fingerprinting with the present five SSRs was rather high for the 39 lentil accessions used in the study. Except for some closely related accessions from the same collection place in Tajikistan, most of the germplasm was distinguished with only five PCR reactions. One reason might be the careful choice of SSR markers from different linkage groups with high numbers of alleles. More attention should therefore be given to establishing a genotype database using convenient and effective indicators such as SSRs, both to monitor dynamic changes in the gene pools and to serve as a reference for germplasm management and breeding strategies. This small number of PCR reactions could be multiplexed using fluorescent detection for automated fingerprinting.

Since the information available on lentil germplasm in CAC countries is insufficient, further comparison of genomic constitution as well as trait evaluation will be necessary in the future. Information on current levels of genetic diversity of genebank germplasm is essential for designing appropriate strategies for future conservation. In particular, DNA fingerprinting information can guide genebank conservation within a country, e.g. Azerbaijan, and the introduction of germplasm from outside the country into breeding programs. According to the results of the present study, the high genetic differentiation among accessions from Azerbaijan suggests that this genepool should be represented ex situ by more samples/accessions.