Introduction

Enset (Ensete ventricosum (Welw.) Cheesman) is a perennial herbaceous plant that closely resembles banana but is distinguished by its giant size, its dilated bases, and its edible parts, which are the starchy corm and pseudo-stem (Brandt et al. 1997). It is a diploid species with a chromosome number of 2n = 18 and belongs to the order Zingiberales, family Musaceae and genus Ensete (Tomlinson 1969). About 20 species are recognized within the genus Ensete, but Ensete ventricosum is the only economically important species (Baker and Simmonds 1953; Taye 1993). Although it produces seeds, enset is propagated entirely vegetatively through the production of suckers (Yemataw et al. 2018).

Enset is a food security crop of which every part is usable (Bizuayehu 2002; Brandt et al. 1997). It is drought tolerant, can be harvested throughout the year, and can be stored for long periods without spoilage (Brandt et al. 1997). More than 20% (nearly 18 million people) of the Ethiopian population depend on enset for food, fiber, animal forage, construction materials and medicines (Brandt et al. 1997; Zippel 2002; Central Statistical Authority 2010). According to the 2010 report of the Central Statistical Authority (CSA) of Ethiopia, a total of 395,632 hectares of land was covered by the crop, yielding 8,015,531 quintals.

However, enset production is declining in both productivity and area coverage and is facing genetic erosion because of severe drought, diseases and population pressure (Tsegaye and Struik 2002; Brandt et al. 1997; Tenaye and Geta 2009; Almaz et al. 2002). Bacterial wilt disease caused by Xanthomonas campestris pv. musacearum is the most serious threat to the enset production system (Bobosha 2003). Root lesion nematode (Pratylenchus goodeyi) and root knot nematodes (Meloidogyne sp.) also commonly attack the crop (Peregrine and Bridge 1992). Because it has received little research attention, the crop is little studied, no effective technology has been delivered to the farming community, and the mode of production has therefore remained conventional (Bobosha 2003).

A genetic diversity study of the available enset germplasm would help to improve the crop in subsequent breeding programs, by exploiting the genetic variation in the landraces, and would support effective conservation of genetic resources. Most enset genetic diversity studies so far have focused on morphological markers (Bizuayehu 2002; Olango et al. 2014; Fetta 2007; Worku 1996). There are also a few molecular studies that assessed the genetic diversity of enset landraces in Ethiopia using AFLP (Negash et al. 2002), RAPD (Birmeta et al. 2002), ISSR (Tobiaw and Bekele 2011) and SSR (Olango et al. 2015; Getachew et al. 2014) markers. However, these molecular diversity studies did not include the enset landraces found in Gurage zone. Gurage, with its diverse agro-ecology and agricultural production systems, is among the high-potential enset-producing areas in Ethiopia and is rich in landrace diversity (Fetta 2007; Worku 1996). Therefore, this study was carried out to describe the genetic diversity of enset landraces of Gurage zone using simple sequence repeat (SSR) markers, in order to support the search for bacterial wilt and nematode resistance and other enset improvement research.

Materials and methods

Plant materials and total genomic DNA extraction

A total of 83 leaf samples (79 cultivated and 4 wild enset germplasms) were collected from the Wolkite University (WkU) and Areka Agricultural Research Center (AARC) enset germplasm conservation sites (Table 1). The cultivated germplasms comprised landraces originally collected from Gurage (exhaustively gathered) and the Yem area (included for comparison), and released varieties from AARC (also included for comparison). The wild enset individuals were originally collected from forests of the Dawro and Gurage zones and are not edible (Table 1). The leaf tissues were kept in 38% NaCl-CTAB solution, prepared by dissolving 3.0 g CTAB and 35 g NaCl in 100 ml of distilled water (Storchova et al. 2000). After the NaCl-CTAB solution was thoroughly washed off with double-distilled water at the BecA-ILRI hub laboratory, total genomic DNA was isolated using the Zymo Quick-DNA™ Plant/Seed Miniprep Kit (The Epigenetics COMPANY™, USA). The quality and concentration of the extracted DNA were estimated using a NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA), and the DNA was visualized and photographed under a gel documentation system (Azure Biosystems c280, CA, USA) after electrophoresis at 5 V/cm in a 0.8% agarose gel. DNA was stored at − 20 °C until further use.

Table 1 Enset plant materials used for genetic diversity analysis

SSR analysis

A total of nineteen SSR markers were used in the study (Table 2): thirteen developed from enset (Olango et al. 2015) and six cross-transferred from banana (Lagoda et al. 1998) that had previously been used for an enset genetic diversity study (Getachew et al. 2014). The SSR markers were tested for their level of polymorphism. The 5′ end of the forward primer of each SSR locus was labeled with a fluorescent dye (6-FAM = blue, PET = red, VIC = green, or NED = yellow).

Table 2 SSR markers used in the study with their repeat motif and size

SSR genotyping

Each SSR locus was first amplified in a standard PCR (Eppendorf Mastercycler nexus GSX1) to determine the appropriate annealing temperature. A four-primer multiplex PCR was then designed based on the expected amplicon sizes and dye colours of the markers. Each PCR had a total volume of 15 μl containing 3 μl of genomic DNA (15 ng/μl), an average of 0.5 μl of each primer (2 pmol/μl) (Sigma-Aldrich), 7.5 μM AccuPower PCR PreMix without dye (Bioneer, Daejeon, Republic of Korea) and an additional 0.3 mM MgCl2 (Promega, Venlo, The Netherlands). The PCR amplification program consisted of an initial denaturing step of 94 °C for 3 min; followed by 30 cycles of 94 °C for 30 s, annealing at 57 °C for 1 min and 72 °C for 2 min; and a final extension step of 72 °C for 20 min, using a GeneAmp® PCR System 9700 thermocycler (Applied Biosystems, Foster City, CA). The PCR products were separated at 7.1 V/cm in a 1.5% agarose gel (in 0.5% TBE stained with 2.5 µl GelRed/100 ml of buffer), and the gel was then visualized and photographed under a gel documentation system (Azure Biosystems, c280). Samples were prepared for capillary electrophoresis in a LIZ-HiDi cocktail solution consisting of 1 ml of Hi-Di™ Formamide (Applied Biosystems®, Thermo Fisher Scientific, Waltham, MA, USA) and 15 μl of GeneScan 500 LIZ® Size standard (Applied Biosystems®, Thermo Fisher Scientific, Waltham, MA, USA). Multiplex PCR products with a volume of 1.5 μl were diluted in 9 μl of LIZ-HiDi solution. The mixed products were denatured at 95 °C for 2 min in a GeneAmp® PCR System 9700 (Applied Biosystems) thermocycler, snap-chilled on ice for 5 min and sent to the Segolip laboratory of ILRI-BecA/Kenya for capillary electrophoresis. The generated data were then analyzed manually using GeneMapper® Software version 4.1 (Applied Biosystems®, Thermo Fisher Scientific, Waltham, MA, USA), and allele sizes were scored in base pairs (bp) based on the relative migration of the internal size standard.
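As a quick check of the protocol arithmetic, the short sketch below adds up the programmed hold times of the cycling program and the template DNA mass per reaction, using only the values stated in the text. The function and variable names are illustrative, and ramping time between temperature steps is ignored, so the real run would take somewhat longer.

```python
# Cycling conditions as stated in the text: (temperature in °C, hold time in seconds).
INITIAL_DENATURATION = (94, 3 * 60)
CYCLE_STEPS = [(94, 30), (57, 60), (72, 120)]  # denaturation, annealing, extension
N_CYCLES = 30
FINAL_EXTENSION = (72, 20 * 60)


def program_hold_time_min(initial, steps, n_cycles, final):
    """Sum of programmed hold times in minutes (ramping between steps is ignored)."""
    per_cycle_seconds = sum(seconds for _, seconds in steps)
    total_seconds = initial[1] + n_cycles * per_cycle_seconds + final[1]
    return total_seconds / 60


def template_mass_ng(volume_ul, conc_ng_per_ul):
    """Mass of template DNA added to one reaction, in nanograms."""
    return volume_ul * conc_ng_per_ul


hold_minutes = program_hold_time_min(
    INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION
)
template_ng = template_mass_ng(3, 15)  # 3 μl at 15 ng/μl

print(f"Programmed hold time: {hold_minutes:.0f} min")
print(f"Template DNA per reaction: {template_ng} ng")
```

Under these stated conditions the holds sum to 128 min, and each 15 μl reaction receives 45 ng of template DNA.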

SSR polymorphism and genetic diversity analysis

Polymorphic SSR markers were used to analyze the genetic diversity of the 83 studied enset germplasms. Basic statistics, namely the observed number of alleles (Na), number of effective alleles (Ne), number of private alleles (Npa), unbiased expected heterozygosity (uHe), observed heterozygosity (Ho), expected heterozygosity (He) and polymorphic information content (PIC), were computed using GenAlEx 6.503 (Peakall and Smouse 2012). GenAlEx 6.503 was also used to estimate genetic differentiation by Analysis of Molecular Variance (AMOVA), which partitions the total genetic variation into within- and among-population components. Gene flow (Nm), the average number of individuals per generation migrating between populations, was calculated in GenAlEx as Nm = [(1/FST) − 1]/4, where FST is the standardized genetic variance among (sub)populations in the total sample (Wright 1951); gene flow was taken as the major determinant of population structure when Nm > 0.5. Because the sample sizes of the studied populations differed, rarefied allelic richness (Ra) and rarefied private allelic richness (Rpa) per population were estimated using the rarefaction procedure implemented in the program HP-Rare 1.1 (Kalinowski 2005) to describe allelic diversity clearly. To investigate population differentiation, the fixation index (FST) of the total populations and pairwise FST among all pairs of populations were computed, and significance was tested based on 1000 bootstraps. To estimate the discriminatory power of the SSR markers, the PIC for each marker was computed as PIC = 1 − ΣPi², where Pi is the frequency of the ith allele at a given microsatellite locus across the genotypes and the sum runs over all alleles at that locus (Nei 1987). Discriminant Analysis of Principal Components (DAPC) was implemented in the adegenet package version 1.4.1 in R (Jombart 2008). Clustering was performed using the neighbor-joining method with arithmetic average (Nei 1972) in DARwin V6.0 (Perrier et al. 2003) to generate a phylogenetic tree, with branch robustness tested using 1000 bootstraps. The pattern of population structure and the presence of admixture were inferred using the Bayesian model-based clustering algorithm implemented in STRUCTURE version 2.3.4 (Pritchard et al. 2000), together with Structure Harvester (Earl and vonHoldt 2012), the CLUster Matching and Permutation Program (CLUMPP) version 1.1.2 (Jakobsson and Rosenberg 2007) and the online application Structure Plot v2 (Ramasamy et al. 2014).
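The PIC and gene flow formulas stated in this section can be illustrated with a minimal Python sketch. It is not the GenAlEx implementation; the function names, the allele sizes and the FST value are hypothetical and chosen only to show the arithmetic.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} from a flat list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(freqs):
    """PIC as defined in the text: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def gene_flow(fst):
    """Nm = [(1 / FST) - 1] / 4, as given in the text."""
    if not 0 < fst <= 1:
        raise ValueError("FST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Hypothetical allele sizes (bp) scored at one SSR locus; not data from the study.
calls = [150, 150, 152, 152, 152, 154, 156, 156]
freqs = allele_frequencies(calls)
locus_pic = pic(freqs)

# Hypothetical FST value, used only to illustrate the formula.
nm_example = gene_flow(0.05)

print(f"PIC = {locus_pic:.3f}")
print(f"Nm  = {nm_example:.2f}")
```

With four alleles at frequencies 0.25, 0.375, 0.125 and 0.25, the PIC is about 0.72, which would fall in the highly informative range under the ≥ 0.5 criterion. The formula also shows why a small FST implies a large Nm: the two are inversely related.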

Results

Genetic diversity

Of the 19 SSR markers tested, seven were found to be monomorphic: two from enset (Evg-03 and Evg-05) and five from banana (pMaCIR 431, mMa-3-90, mMaCIR 24, and pMaCIR 232). Hence, the remaining twelve (Table 3) were selected to evaluate the allelic diversity of the 83 enset germplasms. The primers used yielded an average polymorphic information content (PIC) value of 0.69, ranging from 0.62 to 0.77 (Table 3). The twelve polymorphic SSR markers detected a total of 77 alleles (58 being different) in the studied 83 enset germplasms. The mean allelic richness per locus was 6.4, ranging from 3 (Evg-11) to 12 (Evg-08) alleles (Table 4).

Table 3 The 12 SSR markers used in the study with number of alleles, size range and PIC values
Table 4 Diversity parameters estimated for enset populations using 12 SSR markers

The genetic diversity indices for the five studied enset populations are summarized in Table 4. The West Gurage (WG) population had the highest number of different alleles (Na), number of effective alleles (Ne) and number of private alleles (Npa). Similarly, expected heterozygosity was highest in the WG and Wild (W) populations. Shannon’s Index (I) was relatively high in both the East and West Gurage populations (I = 1.17) and lowest in the Yem population (I = 0.93). The percentage of polymorphic loci (PL) ranged from 91.7% (RV, W and Y) to 100% (EG and WG), with a mean value of 95%. The highest fixation index (F) value, 0.17, was recorded for the Wild population, while negative values (− 0.02 to − 0.14) were found for the cultivated populations. However, when the populations under investigation differ in sample size, rarefied allelic richness and rarefied private allelic richness describe and compare their genetic diversity more reliably. On these measures, the Wild enset accessions exhibited higher rarefied allelic richness (3.44) and rarefied private allelic richness (0.74) than all the other studied enset populations.
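To make the indices in this paragraph concrete, the sketch below computes expected heterozygosity, Shannon's index and the fixation index for a single locus. It assumes the usual definitions He = 1 − Σp², I = −Σ p ln p and F = 1 − Ho/He; whether GenAlEx applies exactly these forms before averaging over loci is an assumption here, and all input values are hypothetical.

```python
import math


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """I = -sum(p_i * ln p_i), using the natural logarithm."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def fixation_index(ho, he):
    """F = 1 - Ho/He; positive when heterozygotes are fewer than expected."""
    return 1.0 - ho / he


# Hypothetical allele frequencies at one locus; not data from the study.
freqs = [0.5, 0.25, 0.25]
he = expected_heterozygosity(freqs)
shannon = shannon_index(freqs)

# Hypothetical observed heterozygosities below and above He.
f_deficit = fixation_index(0.50, he)  # Ho < He gives a positive F
f_excess = fixation_index(0.75, he)   # Ho > He gives a negative F

print(f"He = {he:.3f}, I = {shannon:.3f}")
print(f"F (Ho = 0.50) = {f_deficit:.2f}, F (Ho = 0.75) = {f_excess:.2f}")
```

The sign of F follows directly from the comparison of Ho with He: a positive value, as reported for the Wild population, corresponds to a heterozygote deficit, and negative values, as reported for the cultivated populations, to a heterozygote excess.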

Population genetic structure and gene flow

Analysis of molecular variance (AMOVA) showed that 93% of the total variation was due to variation within individual enset landraces. Only 4% of the total molecular variance was ascribed to differences among the five population groups and 3% to genetic variation among accessions within each population group (Table 5). The genetic differentiation among populations (FST = 0.037, P < 0.001) was highly significant. The average estimate of gene flow among populations (Nm) was 3.35, indicating high gene flow between populations.

Table 5 AMOVA among and within populations of enset populations based on the 12 SSR markers

The average genetic distances and similarities between the five studied enset populations are presented in Table 6. The greatest genetic distances were observed between the East Gurage and Wild populations (0.191) and between the West Gurage and Wild populations (0.153). The least genetic distance (0.003) was detected between the West and East Gurage populations. The mean genetic distance among the studied germplasms was 0.17, ranging between 0.00 and 0.33. The highest genetic distances, 0.33 and 0.32, were observed between Erpha 88 (wild) and Anichara (cultivated) and between Furinzir (wild) and Yezire-badedet (cultivated), respectively. The least genetic distance (0.00) was observed between different cultivated enset genotypes. Cultivated enset landraces that did not differ in their SSR profiles for the tested primers (genetic distance of 0.00) were: (1) Ashakit and Shertiye; (2) Agade and Buangye-agade; (3) Adiya-ehire, Bekuret and Chehuyet; (4) Benet, Gaznar and Sapara; (5) Genbenye and Shebe-shertiye; (6) Kekere and Yedemert; (7) Yeshiraqinqe and Gubira; and (8) Arka and Chama (Fig. 1). Genetic identity and distance between the studied enset germplasms are also displayed in the minimum spanning network (Fig. 2).
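The relationship between the genetic identity and genetic distance values reported for population pairs can be sketched as follows, assuming Nei's standard measure, in which identity is I = Jxy/√(Jx·Jy) and distance is D = −ln I. The allele frequencies and population names are hypothetical, and this is a minimal illustration rather than the GenAlEx routine.

```python
import math


def nei_identity(pop_x, pop_y):
    """Nei's genetic identity between two populations.

    Each population is a list with one {allele: frequency} dict per locus,
    with loci in the same order in both lists.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    # Sums over loci are used instead of means; the number of loci cancels in the ratio.
    return jxy / math.sqrt(jx * jy)


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance, D = -ln(I)."""
    return -math.log(nei_identity(pop_x, pop_y))


# Hypothetical allele frequencies at two loci; not data from the study.
pop_a = [{"150": 0.5, "152": 0.5}, {"200": 1.0}]
pop_b = [{"150": 0.5, "152": 0.5}, {"200": 0.5, "202": 0.5}]

identity_ab = nei_identity(pop_a, pop_b)
distance_ab = nei_distance(pop_a, pop_b)
distance_aa = nei_distance(pop_a, pop_a)

print(f"I = {identity_ab:.3f}, D = {distance_ab:.3f}")
```

Two populations with the same allele frequencies have an identity of 1 and a distance of 0, which is the pattern approached by the West and East Gurage pair.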

Table 6 Nei’s genetic distance (below diagonal) and Nei’s genetic identity (above diagonal), among five population groups
Fig. 1 Unweighted neighbor-joining tree based on Bruvo distance, constructed from 12 SSR markers for the 83 studied landraces of enset

Fig. 2 Minimum spanning network based on the twelve SSR markers for the 83 studied landraces of enset

The unweighted neighbor-joining dendrogram (Fig. 1) clustered the 83 studied enset germplasms into three distinct groups. Discriminant Analysis of Principal Components (DAPC) also grouped the five studied enset populations into three inferred clusters (K = 3) (Fig. 3a). The first cluster placed all the wild enset accessions distinctly apart from the four cultivated enset populations, the second cluster included most of the released varieties, and the third cluster contained all the cultivated enset landraces and a few of the released varieties (Fig. 3b). No distinct clustering pattern was observed among the cultivated landraces taken from different geographical locations.

Fig. 3 Population structure based on 12 polymorphic SSR markers: (a) estimated group structure with individual group membership values (1-83, following the arrangement of landraces in Table 1) and (b) DAPC scatter plot for the studied 83 enset germplasms

Discussion

A polymorphic information content (PIC) value ≥ 0.5 is considered highly informative and useful in distinguishing the degree of polymorphism (Prabakaran et al. 2010). The PIC values of the 12 SSR markers ranged from 0.62 to 0.77, making them highly informative. Genetic distance is described by the observed total number of independent alleles (Kalinowski 2002). The 12 SSR markers detected a total of 77 alleles in the studied enset germplasm. This number is high compared with the 61 alleles identified from 220 enset germplasms using 11 Musa markers (Getachew et al. 2014), but low relative to the 202 alleles detected from 60 cultivated enset landraces and six wild germplasms using 34 enset SSR markers (Olango et al. 2015). These differences in allele number can be attributed to the different types and numbers of genotypes and primers used in each study.

Among the studied enset populations, the Wild population was relatively more diverse than the others, as it had the highest rarefied allelic richness and rarefied private allelic richness. The remaining four populations had similar values, indicating that they are at almost the same level of genetic diversity.
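The reason for comparing rarefied rather than raw allele counts can be shown with a minimal sketch of the standard rarefaction formula, in which the expected number of alleles in a random subsample of g genes is Σ[1 − C(N − Ni, g)/C(N, g)] over the alleles at a locus. This is assumed here to correspond to what HP-Rare computes for allelic richness; the allele counts are hypothetical, and private allelic richness is not covered.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g genes.

    allele_counts holds the number of copies of each allele observed at one
    locus in one population; N is their sum.
    """
    n_genes = sum(allele_counts)
    if g > n_genes:
        raise ValueError("g cannot exceed the number of genes sampled")
    total = comb(n_genes, g)
    # comb(n, k) returns 0 when k > n, so alleles that must appear contribute 1.
    return sum(1.0 - comb(n_genes - n_i, g) / total for n_i in allele_counts)


# Hypothetical allele counts at one locus in two populations of unequal size.
large_pop = [6, 3, 1]  # 10 genes sampled
small_pop = [2, 1, 1]  # 4 genes sampled
g = sum(small_pop)     # rarefy both to the smaller sample size

ra_large = rarefied_allelic_richness(large_pop, g)
ra_small = rarefied_allelic_richness(small_pop, g)

print(f"Rarefied richness, large population: {ra_large:.2f}")
print(f"Rarefied richness, small population: {ra_small:.2f}")
```

Both hypothetical populations carry three alleles, yet after rarefaction to four genes the larger sample is expected to show fewer of them (about 2.23) because one of its alleles is rare. Raw counts would hide this difference.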

The mean Shannon’s diversity index of the studied enset populations was > 1, implying the existence of high genetic diversity. The genetic diversity of the studied populations as a whole, expressed by the mean expected heterozygosity (He = 0.59), is identical to the genetic diversity (He = 0.59) recorded for enset using enset SSR markers (Olango et al. 2015). It is also slightly higher than the genetic diversity (He = 0.55) recorded in an enset diversity study using cross-transferred Musa SSR markers (Getachew et al. 2014). The level of genetic diversity estimated in this study is likewise higher than previous estimates of enset genetic diversity obtained with other DNA markers, namely ISSR (Tobiaw and Bekele 2011), AFLP (Negash et al. 2002) and RAPD (Birmeta et al. 2002). However, it is hard to compare these studies directly and draw general conclusions, since the number and type of the studied germplasm and the DNA markers used were different. The highest fixation index (F = 0.17) was recorded for the Wild population, which indicates population differentiation, as the Wild population is multiplied sexually by seeds. Negative fixation index values (− 0.02 to − 0.14; Table 4) were found for the cultivated populations; these are taken as FST = zero, indicating a high amount of gene flow between the studied populations that equalizes allele frequencies.

The AMOVA revealed that variation within enset landraces accounted for the majority (93%) of the total molecular variance, while variation among the five population groups and among landraces within each population group contributed 4% and 3%, respectively. This result is consistent with previous enset diversity studies (Olango et al. 2015; Getachew et al. 2014; Birmeta et al. 2004; Negash et al. 2002; Tobiaw and Bekele 2011). The within-population variation was very high, which indicates the need to emphasize within-population diversity in the conservation and exploitation of the existing enset diversity in the study area. Enset landraces of the five populations were sub-divided into three inferred clusters. Structure analysis, DAPC and the unweighted neighbor-joining algorithm showed admixture among the five populations. The low FST value (0.037) observed between the enset populations in this study indicates low differentiation among the groups, which might be attributed to gene flow among regions. This low degree of differentiation might in turn result from the exchange of planting material irrespective of geographical distance (Tesfaye and Ludders 2003). Such exchange may cause migration and gene flow between populations, resulting in admixture among enset populations.

The DAPC scatter plot (Fig. 3a, b) revealed a high level of differentiation between wild and cultivated enset germplasms. Other enset diversity studies have also reported a high level of genetic dissimilarity between cultivated and wild enset germplasm (Birmeta et al. 2004; Olango et al. 2015). The lowest genetic distance (0.003) was found between the West and East Gurage enset populations. This close resemblance might be explained by frequent germplasm exchange arising from their strong cultural relations, despite the Zebidar mountain barrier between the two locations. Duplication was also found in the study: some cultivated enset landraces did not differ in their SSR profiles for the tested primers (genetic distance of 0.00), including Ashakit/Shertiye, Adiya-ehire/Bekuret/Chehuyet, Kekere/Yedemert, Genbenye/Shebe-shertiye, Agade/Buangye-agade, Yeshiraqinqe/Gubira, Benet/Gaznar/Sapara and Arka/Chama.

Conclusion

The mean diversity indices (He = 0.59, Ho = 0.61, I > 1) confirmed the existence of high genetic diversity in the studied enset landraces of Gurage and its surroundings. The studied wild and cultivated enset genotypes were genetically diverse. Similarly, most of the released varieties also differed genetically from the cultivated landraces. The West and East Gurage enset populations closely resembled one another, which might have resulted from germplasm exchange driven by their strong cultural ties, although they are geographically separated by the Zebidar mountain. In the present study, 18 enset landraces were found to group into eight genotypes, as they had identical genetic makeup across the studied loci, indicating that they are duplicates of one another. The results provide important information for implementing appropriate conservation strategies and breeding programs for enset genetic resources that address the basic enset production constraints, such as nematode and bacterial wilt diseases, and the overall productivity improvement of the crop.