Introduction

Common bean (Phaseolus vulgaris L.) is an annual, self-pollinated crop that is grown on more than 12 million ha worldwide. It provides a cheap source of protein in the developing world, especially in Latin America and Africa (CIAT 1989; Graham and Ranalli 1997). The common bean was domesticated independently 8,000 to 10,000 years ago in South America (the Andes) and 7,000 years ago in Mesoamerica (Kaplan 1981). Common bean was classified into two gene pools based on morphological traits and phaseolin seed proteins by Gepts et al. (1986) and Gepts (1988), and based on morphological characters and allozymes by Singh et al. (1991c). The two gene pools were further differentiated into races using agro-morphological traits (Singh et al. 1991a), and the races were later confirmed by different types of DNA markers such as random amplified polymorphic DNA (RAPD; Beebe et al. 2000), amplified fragment length polymorphism (AFLP; Beebe et al. 2001), and microsatellites (SSR; Blair et al. 2003, 2006, 2009; Diaz et al. 2011).

In 1916, the Russian geneticist Nicolai Vavilov organized a plant collecting expedition to Central Asia (Tajikistan, Kyrgyzstan and Uzbekistan) and brought two cultivated P. vulgaris samples to the VIR institute from the Pamir Mountains (Tajikistan). Common bean cultivars in the former Soviet Union were initially developed from cultivars collected in the country and foreign breeding material and cultivars introduced from 1921 to 1923. The introduced material originated from American and Canadian breeding stations and seed companies: Hidatsa red, North Dakota, Refugee, Valley Seed Co, Sacramento, Cal and others (Buravtseva and Egorova 2012).

Common bean cultivars were most likely introduced to Central Asia (including Kyrgyzstan) by the Soviets during the last century (Hegay et al. 2012). When Kyrgyzstan achieved its independence in 1991, the agricultural land belonged to the State, but this changed with the privatization process. From 1991 to 1996 collective farms (kolkhozes and sovkhozes) were transformed into private farms. About 344,500 small-scale farms are registered today and together they own 1.28 million ha (6.4 %) of the agricultural land in Kyrgyzstan (STATCOM 2011). The majority of the population depends on agriculture. Farmers grow cereal crops like wheat and barley, but small-scale farmers from the Talas and Chui oblasts are increasingly switching to common beans. Consequently, in contrast to other oblasts, the population of these two oblasts meets its food calorie and protein requirements (Asanaliev and Nurgaziev 2012).

The Kyrgyz common bean market started to develop at the end of the twentieth century. In 2010, 71,400 t of beans were produced (FAOSTAT 2010), and 90 % of the harvest was exported, mainly to Turkey, Bulgaria and Russia (STATCOM 2011). Kyrgyzstan has a moderate bean production compared with other grain-bean producing countries (Beebe et al. 2011), but nevertheless ranks among the top 20 bean grain exporters worldwide (Akibode and Maredia 2011). The income from selling common beans (grains) is 1 billion Kyrgyz soms (approx. US$ 20 million; FAOSTAT 2009). Kyrgyz farmers grow different market classes of beans, and sometimes use cultivar mixtures because they believe these will give a higher yield. Furthermore, the market price for different types of seeds is not stable from year to year, which also encourages these cultivation practices.

The objective of the present study was to assess the diversity of Kyrgyz cultivars and a reference set of foreign common bean accessions using morphological qualitative traits and compare the results with previously published microsatellite marker data. The ultimate goal is to identify genetic variation useful for the Kyrgyz bean breeding program.

Materials and methods

Plant material

Five Kyrgyz cultivars were selected because they are widely grown in Kyrgyzstan. Seeds of foreign accessions were kindly provided by Michigan State University (East Lansing) and the United States Department of Agriculture (Pullman). Altogether, 27 accessions (Table 1) were characterized using the morphological trait descriptors described by Singh et al. (1991a) (Table 2).

Table 1 Common bean accessions, their country of origin, gene pools of origin (Andean (A) and Mesoamerican (MA)) and diversity parameters estimated based on morphological data
Table 2 Diversity in 27 common bean accessions estimated based on qualitative morphological traits

Data analysis

Data for 13 qualitative morphological traits were recorded on 10 randomly chosen individual plants per accession (Table 2). Qualitative morphological traits were binary-coded as 1 for presence or 0 for absence for each individual plant (e.g., pod beak position placental: presence (1) or absence (0); pod beak position central: presence (1) or absence (0)), since common bean is a self-pollinated crop and we did not expect to find any heterozygotes. The Shannon diversity index (I) and percent polymorphism (%P) were calculated for each accession using POPGENE version 1.31 (Yeh and Boyle 1997).
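The two within-accession statistics can be illustrated with a minimal sketch. It assumes that each binary-coded column is treated as a two-state character with frequencies p and 1 − p, that I = −Σ p ln p is averaged over columns, and that a column is polymorphic when both states occur. POPGENE may estimate frequencies differently, so this is an illustration rather than a reproduction of its algorithm. The function names and the example plants are hypothetical.

```python
import math

def shannon_index(freqs):
    """Shannon index I = -sum(p * ln p) over the states of one coded column."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

def accession_diversity(plants):
    """plants: list of 0/1 lists (one list per plant, one entry per coded column).

    Returns (mean Shannon index over columns, percent polymorphic columns).
    """
    n_plants = len(plants)
    n_cols = len(plants[0])
    indices = []
    polymorphic = 0
    for col in range(n_cols):
        # Frequency of state "1" among the scored plants for this column.
        p = sum(plant[col] for plant in plants) / n_plants
        indices.append(shannon_index([p, 1 - p]))
        if 0 < p < 1:
            polymorphic += 1
    return sum(indices) / n_cols, 100 * polymorphic / n_cols

# Hypothetical accession: 10 plants, columns = (beak placental, beak central,
# trait A present, trait B present). Half of the plants differ in beak position.
mixed = [[1, 0, 1, 0]] * 5 + [[0, 1, 1, 0]] * 5
mean_i, pct_p = accession_diversity(mixed)
print(round(mean_i, 3), pct_p)
```

A uniform accession, in which all ten plants share the same coding, gives I = 0 and %P = 0 under these assumptions, which corresponds to the pattern expected for a pure line of a self-pollinated crop.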

Previously published microsatellite data (Hegay et al. 2012) for the accessions used in this study were also binary-coded and combined with the morphological data for cluster, principal coordinate (PCoA) and STRUCTURE analyses in order to obtain better genetic information about the common bean accessions. Cluster analyses were performed based on Dice’s similarity coefficient (Dice 1945) according to the unweighted pair group method with arithmetic average (UPGMA) using the sequential agglomerative hierarchical nested clustering (SAHN) procedure. The analyses were done using NTSYS-pc and FreeTree software (Rohlf 2000; Pavlicek et al. 1999). Principal coordinate analyses were performed based on the simple matching coefficient, since this coefficient takes into account both the shared presence and the shared absence of a particular character when estimating the similarity between two individuals. A two-way Mantel (1967) test was used to test the hypothesis of an equal precision for genotypic and phenotypic data to classify the bean accessions into a gene pool. The goodness of fit for the UPGMA trees and PCoA matrices (using Dcenter and Eigenvectors) was tested using NTSYS-pc with 10,000 random permutations. The bootstrap values for the UPGMA dendrograms were obtained via 1,000 resamplings using the FreeTree program (Pavlicek et al. 1999). The TreeView program (Page 1996) was used to display the trees. The software STRUCTURE (Pritchard et al. 2000) was used for structure analysis based on the combination of morphological and microsatellite data. The admixture model with a burn-in period of 5,000 and 50,000 replicates was used to estimate each K value, with ten independent runs for each K from 1 to 10. Delta K was estimated as described by Evanno et al. (2005) to determine the number of populations, and population clusters were produced using the DISTRUCT software (Rosenberg 2004).
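The difference between the two similarity coefficients, and the logic of a Mantel permutation test, can be sketched as follows. This is a minimal standard-library illustration and not the NTSYS-pc implementation: the function names, the Pearson-based test statistic, the one-sided p-value and the default of 999 permutations are assumptions made for the example.

```python
import random

def dice(a, b):
    """Dice similarity 2*n11 / (2*n11 + n10 + n01); shared absences are ignored."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    denom = 2 * n11 + mismatches
    return 2 * n11 / denom if denom else 0.0

def simple_matching(a, b):
    """Simple matching (n11 + n00) / n; shared absences count as agreement."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def pearson(x, y):
    """Pearson correlation of two equal-length lists (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def upper(matrix, order):
    """Upper-triangle values of a symmetric matrix after reordering its rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(m1, m2, permutations=999, seed=0):
    """Correlation between two symmetric matrices and a one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(m1)))
    x = upper(m1, identity)
    r_obs = pearson(x, upper(m2, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the accessions of the second matrix
        if pearson(x, upper(m2, perm)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)

# Two hypothetical plants scored for four binary characters.
print(dice([1, 1, 0, 0], [1, 0, 0, 0]), simple_matching([1, 1, 0, 0], [1, 0, 0, 0]))
```

For the two hypothetical plants, the shared absences in the last two characters raise the simple matching coefficient (3/4) above the Dice coefficient (2/3), which is the property that motivated the choice of coefficient for PCoA. In a Mantel test the rows and columns of one matrix are permuted together, because the pairwise entries of a similarity matrix are not independent observations.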

Discriminant analysis (DA), principal component analysis (PCA) and the best subset regression were used to estimate the diversity among accessions and their grouping, and were performed using the Minitab 15 statistical software (Minitab Inc 2008). DA was used to distinguish between accessions and divide them into groups based on morphological traits. DA grouped accessions with typical characters, and estimated the correct and incorrect percentage of classifications (Dytham 2011). DA maximizes differences between classes while minimizing those within classes, which is different from the PCA. PCA was used to analyze the diversity and to identify the optimum number of morphological traits which explain a high proportion of the variability. The Scree plot was used to display Eigenvalues and number of morphological traits in PCA. The best subset regression statistical method was used to determine a model for the grouping of accessions based on morphological traits.

Results

Diversity within and among common bean accessions

The average Shannon diversity index within common bean accessions estimated based on qualitative morphological traits was 0.05 (Table 1). Among the 27 accessions, only four were polymorphic for the thirteen traits. Accession PI527537 from Burundi had the highest diversity and the highest percent polymorphism. No polymorphism was observed for any trait in the Kyrgyz bean cultivars. On average, a higher genetic diversity was observed in the 12 Mesoamerican accessions (Shannon index of 0.08) than in the 15 Andean accessions (Shannon index of 0.02). The pair-wise Dice’s coefficients of similarity between accessions, estimated based on the qualitative morphological traits, ranged from 0.154 (between PI543043 and PI451886) to 0.923 (between Kytayanka and Lopatka).

Grouping of accessions based on the qualitative morphological traits

The principal component analysis (PCA) showed that three of the thirteen morphological traits were the most important components for explaining the grouping of accessions (Fig. 1). To identify these three most important traits, a best subset regression analysis was performed. The analysis determined seed size, pod beak position, and size and shape of the bract as the most important, with a coefficient of determination of R² = 79 % (Table 3). Morphological traits such as plant growth habit and pod string may be included as secondary predictors, while the other eight morphological traits were not identified as important by this analysis. A discriminant analysis based on morphological traits was used to differentiate between Andean and Mesoamerican accessions and to assign them to the correct group (Table 4). Seed size and pod beak position were effective for the grouping of the accessions. Andean accessions had medium, large or very large seed sizes and a central beak position, while Mesoamerican accessions had small and medium seed sizes and a placental beak position. Overall, the morphological traits included in this study were able to properly assign 99 % of individuals into their respective gene pools.

Fig. 1 Scree plot of principal component analysis (PCA) showing the number of morphological traits and their importance for grouping of accessions into common bean gene pools. Together, the first three principal components accounted for 46 % of the total variance

Table 3 The best three predictors (seed size, bract shape and size and pod beak position; shown in bold) that together explained 79 % of the total variation in 27 common bean accessions
Table 4 Discriminant analyses of the grouping of common bean accessions into the Mesoamerican and Andean gene pools based on morphological characters

Two main groups were observed in the UPGMA cluster analysis with 100 % bootstrap support (Fig. 2). The first group (cluster I) included the 12 Mesoamerican accessions and the second group (cluster II) consisted of the 15 Andean accessions. The principal coordinate analysis based on a combination of microsatellite and morphological data grouped the accessions into two main clusters (Fig. 3). The goodness of fit of matrix comparisons (phenotypic versus genotypic data) was r = 0.49 (P < 0.01). The first and second coordinates explained 52 % of the total variation. Clusters Ia and Ib included accessions belonging to the Mesoamerican gene pool, and cluster II comprised accessions that belong to the Andean gene pool (Fig. 3). A UPGMA dendrogram generated based on the combination of microsatellite and morphological data (Fig. 4) was similar to that constructed based only on morphological data (Fig. 2). In both dendrograms, the accessions were grouped into two main groups with high bootstrap support. A two-way Mantel test showed a highly significant correlation of the cophenetic values from the two independent UPGMA cluster analyses (r = 0.95, P = 0.01). The STRUCTURE analysis with K = 2, selected using Evanno’s method, defined two groups of accessions corresponding to the Mesoamerican and the Andean gene pools, and showed that accession PI527537 was a mixture according to both morphological and molecular data (Fig. 5).
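The Delta K criterion used to select K can be sketched in a few lines. The sketch assumes that Delta K is computed as the absolute second-order difference of the mean ln P(D|K) across runs, divided by the standard deviation of ln P(D|K) at that K, which is one common way of applying the approach of Evanno et al. (2005). The function name and the log-likelihood values are hypothetical and are not results from this study.

```python
import statistics

def delta_k(lnp_runs):
    """lnp_runs: dict mapping K to a list of ln P(D|K) values from independent runs.

    Returns a dict mapping K to Delta K for every K that has both neighbours
    (K - 1 and K + 1) and a non-zero standard deviation across runs.
    """
    mean = {k: statistics.mean(v) for k, v in lnp_runs.items()}
    sd = {k: statistics.stdev(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 in mean and k + 1 in mean and sd[k] > 0:
            # |L(K+1) - 2 L(K) + L(K-1)| is the absolute second-order difference.
            second_diff = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
            result[k] = second_diff / sd[k]
    return result

# Made-up log-likelihoods for three runs per K (illustration only).
runs = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -795, -785],
    4: [-785, -780, -790],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Delta K is undefined for the smallest and largest K tested because the second-order difference needs both neighbours, so the statistic cannot by itself support K = 1.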

Fig. 2 Dice’s similarity coefficient based unweighted pair group method with arithmetic average (UPGMA) cluster analysis that clustered 27 common bean accessions into Mesoamerican (cluster I) and Andean (cluster II) groups based on qualitative morphological data. The bootstrap value from 1,000 resampling is indicated in between two branches

Fig. 3 Principal coordinate analysis based on combination of morphology and microsatellite matrices for 27 common bean accessions. There was a significant correlation between the matrices (r = 0.49, P < 0.01)

Fig. 4 UPGMA dendrogram based on Dice’s similarity coefficient among 27 common bean accessions. The dendrogram was constructed based on a combination of morphological traits and microsatellite markers. There was a significant correlation between the two cophenetic matrices (r = 0.95, P = 0.01)

Fig. 5 Population structure for 27 common bean accessions estimated by the STRUCTURE program. The comparison included nine microsatellites and 13 qualitative morphological traits. Accession names and country of origin are given at the top and bottom, respectively

Discussion

The average genetic diversity within common bean accessions estimated using morphological qualitative traits in the present study (0.05) was lower than that estimated with microsatellite markers (0.07) (Hegay et al. 2012). In the present study, common bean accessions were clearly separated into two groups corresponding to the Mesoamerican and Andean gene pools. There was, however, no large variation in a majority of the morphological traits among the accessions from the two gene pools. This is probably because many of the accessions were received from gene banks and may have undergone a reduction of polymorphism in the past; i.e., accessions with traits not amenable to industrial bean production have been eliminated.

Qualitative morphological traits and appropriate statistical methods such as PCA, DA and STRUCTURE were used to assign accessions into gene pools, and the assignments agreed with those made by Burle et al. (2011). DA, best subset regression analysis and PCA identified seed size, pod beak position, and size and shape of the bract as important for grouping of accessions into gene pools, which agrees with the results of Singh et al. (1991a). Variation in pod string and growth habit can also be suitable traits for gene pool classification (Table 3). As a whole, qualitative morphological traits were useful for differentiating accessions and assigning them to their respective gene pools (Table 5).

Table 5 Pair-wise comparisons based on Dice’s similarity coefficient between 27 common bean accessions based on morphological traits

The results from the present study were in line with a previous study by Singh et al. (1991b), who reported that variation of morphological characters in common beans could be independent variables and that the same morphological pattern can be found in different gene pools. The high correlation of the cophenetic values from the UPGMA analyses (r = 0.95) agrees with Kumar et al. (2008), who found a high correlation value (r = 0.934) for the clustering pattern when AFLP markers were used for diversity analysis in Indian common bean accessions.

The similarity coefficients of Jaccard and Dice were highly correlated with the simple matching coefficient, which was visually demonstrated in the analysis of P. vulgaris (Beharav et al. 2010; Duarte et al. 1999), and the use of either of these coefficients did not affect the grouping of common bean accessions according to their genetic origin. In the present study, cluster analysis (UPGMA) and PCoA grouped bean accessions based on their genetic relationships or morphology rather than by country of origin, which agrees with previous research (Islam et al. 2002; Sharma et al. 2013). Principal coordinate analysis based on microsatellite data separated most accessions into two groups and only one accession (PI527537 from Burundi) was intermediate. The UPGMA dendrogram supported the PCoA grouping with an intermediate placement of accession PI527537. Seed mixtures, which are preferred by local consumers in Burundi (Wortmann et al. 1998), or natural crosses between individuals from the two gene pools could explain the placement between the two gene pools, as was also suggested by Hegay et al. (2012). In addition, we found that accession PI337090 from Brazil showed variability for both morphological and molecular data (Table 1) but, unlike accession PI527537, it did not include genotypes from the Andean gene pool (Fig. 5). This result was not surprising, because small-scale farmers make up the Brazilian common bean industry and they usually grow commercial varieties together without controlling the purity of the varieties (Burle et al. 2011).

The STRUCTURE analysis verified the two gene pools and grouped 12 accessions into the Mesoamerican gene pool and 15 accessions into the Andean gene pool. This clear clustering suggests that the recombination between the two gene pools is limited. Hybridizations between gene pools depend on the presence of the complementary dominant Dl1 and Dl2 genes, which control traits that provide barriers between common beans of different geographic origins (Singh and Gutierrez 1984).

There was a moderate but significant correlation between matrices derived from qualitative morphological traits and microsatellite data (r = 0.49, Fig. 3), which agrees with results from other matrix comparisons (r = 0.50) by Asfaw et al. (2009). However, the presence of two gene pools was more strongly supported by microsatellite data than by data from morphological traits. Microsatellites occur in coding and non-coding regions that are not always linked to the genes expressing morphological traits. Hybrid phenotypes that arise after introgression of genes from one gene pool into another are difficult to differentiate based on morphology (Paredes and Gepts 1995).

Whether to use only morphological traits (both qualitative and quantitative) or to combine them with DNA markers depends on the research aim and the practical application. For example, gene bank curators and plant breeders often use a combination of morphological traits (usually more than 10) and DNA markers for germplasm characterization, evaluation and utilization. In the present study, the use of only qualitative morphological traits was sufficient for separating common bean accessions and assigning them to the two main gene pools, but the pattern was even more distinct when molecular markers were added. The number of morphological traits (13) agrees with the number of traits used for common bean in ex situ preservation in gene banks (Chiorato et al. 2006), in in situ preservation under on-farm management, and in diversity research (Gomez et al. 2005).

In conclusion, common beans characterized both with morphological traits and microsatellites were grouped into clusters corresponding to their gene pools of origin. Kyrgyz cultivars belonged to both Andean and Mesoamerican gene pools as previously shown by Hegay et al. (2012). Classification and divergence between common bean accessions analyzed in this study may help to preserve plant material both in situ and ex situ. Furthermore, our study provides important information to the Kyrgyz breeders that helps to optimize the selection of plant material to be used in breeding programs of this very important grain legume crop.