2.1 Introduction

The common bean, Phaseolus vulgaris L. (2n = 2x = 22), is a crop species that, in addition to its societal and nutritional importance and its contribution to environmental sustainability, has a very interesting and distinctive evolutionary history. Indeed, it appears to represent a unique model for domestication and evolutionary studies. This uniqueness also arises from the domestication of five Phaseolus species and the occurrence of multiple independent domestications in Mesoamerica and the Andes for both P. vulgaris and Phaseolus lunatus (for review, see Bitocchi et al. 2017).

The wild forms of the common bean originated in Mesoamerica and spread southwards towards the Andes, with the Mesoamerican and Andean gene pools diverging about 165,000 years ago (Bitocchi et al. 2012; Schmutz et al. 2014). At least two independent domestication events led to the formation of two distinct domesticated gene pools that evolved under isolation, one in Mesoamerica and one in the Andes. These gene pools underwent parallel evolution that was associated with partial reproductive incompatibility (i.e. low hybrid fertility due to genes conditioning hybrid weakness and breakdown; Johnson and Gepts 1988; Koinange and Gepts 1992; Singh and Molina 1996), and they spread further through the development of landraces with distinct characteristics and specific adaptations. Such distinct and replicated domestication events within the same species (or the same genus), which led to morphological and functional changes, represent an almost unique experimental feature for evolutionary studies. This is different from other examples of multiple domestication events (Meyer et al. 2012) that were not independent due to the lack of reproductive isolation (Bitocchi et al. 2017). However, some similarities can be seen in rice with the indica and japonica subspecies (Vitte et al. 2004; Londo et al. 2006; but see also Molina et al. 2011; Choi et al. 2017).

For these reasons, the common bean is an ideal model to study domestication and evolution, and the present review aims to cover the current knowledge of its evolutionary history. It provides an analysis of the process of domestication, with a focus on convergent phenotypic evolution. It also highlights current knowledge of the genetic control of the domestication syndrome from the perspective of the new era that is associated with the release of both the Mesoamerican (Vlasova et al. 2016) and Andean (Schmutz et al. 2014) reference genome sequences.

2.2 Origins of Phaseolus vulgaris

P. vulgaris originated in the Americas, and specifically in Mesoamerica, in what is now Mexico (Bitocchi et al. 2012). The wild form remains widely distributed from northern Mexico to north-western Argentina (Toro et al. 1990), and it is characterised by three eco-geographical gene pools. The Mesoamerican and Andean gene pools are the main ones, and they show parallel wild and domesticated geographical distributions, as has been reported in several studies based on different datasets, which included plant morphology, seed proteins, allozymes, many different molecular markers and sequence data (Gepts et al. 1986; Gepts and Bliss 1985; Koenig and Gepts 1989; Singh et al. 1991; Becerra-Velásquez and Gepts 1994; Freyre et al. 1996; Papa and Gepts 2003; Rossi et al. 2009; Kwak and Gepts 2009; Bitocchi et al. 2012, 2013, 2016; Bellucci et al. 2014b; Rodriguez et al. 2016). The third gene pool was discovered in the 1980s and described in 1993 (Debouck et al. 1993). This gene pool comprises wild populations from northern Peru and Ecuador, a region that was suggested to be the area of origin of the wild form of the common bean (Kami et al. 1995). However, further data (Bitocchi et al. 2012) indicated that the common bean originated in Mesoamerica and that the other two wild gene pools originated from two independent migration events. The Mesoamerican origin was also supported by whole-genome sequencing analysis (Schmutz et al. 2014), which also estimated that the Mesoamerican and Andean gene pools diverged some 165,000 years ago.

Bitocchi et al. (2012) clearly defined the Mesoamerican wild population structure. Although previous studies had highlighted the presence of population structure in the Mesoamerican gene pool (e.g. Papa and Gepts 2003), they had not demonstrated any clear subdivisions into different sub-populations. In contrast, by using sequence data, Bitocchi et al. (2012) demonstrated the presence of four distinct genetic groups, two of which were more closely related to the Andean (i.e. Mesoamerican B3) and the northern Peru and Ecuador (i.e. Mesoamerican B4) populations. From the Mesoamerican centre of origin, different groups migrated from central Mexico to South America, which led to the formation of the two South American gene pools, namely the Andean gene pool and that of northern Peru and Ecuador.

In a recent paper, Rendón-Anaya et al. (2017) confirmed that the populations occurring in northern Peru and Ecuador represent a distinct population that migrated to South America much earlier than the Andean gene pool. Moreover, from an analysis of 29 accessions from 12 Phaseolus species that was based on nuclear and chloroplast genome sequences and on metabolomics data, these authors suggested that this third gene pool should be considered a sister species of P. vulgaris (Phaseolus pseudovulgaris, Rendón-Anaya et al. 2017). However, a larger sample of Mesoamerican accessions and further analysis, including hybridisation experiments, are needed to confirm the hypothesis of a new true species.

2.3 Domestication

Domestication is a complex process that starts from a wild plant population or several populations, and through adaptation and the shaping of the natural environment leads to a crop plant that is modelled on human needs and agricultural practices. The domestication process involves several morphological and physiological changes that result in genetic, structural and functional modifications that are shared among most crop species (i.e. domestication syndrome). These processes make the developing crop genetically different from its wild relatives and confer better adaptation to different agro-ecosystems (Gepts and Papa 2002; Bellucci et al. 2014b).

In the Phaseolus species, the main differences between the wild and domesticated forms are related to gigantism (e.g. small vs large seeds and pods), growth habit (e.g. more compact in the domesticated form, occurrence of bush structures and no climbing types), seed dormancy (i.e. present vs absent), photoperiod sensitivity (i.e. short-day vs complete or partial insensitivity), shape and colour of the plant and its harvested parts (e.g. seeds and pods) and the dissemination mechanisms (e.g. high shattering vs low shattering or non-shattering pods).

For the Phaseolus genus, domestication occurred as at least seven independent events (five species, with two gene pools each for P. vulgaris and P. lunatus), with the convergent phenotypic evolution of a very similar set of traits. The Phaseolus species have different breeding systems (i.e. autogamous, allogamous) and life history traits (i.e. annual, perennial), thus making Phaseolus an extremely interesting and unique model to study domestication dynamics and evolution under domestication (Bitocchi et al. 2017).

At the genome level, the main consequence of domestication, which is common to most crop species, is a reduction in their genetic diversity when compared to their wild ancestors. At the origins of agriculture, farmers selected a reduced number of individuals, and consequently they collected only a small portion of the diversity of the entire wild gene pool and populations, which is known as the ‘founder effect’ (Glémin and Bataillon 2009).

For the common bean, the process of domestication has been widely studied in detail. Major domestication traits have been mapped (Koinange et al. 1996), and some genes associated with domestication have been characterised (Kwak et al. 2008; Repinski et al. 2012). It is well known that, compared to the wild gene pool, strong reductions in genetic diversity have occurred in the Mesoamerican and Andean domesticated populations of the common bean (Papa et al. 2005, 2007; Kwak and Gepts 2009; Rossi et al. 2009; Nanni et al. 2011; Bitocchi et al. 2012, 2016; Desiderio et al. 2013; Schmutz et al. 2014; Bellucci et al. 2014a; Rodriguez et al. 2016). Two independent domestication events have been reported in several studies, one in Mesoamerica and one in the Andes, where the two major domesticated gene pools originated (see Bellucci et al. 2014b), with a single domestication event within each gene pool (Nanni et al. 2011; Bitocchi et al. 2013). The two domesticated gene pools were differentiated according to their morpho-agronomic traits and biochemical patterns, and at the molecular level (Gepts et al. 1986; Koenig et al. 1990; Singh et al. 1991; Papa et al. 2006; Acosta-Gallegos et al. 2007; Bitocchi et al. 2013; Schmutz et al. 2014). When the domesticated forms are compared to the wild forms, the bottleneck due to domestication was three-fold greater in the Mesoamerican gene pool than in the Andean gene pool. Indeed, the Andean populations underwent a bottleneck before domestication that initially impoverished the genetic diversity of the Andean wild germplasm, which resulted in minor effects of the subsequent domestication bottleneck (i.e. sequential bottleneck; Bitocchi et al. 2013).
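The comparison of bottleneck strength between gene pools can be illustrated with a minimal sketch. It assumes that diversity is measured as nucleotide diversity (π, the mean number of pairwise differences per site) and that the loss of diversity is summarised as 1 − π_dom/π_wild, which is one commonly used form of this statistic. The function names and the toy alignments are hypothetical and are not data from the studies cited above.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over every pair of sequences
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def diversity_loss(pi_wild, pi_dom):
    """Fraction of the wild diversity that is missing from the domesticated sample."""
    return 1.0 - pi_dom / pi_wild


# Hypothetical toy alignments (illustration only, not real bean data)
wild = ["ACGTACGT", "ACGTTCGT", "ACCTACGA", "TCGTACGT"]
domesticated = ["ACGTACGT", "ACGTACGT", "ACGTTCGT", "ACGTACGT"]

pi_w = nucleotide_diversity(wild)
pi_d = nucleotide_diversity(domesticated)
print(f"pi wild = {pi_w:.4f}, pi domesticated = {pi_d:.4f}")
print(f"loss of diversity = {diversity_loss(pi_w, pi_d):.2f}")
```

Under a sequential bottleneck, the same calculation applied to the Andean gene pool would start from an already reduced wild π, so the ratio changes less even if domestication removed a similar absolute amount of variation.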

Together with the sequencing of the first reference genome of the common bean, Schmutz et al. (2014) reported the analysis of 60 wild genotypes and 100 landraces from Mesoamerica and the Andes, which confirmed the occurrence of two independent domestications from wild gene pools that had diverged before humans arrived in America. They also found diversity reduction in the Mesoamerican landraces, while for their Andean samples they proposed the occurrence of admixture events with Mesoamerican accessions and the emergence of new mutations because the landraces were more diverse than the wild populations (Schmutz et al. 2014).

The severe reduction in genetic diversity for the Mesoamerican accessions was also observed at the nucleotide level (Bellucci et al. 2014a) when the transcriptomes of wild and domesticated common bean accessions were analysed. By exploiting the RNA-seq technique and de novo assembly, Bellucci et al. (2014a) showed a reduction in diversity at the phenotypic level (i.e. gene expression) in the domesticated form when compared to the wild form. For the first time in a crop species, they showed that domestication decreased the phenotypic diversity at the gene expression level by about 18%. Moreover, they reported that, in comparisons between wild and domesticated forms, the majority (74%) of the contigs identified as differentially expressed were down-regulated in the domesticated forms. This suggested that the occurrence of loss-of-function mutations (which are relatively frequent compared to gain-of-function changes) was a common source of variation. This also supports selection during rapid environmental changes (Olson 1999), such as those that occur during adaptation from the wild environment to an agro-ecosystem. These results also support the observation that most of the traits of the domestication syndrome are recessive.

At the genome-wide level, lower gene expression was found for the domesticated compared to the wild transcripts, as if there had been an accumulation of deleterious mutations due to hitchhiking, mostly as loss-of-function or reduced-expression mutations (Bellucci et al. 2014a); they referred to this as the ‘cost of domestication’. As suggested for rice (Lu et al. 2006), the accumulation of loss-of-function or reduced-expression mutations might also have been due to reduced effective recombination, which would increase the frequency of deleterious mutations in the domesticated pool and negatively influence fitness.

In Mesoamerica and the Andes, the wild and domesticated forms grow under different levels of sympatry, and the effects on the population structure and gene flow were analysed by Papa and Gepts (2003) using Mexican populations. They demonstrated that the wild and domesticated common bean were not genetically isolated, with moderate gene flow detected, which was three-fold higher from domesticated to wild than in the opposite direction. In the presence of such levels of asymmetric gene flow, the high phenotypic differences between the two forms were maintained due to selection against domesticated alleles in the wild environment and against wild alleles in the cultivated agro-ecosystems (Papa and Gepts 2003). The presence of asymmetric introgression was also recently reported by Rendón-Anaya et al. (2017), who also confirmed, in agreement with Papa et al. (2005), that the level of introgression is higher in the genomic regions not involved in the genetic control of the domestication syndrome. One of the most important factors causing such asymmetry is the recessive nature of domestication traits, which makes the F1 hybrids from crosses between domesticated and wild individuals more similar to the wild forms. This favours conscious and unconscious selection against wild alleles in the domesticated environment over selection against domesticated alleles in the wild environment.

The work of Bellucci et al. (2014a) also allowed the analysis of the selection signature due to domestication and the size of the genome affected. Indeed, the knowledge of the genes and the genome regions involved in the process of domestication is crucial for any successful breeding and to unravel the genetic diversity carried by the wild forms (Tanksley and McCouch 1997; McCouch 2004). Papa et al. (2005) highlighted the presence of domestication genes in genome regions of high divergence between the wild and domesticated forms, and the highest diversity of the wild common bean was observed in genome regions linked to the domestication loci, which were probably the least exploited by farmers and breeders. In another study, Papa et al. (2007) used amplified fragment length polymorphism (AFLP) markers to reveal that about 16% of the genome of the common bean was under the effects of selection due to domestication. Using simulated demographic dynamics during domestication and RNA-seq data, Bellucci et al. (2014a) found that about 10% of the contigs analysed were affected by selection during domestication or were physically linked to selected genes. In most cases, the results showed reduced diversity in the domesticated forms compared to the wild, as expected following positive selection due to domestication. Transcripts analysed by RNA-seq showed further reduction in the diversity of gene expression (by 26%) and a five-fold enrichment of the differentially expressed genes. Likewise, Bellucci et al. (2014a) conducted a detailed survey of the functions of the contigs that showed effects of selection due to domestication, and this is discussed below.

A very interesting example of diversifying selection that acts on the domesticated forms, whereby domestication increases the level of functional diversity, was observed for 2.8% of the transcripts that showed effects of selection due to domestication (Bellucci et al. 2014a). Here, no diversity was observed in the wild forms, while diversity was detected in the domesticated forms. Among these transcripts, the analysis of the gene functions highlighted the example of the drought-related and growth-related KUP6 (K+ uptake transporter-6) gene (Osakabe et al. 2013). KUP6 was significantly overexpressed in the domesticated form compared to the wild, as if domestication had also increased the functional diversity of the selected gene in addition to the increased nucleotide diversity. Bellucci et al. (2014a) suggested that for further crop improvements, a key aspect was the exploitation not only of the wild germplasm diversity, but also of the diversity contained in the domesticated pool (e.g. in the traditional landraces). This diversity originated from the fixation of useful mutations after domestication, and it needs further consideration.

Schmutz et al. (2014) identified candidate genes that were associated with domestication by a comparison of wild and landrace populations across 10-kb/2-kb sliding windows, where they examined the empirical distribution of the diversity ratios and population differentiation statistics. They found 1835 Mesoamerican and Andean candidates with negative Tajima’s D values, which indicated positive selection. They then investigated the functions of the genes identified and applied a genome-wide association study (GWAS) approach and defined a set of genes that are linked with flowering time, leaf and seed size and seed weight (see details below).
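A window-based scan of this kind can be sketched as follows. The sketch assumes that ‘10-kb/2-kb sliding windows’ means a 10-kb window moved in 2-kb steps, and it uses the standard formula for Tajima’s D; the function names and the toy alignment are hypothetical and do not reproduce the pipeline of Schmutz et al. (2014).

```python
from itertools import combinations
from math import sqrt


def sliding_windows(seq_length, window=10_000, step=2_000):
    """Yield (start, end) for every full window (0-based, end exclusive)."""
    for start in range(0, seq_length - window + 1, step):
        yield start, start + window


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D cannot be computed
        raise ValueError("need at least four sequences")
    # S: number of segregating (polymorphic) sites
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return None
    # pi: mean number of pairwise differences over the whole alignment
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    # Standard constants of the test statistic
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignment in which every variant is a singleton
rare_variants = ["TAAA", "ATAA", "AATA", "AAAT"]
print(tajimas_d(rare_variants))          # negative: excess of rare variants
print(len(list(sliding_windows(20_000))))  # number of full windows in 20 kb
```

An excess of rare variants gives a negative D, which is the pattern expected after a selective sweep, although demographic events such as recovery from a bottleneck can produce the same signal.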

Another important aspect related to domestication of the common bean was the identification of the geographical centres where the process took place. Kwak and Gepts (2009) proposed the Lerma–Santiago basin as the domestication site for the Mesoamerican gene pool, while Chacón et al. (2007) proposed southern Peru as the Andean domestication site. More recently, Bitocchi et al. (2013) suggested Oaxaca Valley in Mesoamerica and southern Bolivia and northern Argentina in South America as the areas of P. vulgaris domestications.

To unravel the respective roles of the Mesoamerican and Andean areas in common bean domestication, Rodriguez et al. (2016) integrated the spatial, phenotypic and molecular data with those from different disciplines, including archaeological and glotto-chronological data. For Mesoamerica, the data of Rodriguez et al. (2016) confirmed Oaxaca Valley as the putative region where domestication of the common bean took place. The genetic diversity data were supported by previous studies that detected archaeological sites in this area with common bean macroremains that were dated from 2300 to 2100 years BP (Kaplan and Lynch 1999). This area also included the homeland sites of the Zapotecan, Mixtec-Cuicatec and Popolocan protolanguages, for which ancient bean words can be reconstructed from 3149 to 3036 years BP (Brown et al. 2014).

In the Andes, Rodriguez et al. (2016) proposed the region of northern Argentina and southern Bolivia as the putative Andean domestication area, as the wild accessions from Argentina–Bolivia that were analysed were genetically more similar to the Andean domesticated forms, and showed lower 100-seed weight when compared with other Andean accessions. Their data were consistent with those from previous genetic (Beebe et al. 2001; Bitocchi et al. 2013), archaeological (Tarrago 1980) and glotto-chronological (Brown et al. 2014) studies.

The common bean is the most cultivated crop worldwide among the Phaseolus species, as after its domestication it underwent dissemination and evolution out of these American centres of origin and domestication. The pathways of distribution of P. vulgaris were complex and involved several introductions from the New World, combined with exchanges between continents, and among different countries within continents.

In Europe, both of the common bean domesticated gene pools were introduced after the voyages of Columbus and were then rapidly disseminated to many different areas that were characterised by varied environmental conditions and agronomic practices. The levels of diversity for the domesticated common bean in Europe as determined using molecular markers are comparable to those observed in the Americas, without any detectable genetic bottleneck effects (Angioi et al. 2010; Gioia et al. 2013a). Moreover, due to the breakdown of the spatial isolation between the two gene pools in Europe, hybridisation and introgression occurred between the Andean and Mesoamerican gene pools, such that 40% of the European landraces are of hybrid origin (Angioi et al. 2010; Gioia et al. 2013a). Many studies have indicated that Europe was a secondary centre of diversity for the common bean (Santalla et al. 2002; Angioi et al. 2010, 2011; Gioia et al. 2013a). The high level of hybridisation, which would not be expected for an autogamous species, was most likely the result of selection for adaptation to these new environmental conditions. This selection will have exploited hybridisation and recombination between the two different gene pools to create novel genotypic combinations when compared to those of their centres of origin. Other continents and countries have also been proposed as secondary centres of diversification for the common bean, including Brazil (Burle et al. 2010), central, eastern and southern Africa (Martin and Adams 1987a, 1987b; Asfaw et al. 2009; Blair et al. 2010), and China (Zhang et al. 2008).

Recent reviews (Bellucci et al. 2014b; Bitocchi et al. 2017) afford a detailed analysis of the dissemination and evolution of the common bean and the other Phaseolus crop species outside their centres of origin.

2.4 Convergent Evolution

As mentioned above, domestication offers numerous examples of convergent phenotypic evolution that were associated with adaptation to human needs and novel agro-ecosystems. For instance, favourite plants showed adaptive features related to different climatic and environmental conditions, such as cold or drought tolerance. Most domesticated animals were selected to maximise the yield of useful products (i.e. meat, milk, wool) and for their docile behaviours, while crops were selected for the size of the plant organ used by humans (i.e. seeds, fruit) and for reduced, or lack of, seed dispersal. Indeed, during domestication, similar sets of phenotypic characteristics were selected (i.e. traits of the domestication syndrome), which has provided the opportunity to study convergent phenotypic evolution for many responses to selection pressures.

An interesting question thus arises in terms of whether convergent responses due to selection pressures under domestication are limited to the observed effects, or whether they are also related to the molecular mechanisms that control the phenotypic traits, thus acting on the same genomic regions or set of genes responsible for the same trait. Schmutz et al. (2014) were the first to investigate the convergent evolution between the two main gene pools of the common bean. By comparing wild and landrace populations across 10-kb/2-kb sliding windows and analysing the empirical distribution of the diversity statistics ratios and population differentiation statistics, they were able to compare the effects of selection that occurred within gene pools. As these gene pools were subjected to independent domestication events, Schmutz et al. (2014) tried to determine whether, in order to obtain the same convergent phenotypes, selection had acted on the same genomic regions or on a completely different set of genes that coded for the same phenotype. They showed that <10% of the 74 Mb of genome sequences that were putatively involved in selection during domestication was shared between the Mesoamerican and Andean gene pools, thus suggesting different genetic routes to domestication. However, Schmutz et al. (2014) did not use explicit demographic modelling to generate an expectation of the number of potential false-positive regions. Thus, an alternative explanation of their data is that there were high levels of false positives (i.e. regions of the genome with reduced diversity due to stochastic effects of domestication bottlenecks), which would lead to a lack of shared genome regions and genes that would be predicted to be involved in domestication between the two gene pools.
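The proportion of putatively selected sequence that is shared between two gene pools can be computed from lists of candidate regions, as in the following minimal sketch. It assumes half-open (start, end) intervals on a single chromosome and expresses the shared sequence as a fraction of the union of both sets; the function names and the example coordinates are hypothetical, and the exact denominator used by Schmutz et al. (2014) is not stated here.

```python
def merge_intervals(intervals):
    """Sort half-open (start, end) intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_bp(intervals):
    """Total number of base pairs covered, counting overlapping regions once."""
    return sum(end - start for start, end in merge_intervals(intervals))


def shared_bp(regions_a, regions_b):
    """Number of base pairs covered by both sets of regions."""
    a, b = merge_intervals(regions_a), merge_intervals(regions_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def shared_fraction(regions_a, regions_b):
    """Shared base pairs as a fraction of the union of both region sets."""
    shared = shared_bp(regions_a, regions_b)
    union = total_bp(regions_a) + total_bp(regions_b) - shared
    return shared / union if union else 0.0


# Hypothetical candidate regions (coordinates are illustrative only)
mesoamerican = [(0, 100), (200, 300)]
andean = [(50, 150), (400, 500)]
print(f"shared fraction = {shared_fraction(mesoamerican, andean):.3f}")
```

Because false-positive regions are expected to fall at different positions in each gene pool, they inflate the union without adding to the shared sequence, which lowers this fraction.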

To better understand this interesting phenomenon, Bitocchi et al. (2016) further investigated common bean domestication in the Mesoamerican gene pool by sequencing 49 gene fragments from a sample of 45 wild and domesticated accessions and compared the candidate genes they identified for selection during domestication with those from other studies (Bellucci et al. 2014a; Schmutz et al. 2014; Rodriguez et al. 2016). In doing this, they tried to understand whether the sexually compatible Mesoamerican and Andean lineages with similar morphologies and life cycles underwent independent selection based upon distinct sets of genes or not. They found that two genes out of the four strong Mesoamerican candidate genes identified were also detected as outliers by Schmutz et al. (2014) only during Andean domestication. This suggested that more studies and evidence are needed to understand the convergent responses due to selection pressures under domestication.

2.5 Domestication Traits

The domestication process induced several changes in the common bean plants for major traits associated with adaptation and cultivation and to address human needs. At present, the domesticated species can be clearly distinguished from their wild progenitors by a set of traits, which is known as the ‘domestication syndrome’. These changes in the domesticated individuals have guaranteed higher productivity in cultivated environments, although at the same time they have reduced the adaptation to erratic environmental variations, where wild traits show much greater fitness than domesticated traits.

In recent years, many studies have been conducted in the common bean through different approaches to understand the genetic control of these traits, among which the molecular linkage mapping approach has been widely adopted. Recently, GWAS analysis has provided a powerful tool to search for quantitative trait loci (QTLs) through the use of markers such as single-nucleotide polymorphisms (SNPs), which show wide distribution across the genome. The common bean genome sequence (Schmutz et al. 2014) has also allowed a link to be established between genetic and physical maps, which thus facilitates the identification of candidate genes for domestication traits in genomic regions where significant QTLs are found. Furthermore, the availability of the entire reference genome can facilitate the identification of regions where the effects of selection are more evident, and differentiation between wild and domesticated populations is greater; these regions might contain genes that were involved in the domestication process.

There are several traits that have been linked to domestication in the common bean, which are related to vegetative growth, phenological features, size, colour and shape of the harvested parts. In particular, these include two main target traits that were selected for during the domestication process: the seed dispersal mechanism and the seed dormancy. Here, we provide a survey of the studies on common bean that have been specifically carried out on genes related to the domestication syndrome (Table 2.1), with particular focus on studies that have compared wild and domesticated forms. Studies where no direct comparisons have been made for the target traits between wild and domesticated forms were not covered in the present survey.

Table 2.1 Genes and QTLs identified as related to the domestication syndrome in the common bean

2.5.1 Seed Dispersal Mechanism

Many wild plants are characterised by seed shattering, which represents a strategy that ensures seed dispersal at maturity. Indeed, this trait is fundamental for propagation of progeny in wild individuals. Seed dispersal occurs through various mechanisms in species that have different kinds of fruit.

Wild common bean is characterised by a dry strongly dehiscent legume fruit (Gepts and Debouck 1991) that opens at maturity along the ventral suture to ensure seed release. This trait has been a target of selection, and now many domesticated varieties have totally or partially lost the ability to disperse their seeds after ripening. Indeed, while dry beans have dehiscent pods, snap beans are completely indehiscent (i.e. stringless varieties; Gepts and Debouck 1991), as they no longer have the fibres in the pod sutures (i.e. string) and walls (Prakken 1934; Koinange et al. 1996) (Fig. 2.1). In the common bean, the seed dispersal mechanism is associated with the content and location of the fibres in the pods (Prakken 1934; Murgia et al. 2017), with strict positive correlation between shattering ability and increased carbon and lignin content, as was recently highlighted (Murgia et al. 2017). A QTL analysis carried out by Koinange et al. (1996) on a recombinant inbred line (RIL) population derived from a cross between Midas (an Andean domesticated accession) and G12873 (a wild Mesoamerican genotype) mapped the locus St to chromosome Pv02. This locus controls the presence or absence of pod suture fibres and co-segregates with the trait of lack of pod wall fibres (Koinange et al. 1996; Freyre et al. 1998).

Fig. 2.1 Seed dispersal mechanism of the common bean. (a) From Prakken (1934): location and content of fibres, parenchymatic and wood cells in the dorsal and ventral sheets of the pod valves in stringy (dehiscent, bottom), stringless (indehiscent, above) and intermediate varieties (in the middle); (b) from Lamprecht (1932): schematisation of the hypothesis of Lamprecht (1932) on the genetic control of pod shattering in common bean; (c) domesticated pods and seeds, on the left, and twisted pods and seeds from a wild common bean, on the right

Identification of the genetic controls of seed shattering in common bean has also been carried out using candidate gene approaches that focussed on the homologues of the Arabidopsis thaliana SHATTERPROOF-1 (SHP1; Nanni et al. 2011) and INDEHISCENT (IND; Gioia et al. 2013b) genes. These two genes are directly involved in seed shattering and are primary factors required for silique shattering in A. thaliana (Liljegren et al. 2000, 2004). Nanni et al. (2011) identified, characterised and mapped a sequence (PvSHP1) in the common bean using the RIL mapping populations BAT93 × Jalo EEP558 (Freyre et al. 1998) and Midas × G12873 (MG RIL population; Koinange et al. 1996). They mapped PvSHP1 to chromosome Pv06, in proximity to the V gene that controls flower colour in the common bean (Nodari et al. 1993; McClean et al. 2002), and thus on a different chromosome from that of the St locus. Similarly, Gioia et al. (2013b) identified and mapped the PvIND (GenBank KC192374) sequence using the same mapping populations as Nanni et al. (2011). PvIND was located on chromosome Pv02 near to the St locus, although complete co-segregation between these two loci was not observed. Moreover, when Gioia et al. (2013b) examined the association between polymorphisms in the PvIND sequence and the dehiscent/indehiscent phenotype in 105 wild and domesticated lines, they did not identify any SNPs that were significantly associated with pod shattering.

Studies have indicated that a convergent phenotypic response to selection due to domestication for a specific trait might not be produced by the same molecular mechanism, even in related species (Nanni et al. 2011; Doust et al. 2004).

2.5.2 Seed Dormancy

Seed dormancy is a plant strategy that allows the delay of seed germination to avoid seedling growth under unfavourable environmental conditions. Loss of seed dormancy is considered as a main trait in the domestication syndrome in common bean and other crop species because this is crucial for cultivation (Koinange et al. 1996). The reduction of dormancy in domesticated beans has ensured more rapid and simultaneous germination. Very few studies on the seed dormancy trait have been carried out, and when Koinange et al. (1996) investigated genetic control of the domestication syndrome in the common bean, they identified four unlinked QTLs on chromosomes Pv02, Pv03 and Pv04. These four QTLs cumulatively explained 69% of the total phenotypic variation for the seed dormancy trait.

2.5.3 Growth Habit

The bush growth type is a common feature that characterises the compact growth habit of the domesticated common bean, which includes no twining branches, few vegetative nodes and long internodes. Among the traits that contribute to this growth habit, the main descriptors of the plant architecture include the climbing ability, twining, number of nodes on the main stem, number of branches on the main stem, number of pods, internode length and plant height and width. Several differences between wild and domesticated common bean can be observed, and there is also variability between the domesticated genotypes. In addition, these traits have an impact on agricultural practices and on yield, and they can also be correlated with each other and subject to the effects of the environment. Indeed, the length of the main stem (i.e. the plant height) is related to the numbers and lengths of the internodes, while the internode length (which also changes along the main stem) depends mainly on the environmental effects and the growth stage of the plant (Debouck et al. 1986).

The common bean plant can be either determinate or indeterminate for its growth habit, which is defined by the characteristics of the terminal part of the stem and branches (Fernández et al. 1986). The common bean shows a wide range of phenotypic variability for this determinate/indeterminate trait that can be summarised as five main types (Evans 1973; Singh 1982; Debouck et al. 1986; Fernández et al. 1986; Leakey 1988):

Type I: determinate bush growth habit that shows reproductive terminal buds both on the main stem and on the branches (Singh 1982). After flowering, apical vegetative growth stops.

Type II: indeterminate bush growth habit with a vegetative bud on the apical part of both the main stem and the branches. After flowering, new leaves and vegetative nodes are produced.

Type III: indeterminate climber growth habit that is characterised by open branches and semi-prostrate habit. In some cases, these have moderate climbing ability.

Type IV: indeterminate climber growth habit with very long branches that show strong climbing and twining.

Type V: determinate climber growth habit that shows climbing and twining.
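The five classes above can be read as a simple decision rule on two observations: whether the terminal buds are reproductive (determinate) or vegetative (indeterminate), and how strongly the plant climbs. The following Python sketch illustrates this reading. The function name, the three-level climbing score and the assignment of 'moderate' climbing to Type III are illustrative assumptions and are not part of the published classification.

```python
def growth_habit_type(determinate: bool, climbing: str) -> str:
    """Return the growth habit class (Type I to Type V) for a plant.

    determinate: True if the terminal buds are reproductive.
    climbing: 'none', 'moderate' or 'strong' (hypothetical scoring scale).
    """
    if climbing not in ("none", "moderate", "strong"):
        raise ValueError(f"unknown climbing score: {climbing!r}")
    if determinate:
        # Type I is the determinate bush, Type V the determinate climber.
        return "Type I" if climbing == "none" else "Type V"
    if climbing == "none":
        return "Type II"   # indeterminate bush
    if climbing == "moderate":
        return "Type III"  # open branches, semi-prostrate habit
    return "Type IV"       # strong climbing and twining


print(growth_habit_type(True, "none"))     # Type I
print(growth_habit_type(False, "strong"))  # Type IV
```

In practice, the boundary between Types III and IV is a matter of degree, so a field scoring scale would need to be finer than the three levels assumed here.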

Wild common bean is indeterminate, and selection for the more compact growth habit (e.g. the bush habit) was one of the aims of the domestication process and breeding to promote earlier flowering and maturation (e.g. determinacy) in the crops compared to the wild ancestors. The inheritance of determinacy was investigated using a linkage mapping approach and segregation analysis in the RIL population of Midas × G12873 (Koinange et al. 1996), and a single locus (fin) that controlled determinacy was identified on chromosome Pv01.

Using a candidate gene approach on two RIL mapping populations, BAT93 × Jalo EEP 558 (Freyre et al. 1998) and Midas × G12873 (Koinange et al. 1996), Kwak et al. (2008) mapped PvTFL1y to chromosome Pv01. This sequence is homologous to Terminal Flower 1 (TFL1) of A. thaliana (Shannon and Meeks-Wagner 1991), a locus that controls the development of the terminal flower and acts as a repressor of flowering, with a role in inflorescence meristem identity. PvTFL1y co-segregates with the fin locus (i.e. determinacy). Foucher et al. (2003) identified PsTFL1a, a pea homologue of TFL1, as the gene controlling the determinacy phenotype in pea (Pisum sativum L.). More recently, Repinsky et al. (2012) confirmed the co-segregation of PvTFL1y and fin and validated the function of PvTFL1y as the functional homologue of TFL1. Indeed, Repinsky et al. (2012) detected a 32–133-fold decrease in expression between the indeterminate haplotype BAT93 and the determinate haplotype CDRK, and the reduction in expression was 20–91-fold for the determinate G00750 compared to BAT93. This finding confirmed the function of PvTFL1y as a flowering repressor, as it prevents the transition of the vegetative apical bud into a terminal flower. PvTFL1y sequences for some important reference lines of the common bean have been deposited with GenBank (Repinsky et al. 2012).

Moghaddam et al. (2016) used GWAS analysis on a panel of 280 domesticated common bean genotypes and confirmed the co-localisation on chromosome Pv01 of PvTFL1y (Phvul.001G189200) and the fin locus (determinacy) (Koinange et al. 1996; Kwak et al. 2008; Repinsky et al. 2012). When performing GWAS analysis with the exclusion of the determinate genotypes, they also detected significant associations with other QTLs on chromosomes Pv04, Pv06, Pv07 and Pv11, which were not detected using the entire collection.

Among the five growth habit classes, three include common beans that can climb and that grow erect if supports are provided. In the traditional intercropping system known as ‘milpa’, the common bean is cultivated in association with maize (Zea mays L.), which provides the support, and sometimes also with squash (Cucurbita spp.). Moreover, climbing is related to twining, another important feature that characterises the main stem of the common bean.

For the climbing ability, Checa and Blair (2008) identified seven QTLs using the RIL population G2333 × G19839. G2333 is an indeterminate climbing (Type IV) Mesoamerican landrace, while G19839 is an indeterminate bush (Type II) Andean landrace. Among these seven QTLs, one (Cab 1-1) was detected on chromosome Pv04 in two field trials at 45 days after planting, in the same genomic region in which a further QTL (Cab 2-1) was associated with the climbing ability at two sowing locations at 75 days after planting. The other five QTLs were mapped to chromosomes Pv04 (Cab 1-2; in the same genomic region where QTLs for plant height and internode length were also located), Pv05 (Cab 1-3), Pv07 (Cab 1-4; near the Phs locus, i.e. the phaseolin gene), Pv10 (Cab 1-5) and Pv11 (Cab 1-6).

In their investigations into the genetic control of the twining predisposition using the Midas (non-twining) × G12873 (twining) RIL population (hereafter, the MG RIL population), Koinange et al. (1996) mapped this trait (Tor) in the same region as fin (i.e. chromosome Pv01), the locus for determinacy. As a possible explanation, they proposed that the fin locus might have a pleiotropic effect on both determinacy and twining, or that the genes responsible for these two traits are tightly linked in the MG RIL population.

Among the growth habit traits, the number of nodes and the number of branches on the main stem are considered to be reliable descriptors for the common bean phenotypic architecture. Domestication caused a reduction in the number of nodes in the main stem, and for this trait, three QTLs were defined by Koinange et al. (1996). One of these, on chromosome Pv01, was linked to the fin locus for determinacy in the same genomic region where QTLs related to earliness and number of pods were mapped. Similarly, for the two other QTLs, which were mapped in the same regions where QTLs for earliness traits, yield components and plant architecture were identified, one was located on chromosome Pv01, tightly linked to the Ppd locus for the photoperiod sensitivity, and the other on chromosome Pv08.

Tar’an et al. (2002) studied 142 F2:4 individuals derived from a cross between two inbred lines: OAC Seaforth (determinate bush; Type I) and OAC 95-4 (indeterminate bush; Type II). They mapped one QTL for the number of nodes of the main stem on chromosome Pv10. They also identified a QTL on chromosome Pv04, for the number of branches on the main stem, as did Checa and Blair (2008), who mapped a QTL for the number of branches (Brn1) in the same region. Moreover, Brn1 was located close to other QTLs for plant architecture traits, such as climbing ability, internode length and plant height (Checa and Blair 2008).

For the common bean, the domesticated growth habit is generally characterised by a lower number of pods per plant, on the main stem and branches. The parental lines of the MG RIL population (Koinange et al. 1996) are representative for this trait. Indeed, the domesticated parent Midas produced a mean of 13.9 pods per plant, while the wild parent G12873 gave 43.2 pods per plant. Using the molecular linkage mapping approach for the number of pods, six QTLs were mapped in two different populations derived from crosses between wild and domesticated parental lines: the MG RIL population (Koinange et al. 1996) and a backcross BC2F3:5 population derived from a cross between ICA Cerinza (cultivated, Type I) and G24404 (wild, Type IV) (Blair et al. 2006). Two QTLs were mapped, one on chromosome Pv01 linked to the fin locus, and one on chromosome Pv08 in the same region where QTLs for the number of nodes on the main stem and the earliness traits were identified (Koinange et al. 1996). The other four QTLs were mapped on Pv04 (D14 on the Midas × G12873 linkage map; Koinange et al. 1996) and on chromosomes Pv07 (Pp7.2), Pv09 (Pp9.2) and Pv11 (Pp11.3) (Blair et al. 2006). Using the same approach on domesticated individuals, Tar’an et al. (2002) identified a further QTL for the number of pods (PPP) on chromosome Pv04, in a region where QTLs for numbers of branches (Tar’an et al. 2002; Checa and Blair 2008), climbing ability, plant height and internode length (Checa and Blair 2008) have been identified. More recently, Kamfwa et al. (2015) used a GWAS approach on 237 Andean domesticated individuals (i.e. varieties, elite lines, landraces) genotyped with the Illumina BARCBean6K_3 BeadChip. They searched for genomic regions that were associated with important agronomic traits that are also related to the domestication process, and they identified two significant QTLs for the number of pods on chromosomes Pv05 and Pv07.

As indicated above, internode length is a growth habit trait that can show high phenotypic variability due to both the growth stage of the plant and environmental effects. In general, domestication selected individuals with lower numbers of vegetative nodes on the main stem, although with longer internodes (i.e. the stem portions between neighbouring nodes). Using the molecular linkage mapping approach for the internode length, a QTL (L5) was mapped to chromosome Pv01 by Koinange et al. (1996), while Checa and Blair (2008) worked on the G2333 × G19839 RIL population and mapped four QTLs, one on chromosome Pv03 (Int1) and three on chromosome Pv04 (Int2, Int3, Int4). Interestingly, most of the QTLs associated with plant architecture were mapped to chromosome Pv01 by Koinange et al. (1996), whereas Checa and Blair (2008), using domesticated individuals, observed the co-localisation of QTLs for plant architecture traits on chromosome Pv04. To dissect out the plant architecture components, Blair et al. (2006) searched for QTLs for plant height and identified four, one on chromosome Pv01, two on chromosome Pv06 and one on chromosome Pv07. Working on a population derived from a cross between wild and domesticated lines, Blair et al. (2006) also mapped three QTLs for plant width to chromosomes Pv06 (two QTLs) and Pv07. Interestingly, the QTL for plant height on chromosome Pv07 (ph7.1) was close to the Phs (phaseolin) locus, and it fell in the same region where Checa and Blair (2008) mapped the QTL Cab 1-4 for climbing ability. Using the same approaches on domesticated materials, a QTL for plant height was mapped to chromosome Pv07 by Tar’an et al. (2002); this QTL mapped near to PvTFL1z (Kwak et al. 2008), another homologue of Terminal Flower 1 (TFL1), which controls the indeterminate phenotype in A. thaliana.

2.5.4 Gigantism

During the domestication process, humans propagated the individuals that showed desirable features for consumption, and in general the selection was related to all of the usable parts of the plants, such as the edible parts, which can differ among species. In legumes such as the common bean, artificial selection favoured large pods and seeds. Indeed, one of the most obvious differences between wild and domesticated beans lies in the shapes, sizes and weights of the fruits, which together constitute the trait known as ‘gigantism’.

Pod length and pod weight are two traits that are considerably important in terms of yield. Three QTLs related to pod length were mapped by Koinange et al. (1996) using linkage mapping approaches, on chromosomes Pv02, Pv07 and Pv11 (as D1b, in Freyre et al. 1998). The QTL on chromosome Pv11 was the most significant, as this explained 23% of the total phenotypic variance, while the three QTLs together explained 37% of the total phenotypic variance for pod length.

More recently, three QTLs that mapped on chromosome Pv08 were associated with pod weight, using GWAS analysis on Andean domesticated individuals (Kamfwa et al. 2015). The SNP showing the highest association with pod weight (i.e. ss715639408; P = 4.3 × 10⁻⁸; position 5150618) was also associated with plant biomass (as g/plant). Indeed, the trait of pod weight contributes to the biomass, and the significant genetic association between these two traits might depend on the high phenotypic correlation between them (r = 0.87; P = 0.001; Kamfwa et al. 2015). Moreover, and of particular interest, two significant SNPs on chromosome Pv08 (ss715639408, ss715649359) were common to the traits of pod weight and yield/plant (as g seed/plant). Thus, Kamfwa et al. (2015) suggested that the same gene might have pleiotropic effects on these two traits, or that two distinct genes that reside on the same linkage disequilibrium (LD) block are associated with the same SNP.

For seed traits, 100-seed weight has the main role among the gigantism features and, together with seed length, seed height and seed width, characterises the domestication process in common bean, mainly in terms of the changes in seed shape and size. One-hundred seed weight is one of the main traits that affect the yield, and there is wide variability for this trait among the domesticated individuals. Many efforts have been made to understand the genetic control of this quantitative trait. Four significant QTLs were mapped on chromosomes Pv01, Pv07 and Pv11 on the Midas × G12873 molecular linkage map (Koinange et al. 1996). The QTLs with the most significant effects were those on chromosomes Pv01 and Pv07 (P = 0.001), the latter in proximity to the Phs locus for the phaseolin gene; they explained 18% and 27% of the total observed phenotypic variance, respectively. Using similar QTL mapping approaches, other studies identified several QTLs for 100-seed weight on many of the common bean chromosomes (i.e. Pv02, Pv03, Pv04, Pv06, Pv07, Pv08, Pv09, Pv10, Pv11), both when wild and domesticated (Blair et al. 2006) or only domesticated (Tar’an et al. 2002; Pérez-Vega et al. 2010) accessions were compared. One of the QTLs on chromosome Pv07 (sw7.1) was linked to the phaseolin locus (Blair et al. 2006), in agreement with previous observations (Koinange et al. 1996). A QTL (SW6) identified on chromosome Pv06 (Pérez-Vega et al. 2010) mapped near QTLs for seed length (SL6) and seed height (SH6), in the same location as QTL SW6.1 (Blair et al. 2006). Moreover, Pérez-Vega et al. (2010) identified additional QTLs for seed weight (SW8.1 and SW8.2) on chromosome Pv08, where QTLs for seed height (SH8) and seed length (SL8) were also detected. Cumulatively, these three QTLs for seed weight explained 54% of the total phenotypic variance, with high correlation both between 100-seed weight and seed length (r = 0.83; P < 0.05), and between seed weight and seed height (r = 0.74; P < 0.05). Further, QTLs for seed size-related traits (i.e. length, width) were also identified on chromosomes Pv02, Pv03, Pv06, Pv07 and Pv10.

More recently, Schmutz et al. (2014) performed diversity and population differentiation analyses on a set of wild and landrace individuals, and they identified different genomic regions as putatively under selection (i.e. 1835 Mesoamerican, 748 Andean physically mapped candidate genes for the domestication process). Among these genes, they highlighted Phvul.008G168000 (Mesoamerican candidate gene) that encodes nitrate reductase, an enzyme that has an important role in plant nitrogen assimilation and seed growth. Interestingly, this gene was physically mapped to chromosome Pv08, near SW8.2, the QTL that was previously associated with 100-seed weight (Pérez-Vega et al. 2010).

To validate the Mesoamerican candidate genes that were under selection and to discover the genetic architecture of the seed weight trait, Schmutz et al. (2014) further examined their candidate genes for seed weight using a GWAS approach on a set of 271 modern common bean varieties from the Mesoamerican gene pool. Three genes were confirmed by GWAS among the 15 candidate genes previously shown putatively under selection in relation to seed weight. The GWAS approach also placed several domestication candidates related to seed weight, which showed extensive linkage disequilibrium, on one sweep window on chromosome Pv07 (Schmutz et al. 2014).

These data show that searching for candidate genes in genomic regions under selection can be a reliable approach, and GWAS analysis represents a powerful tool for validation of candidate genes, especially when the possibility to compare genetic and physical maps can be exploited.

Bellucci et al. (2014a) used RNA-seq techniques and identified 2364 transcripts (representing ~9% of the total transcriptome) that showed signatures of selection between wild and domesticated Mesoamerican accessions. They focussed on the transcripts with a higher selection index and analysed the functions of these genes while searching for homologies with genes related to the domestication process in different species. An interesting example related to fruit size was a contig homologous to YABBY5 (YAB5) that was putatively under selection (Bellucci et al. 2014a). YABBY5 is a transcription factor implicated in the regulation of seed shattering in cereal species, including sorghum (Sorghum bicolor), rice and maize (Lin et al. 2012). A YAB-like transcription factor (FASCIATED), which increases the number of locules, was also associated with the control of carpel number and fruit development in tomato (Cong et al. 2008). Indeed, the process that leads to larger fruit in the domesticated races consists of two main aspects: increased cell division and a greater number of organs in the fruit (Fig. 2.2).

Fig. 2.2

FASCIATED (fas): an example of a domestication gene related to gigantism, a homologue of a contig that is putatively under selection that was identified by Bellucci et al. (2014a). Locule number evolution during tomato domestication, from S. pimpinellifolium (two locules), which is considered as the wild ancestor of tomato, to the cultivated S. lycopersicum (as described in Muños et al. 2011)

2.5.5 Earliness

A common feature among crops is that they flower and mature earlier than their wild relatives, which represents a key factor in domestication, also considering that simultaneous flowering can guarantee simultaneous maturation and harvesting. The number of days to flowering describes the number of days between planting and flowering, while the days to maturity corresponds to the days needed for the maturation of the pods. Several studies have therefore been dedicated to these phenological traits. Ten QTLs associated with days to flowering were detected across chromosomes Pv01 (Koinange et al. 1996; Blair et al. 2006), Pv02, Pv06, Pv09, Pv11 (Blair et al. 2006) and Pv08 (Koinange et al. 1996), using a molecular linkage mapping approach with populations derived from a cross between wild and domesticated genotypes. Interestingly, one QTL on chromosome Pv01 (Koinange et al. 1996) explained 38% of the total phenotypic variance for flowering time. At the same time, it was close to fin, a gene for determinacy, and co-localised with QTLs for the number of nodes on the main stem, the number of pods and the days to maturity. Likewise, other QTLs were detected on the same chromosomes in populations derived from a cross between domesticated individuals; indeed, Pérez-Vega et al. (2010) mapped three QTLs on chromosomes Pv01, Pv02 and Pv08. Interestingly, in this population, the QTL on chromosome Pv01 was the closest to the fin locus, confirming the observation of Koinange et al. (1996), both regarding the QTL position and the co-localisation of genes for determinacy and phenology traits.

Further interesting examples of co-localisation between genes for earliness and other domestication traits were reported by Koinange et al. (1996), who mapped the number of days to flowering and the sensitivity to photoperiod (Ppd locus) in the same genomic region on chromosome Pv01. Moreover, Blair et al. (2006) mapped two QTLs for days to flowering to chromosome Pv06, close to V, a flower colour locus (Nodari et al. 1993; McClean et al. 2002), and PvSHP1, a gene homologous to Shatterproof, which controls seed shattering in A. thaliana (Nanni et al. 2011). However, the most significant QTLs reported by Blair et al. (2006) were those on chromosome Pv09 (df9.1, df9.2), which suggested a parallelism with the observation of Tar’an et al. (2002). Indeed, using a population that was derived from a cross between two domesticated lines, Tar’an et al. (2002) identified a significant QTL for days to flowering on chromosome Pv09, close to the growth habit locus (GH) that controls the determinate/indeterminate phenotype in this population. They thus provided evidence of the co-localisation of genes for determinacy and phenology traits, although on a different chromosome to Koinange et al. (1996) and Pérez-Vega et al. (2010), who co-mapped days to flowering and determinacy on chromosome Pv01. Through the GWAS approach and by analysing domesticated genotypes, significant QTLs for days to flowering were detected on chromosome Pv01 by Kamfwa et al. (2015) and Moghaddam et al. (2016), which confirmed the observations of Koinange et al. (1996), Blair et al. (2006) and Pérez-Vega et al. (2010). Other QTLs have also been found on chromosome Pv08 (Kamfwa et al. 2015), as previously reported (Koinange et al. 1996; Pérez-Vega et al. 2010). Considering the days to maturity, two QTLs were reported on chromosome Pv01 (Koinange et al. 1996), one on each of chromosomes Pv05 and Pv07 (Blair et al. 2006) and one on chromosome Pv08 (Koinange et al. 1996).

The co-localisation observed between QTLs for days to flowering and days to maturity on chromosomes Pv01 and Pv08 by Koinange et al. (1996) is of particular interest. One QTL for days to maturity on chromosome Pv01 (Koinange et al. 1996) mapped close to the fin locus for determinacy, and it explained 30% of the total phenotypic variance. In contrast, Blair et al. (2006) did not find QTLs for days to maturity on chromosome Pv01, and the QTLs that they identified for this trait were on different chromosomes from those reported for days to flowering.

Considering studies conducted only on domesticated genotypes, further QTLs were found on chromosomes Pv01, Pv02 and Pv06 (Pérez-Vega et al. 2010), and on chromosomes Pv09 and Pv10 (Tar’an et al. 2002). Likewise, in the study by Pérez-Vega et al. (2010), days to flowering and days to maturity co-mapped on chromosome Pv01, close to the fin locus, and on chromosome Pv02. This was also observed by Tar’an et al. (2002) on chromosome Pv09, which suggested that neighbouring genes control these two traits, or that pleiotropic effects might be involved in the genetic control of days to flowering and days to maturity.

Recent results provided by GWAS analysis confirmed a significant QTL for days to maturity on chromosome Pv01 (Kamfwa et al. 2015). In this case, the significant SNP reported for days to maturity (ss715646578; SNP position 48340819) was also significant for days to flowering, which confirmed that these two traits co-map in populations with different genetic backgrounds. In addition to the genetic association between these two traits, significant phenotypic correlation was detected between days to flowering and days to maturity in different populations (r = 0.64, Tar’an et al. 2002; r = 0.67, Pérez-Vega et al. 2010; r = 0.70, Kamfwa et al. 2015). Bellucci et al. (2014a) used RNA-seq on a set of wild and domesticated accessions, and among the transcripts that showed selection signatures, they identified a homologue of the Vernalisation genes of Arabidopsis, which are involved in the vernalisation pathway to promote flowering in A. thaliana. These genes have a crucial function; indeed, a long cold period can be needed to promote flowering in many species, and some plants bloom only after winter. The Arabidopsis Vernalisation genes act by repression of FLC, which is a floral repressor: after a long period of cold, FLC mRNA decreases due to the increased expression of genes such as VRN1, which thus indirectly control the flowering time. Moreover, Schmutz et al. (2014) reported several genomic regions that were associated with the domestication process, and among these, they identified homologous genes for VRN1 (Phvul.003G033400) on chromosome Pv03, and VRN2 (Phvul.002G000500) on chromosome Pv02, as a Mesoamerican candidate gene.

Also, in this case, these studies on earliness demonstrate that searching for genes associated with domestication and looking for selection signals across the genome between wild and domesticated individuals appears to be a very promising approach.

2.5.6 Photoperiod Sensitivity

The response to photoperiod is a trait that is strictly dependent on the environment in which the species originated, evolved and adapted, and it regulates flowering time with respect to day length. Long-day plants bloom when the length of the day tends to increase, with >12 h of daylight, while short-day species flower when the length of the day is <12 h. Because of this sensitivity to the photoperiod, a species cannot be cultivated at all latitudes, unless it is day-neutral or indifferent to the photoperiod. While domesticated individuals have become insensitive to the photoperiod, individuals introduced into areas to which they are not adapted bloom later or do not bloom at all. In common bean, this trait was measured as the delay in flowering under a day length of 16 h, compared to 12 h. This descriptor is appropriate to detect the delay in flowering in wild individuals that flower only under short days, compared to domesticated individuals in which no delay in flowering has been highlighted under long days, as observed by Koinange et al. (1996). A molecular linkage mapping approach applied to the RIL population of Midas × G12873 detected two QTLs for photoperiod sensitivity, on chromosomes Pv01 and Pv11 (Koinange et al. 1996). The QTL on chromosome Pv01 mapped to the same region as the QTLs for number of nodes on the main stem, 100-seed weight, days to flowering and days to maturity (Koinange et al. 1996). This QTL corresponded to the Ppd locus for photoperiod sensitivity (Wallace et al. 1993), was linked to the fin locus for determinacy (Koinange et al. 1996), and explained 44% of the total phenotypic variance. The QTL mapped to chromosome Pv11 was close to the marker D1479 and explained 17% of the total phenotypic variance.
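The descriptor used for this trait, the delay in flowering under 16 h days compared to 12 h days, amounts to a simple difference in days to flowering. A minimal Python sketch is given here for illustration. The function name, the use of None for plants that never flower under long days, and the example values are assumptions made for this sketch and are not taken from Koinange et al. (1996).

```python
from typing import Optional


def photoperiod_delay(dtf_12h: float, dtf_16h: Optional[float]) -> Optional[float]:
    """Delay in flowering (days) under 16 h days relative to 12 h days.

    dtf_12h: days to flowering under a 12 h day length.
    dtf_16h: days to flowering under a 16 h day length, or None if the
             plant did not flower at all under long days (assumed convention).
    """
    if dtf_16h is None:
        # No flowering under long days: the delay cannot be quantified.
        return None
    return dtf_16h - dtf_12h


# Hypothetical values, for illustration only.
print(photoperiod_delay(40, 40))    # photoperiod-insensitive plant: 0
print(photoperiod_delay(45, 80))    # photoperiod-sensitive plant: 35
print(photoperiod_delay(45, None))  # plant that never flowers under 16 h: None
```

A delay close to zero corresponds to the photoperiod-insensitive behaviour described for domesticated individuals, while a large or undefined delay corresponds to wild individuals that flower only under short days.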

More recently, Bellucci et al. (2014a) used RNA-seq technology for a genome-wide analysis and reported several candidate genes related to the photoperiod response. These genes appear to be related to domestication because the authors specifically investigated the function of the genes putatively under selection during domestication. Among these, a sequence homologous to GIGANTEA (GI) was found to be under selection. In A. thaliana, this gene has an important role in flowering and in the regulation of other genes, such as CONSTANS (CO) and FLOWERING LOCUS T (FT), to induce flowering under long days. Interestingly, CO and FT (the target genes of GI; Fig. 2.3) were reported as targets of selection during domestication in rice and sunflower (Blackman et al. 2011; Takahashi and Shimamoto 2011; Wu et al. 2013), and FT is the target gene of the floral repressor FLC, which in turn is repressed by the Arabidopsis Vernalisation genes, which were also found to be under selection in Bellucci et al. (2014a).
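The regulatory relationships described in this paragraph can be summarised as a small signed network, in which the net effect of one gene on another along a path is the product of the signs of the individual steps. The Python sketch below is an illustration only. The node label 'VRN' (standing for the Vernalisation genes as a group), the coding of the GI relationships as activation, and the function name are assumptions made for this sketch.

```python
# Signed regulatory edges as described in the text:
# +1 = activation (assumed for the GI targets), -1 = repression.
EDGES = {
    ("GI", "CO"): +1,    # GI regulates CO to induce flowering
    ("GI", "FT"): +1,    # GI regulates FT to induce flowering
    ("FLC", "FT"): -1,   # FLC is a floral repressor acting on FT
    ("VRN", "FLC"): -1,  # the Vernalisation genes repress FLC
}


def net_effect(path):
    """Multiply the edge signs along a path of gene names."""
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= EDGES[(source, target)]
    return sign


# Repressing a repressor gives a net positive effect on FT.
print(net_effect(["VRN", "FLC", "FT"]))  # 1
print(net_effect(["FLC", "FT"]))         # -1
```

This makes explicit why the Vernalisation genes promote flowering even though each of the two steps in the path is a repression.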

Fig. 2.3

GIGANTEA: an example of a domestication gene related to photoperiod sensitivity, reported by Bellucci et al. (2014a) as among the genes with a high selection coefficient. GIGANTEA (GI; red) acts upstream of the CO (CONSTANS; rice orthologue Hd1) and FT (FLOWERING LOCUS T) genes in the flowering pathway, to induce flowering under long-day (Arabidopsis) and short-day (Oryza sativa) conditions. In addition, CO controls the responses of the downstream genes, AP1 (APETALA 1) in Arabidopsis and Hd3a in rice (an Arabidopsis FT homologue). Also, a homologue of AP1, Vrn1 (red asterisk), was found among the genes under selection in Bellucci et al. (2014a). The numbers indicate homologous genes that have been reported as targets of selection in other crop species: 1. rapeseed BnFLC.A10; 2. wheat Vrn1; 3. wheat Vrn2; 4. lentil SN; 5. maize ZmCCT; 6. pea HR; 7. sunflower HaFT1

2.5.7 Productivity

Artificial selection led towards a reduction in the number of pods and seeds in common bean, although without reducing yield, which increased in the cultivated plants. Among the domestication syndrome traits, the harvest index is a widely used descriptor to quantify plant productivity, as the measure of the ratio between seed yield and plant biomass. However, yield depends on several factors, like the size and number of seeds per plant, and it can be measured as yield/plant and yield/surface area. Using the molecular linkage mapping approach, two significant QTLs for harvest index were found on chromosomes Pv01 and Pv08, in the same region where QTLs for 100-seed weight (Pv01) and number of pods (Pv08) were detected (Koinange et al. 1996). Working on domesticated individuals, Tar’an et al. (2002) used a linkage mapping approach to map one QTL on chromosome Pv06, in contrast to Kamfwa et al. (2015), who identified two significant SNPs on chromosome Pv03 using a GWAS approach. The different genetic structures of the populations used in these studies might explain these divergent data, which confirms that yield has wide variability between wild and domesticated individuals and among domesticated varieties.

When dissecting yield into its component parts, other QTLs were detected in different studies: three QTLs were found for seeds per plant, one on chromosome Pv06 and two on chromosome Pv07 (Blair et al. 2006), while Kamfwa et al. (2015) used a GWAS approach to identify two QTLs, on chromosomes Pv03 and Pv05. For the trait of seed yield (kg/ha), nine QTLs were detected, one on chromosome Pv02, two on chromosome Pv03, four on chromosome Pv04 and two on chromosome Pv09 (Blair et al. 2006). Similarly, Kamfwa et al. (2015) used a GWAS approach to identify two QTLs for seed yield, on chromosomes Pv03 and Pv09, and Tar’an et al. (2002) mapped three QTLs for the same trait, one of which was on chromosome Pv09. Moreover, Kamfwa et al. (2015) detected a significant SNP for the trait of yield per plant on chromosome Pv09, and two QTLs on chromosome Pv08, one of which (ss715639408; position Ch8:5150618) was also significant for pod weight and plant biomass. Using different approaches and populations with different genetic backgrounds, the major components of yield have therefore been identified on chromosomes Pv01 (Koinange et al. 1996), Pv03 (Blair et al. 2006; Kamfwa et al. 2015), Pv08 (Koinange et al. 1996; Kamfwa et al. 2015) and Pv09 (Tar’an et al. 2002; Blair et al. 2006; Kamfwa et al. 2015). The pod harvest index is a further component of yield, which is measured as the ratio between seed weight and weight of the fertile pods. This was mapped by Kamfwa et al. (2015) to chromosome Pv04 using a GWAS approach on a set of domesticated accessions.
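The two ratio descriptors defined in this section, the harvest index (seed yield over plant biomass) and the pod harvest index (seed weight over the weight of the fertile pods), can be written out as follows. This is a minimal Python sketch. The function and parameter names, the use of grams per plant and the example values are assumptions made for illustration, and some studies report these indices as percentages rather than as fractions.

```python
def harvest_index(seed_yield_g: float, plant_biomass_g: float) -> float:
    """Ratio between seed yield and plant biomass (both in g/plant)."""
    if plant_biomass_g <= 0:
        raise ValueError("plant biomass must be positive")
    return seed_yield_g / plant_biomass_g


def pod_harvest_index(seed_weight_g: float, fertile_pod_weight_g: float) -> float:
    """Ratio between seed weight and the weight of the fertile pods."""
    if fertile_pod_weight_g <= 0:
        raise ValueError("fertile pod weight must be positive")
    return seed_weight_g / fertile_pod_weight_g


# Hypothetical measurements for a single plant, for illustration only.
print(harvest_index(30.0, 60.0))      # 0.5
print(pod_harvest_index(15.0, 20.0))  # 0.75
```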

From these studies, it has emerged that the increased productivity observed in the domesticated individuals compared with their wild progenitors is due to several traits. These traits might in turn be under the control of multiple genes, which suggests both a relationship between the yield components and complex genetic control for the harvest index.

2.5.8 Seed and Pod Pigmentation

Although the selection process is generally correlated with a reduction in genetic diversity at target loci going from the wild to the domesticated individuals, domestication has led to an increase in the phenotypic variability for some domestication-related traits (Bellucci et al. 2014a). In the case of the domestication of beans, human selection has led to seeds and pods with a wide range of colours and colour patterns. This process has increased the diversification between the wild and domesticated individuals, and it has also contributed to the diversification and increased variability within the domesticated forms. As an example, in the commercial varieties, the two domesticated lines from which Pérez-Vega et al. (2010) developed their mapping population, Xana and Cornell 49242, showed marked differences for seed traits. Xana is described as a white and large-seeded line, while Cornell 49242 has small and black seeds. Also Midas and G12873, the domesticated and wild parental lines of the MG RIL population used by Koinange et al. (1996), are very different in terms of seed colours. Midas is a domesticated snap bean that has white seeds, while G12873 is a wild Mesoamerican accession that has pigmented seeds, with an agouti colour.

A locus for seed colour, P, was identified on chromosome Pv07 in proximity to the Phs locus (phaseolin) and in a region near the QTL for 100-seed weight (Koinange et al. 1996). The y locus for pod colour (i.e. green vs yellow) was mapped to chromosome Pv02. These traits were treated as qualitative, under the hypothesis that seed colour and fruit colour are each controlled by a single gene. Later, McClean et al. (2002) also investigated the genetic basis of the seed colour patterns of the common bean, and they phenotypically mapped several loci that had previously been identified as associated with seed colour. They also developed molecular markers (i.e. RAPD, STS) that were associated with these genes and placed them on the core linkage map (Freyre et al. 1998). The loci G, V, C and Gy for seed colour were located on chromosomes Pv04, Pv06 and Pv08, while for the seed coat pattern, T, Bip, Ana, J and Z were mapped to chromosomes Pv03, Pv09 and Pv10. All of these genes interact with each other to determine the wide range of seed colours and colour patterns, while the P gene controls the presence or absence of pigmentation. In more detail, a dominant allele at the P locus (PP, P-) determines the presence of colour in the flowers and seeds (Emerson 1909), while the recessive genotype, pp, results in white flowers and seeds, as for the domesticated accession ‘Midas’ in Koinange et al. (1996).
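The dominance relationship at the P locus described above can be sketched as a simple rule. This is only an illustrative simplification: the function name and the two-letter genotype strings are assumptions, and the sketch deliberately ignores the other colour and pattern loci (G, V, C, Gy, T, Bip, Ana, J, Z), which determine the actual colour once pigment is present.

```python
def seed_is_pigmented(p_genotype: str) -> bool:
    """Return True if at least one dominant P allele is present.

    Assumption: the genotype is written as two letters, 'P' for the
    dominant allele and 'p' for the recessive allele (PP, Pp, pP or pp).
    The 'P-' notation used in the text corresponds to PP or Pp here.
    """
    alleles = p_genotype.strip()
    if len(alleles) != 2 or any(allele not in "Pp" for allele in alleles):
        raise ValueError("expected a genotype such as 'PP', 'Pp' or 'pp'")
    return "P" in alleles


# The recessive genotype pp gives white flowers and seeds, as described
# for 'Midas'; a single dominant allele is enough for pigmentation.
white_seeded = not seed_is_pigmented("pp")
heterozygote_pigmented = seed_is_pigmented("Pp")
```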

2.6 Conclusions

With the release of the reference genomes of both the Mesoamerican and Andean genotypes, a new era of genetic and genomic studies has begun for the common bean and the other Phaseolus spp. More insight into common bean genomics can be achieved by exploiting the reference genome sequences, and the new tools derived from them, to focus on the major phenotypic changes that occurred during domestication and the successive episodes of improvement, including modern plant breeding. Identification of the molecular basis of the domestication syndrome would also be a major step towards understanding the evolutionary processes involved, and would provide useful lessons for the breeding of novel varieties. Along with the other Phaseolus spp., the common bean remains an ideal model to study the molecular implications of the convergent phenotypic evolution that occurred under domestication, because of the multiple independent domestication events that took place between and within species of this genus. Similarly, analysis of the evolution after domestication, in terms of the introduction of the common bean into Europe (BEAN_ADAPT project, www.beanadapt.org), continues to offer novel opportunities to dissect the genetic architecture of environmental adaptation in crop species.