1 Introduction

Maize (Zea mays L.) is currently produced on nearly 100 million hectares in 125 developing countries and is among the three most widely grown crops in 75 of those countries (FAOSTAT 2010). Between now and 2050, the demand for maize in the developing world will double, and by 2025, maize production is expected to be the highest of any crop globally, especially in the developing countries (Rosegrant et al. 2009). Yet, maize yields in many developing countries are severely limited by an array of abiotic and biotic stresses, besides other factors. Production may not be able to meet the demand without strong technological and policy interventions (Shiferaw et al. 2011). Uncontrolled area expansion cannot be a solution, as it could threaten fragile natural resources, including forests and hill slopes, in the developing world.

Another important challenge that threatens the long-term production growth of maize is the changing global climate (Cairns et al. 2012). Climate change scenarios show that agricultural production will largely be negatively affected, which will impede the ability of many regions to achieve the necessary gains for future food security (Lobell et al. 2008). The diversity of several important crops, including maize, spread across the world is threatened by rapid urbanization and habitat erosion as well as by unpredictable and extreme climatic events, including the increasing frequency of drought, heat and flooding. Concerted and intensive efforts are required to develop climate-change-resilient maize cultivars while accelerating yield growth, without which the outcome will be hunger and food insecurity for millions of poor consumers of maize.

Maize has enormous genetic diversity that offers incredible opportunities for genetic enhancement despite the challenges mentioned above. There is no lack of favourable alleles in the global maize germplasm that contribute to higher yield, abiotic stress tolerance, disease resistance or nutritional quality improvement. However, these desirable alleles are often scattered over a wide array of landraces or populations. Our ability to broaden the genetic base of maize and to breed climate-resilient and high-yielding cultivars adaptable to diverse agro-ecologies where maize is grown will undoubtedly depend on the efficient and rapid discovery and introgression of novel/favourable alleles and haplotypes. The purpose of this article is to highlight the enormous genetic diversity in maize, especially in the landraces and the wild relative, teosinte, and the need for novel and systematic initiatives to understand and utilize the genetic diversity.

2 Maize landraces: From Mexico to the world over

Maize (Zea mays ssp. mays) was domesticated from its wild ancestor, teosinte (Zea mays ssp. parviglumis), about 9000 years ago. This domestication event took place in the mid-elevations (~1500 m above sea level) of South Central Mexico, and occurred once, starting with the teosinte race Balsas (Matsuoka et al. 2002). Maize then followed a very complicated pattern of introduction to different continents, including North and South America, Europe, Africa and Asia (Rebourg et al. 2003; Dubreuil et al. 2006; Marilyn Warburton, personal communication). Most of these introductions happened several centuries ago, and farmers have selected maize landraces with better adaptability to the new environments, leading to several new derivatives in the process. For example, maize was introduced in Africa nearly five centuries ago (McCann 2005). Since then, the crop has expanded in its range from the lowlands to the highlands, and has become the number one crop in the continent in terms of cultivated area and total grain production (FAOSTAT 2010).

A ‘landrace’ may be defined as ‘…a dynamic population(s) of a cultivated plant that has historical origin, distinct identity, and lacks formal crop improvement, as well as often being genetically diverse, locally adapted, and associated with traditional farming systems’ (Camacho-Villa et al. 2005). The maize landraces are usually genetically heterogeneous populations (each such population comprising a mixture of genotypes), and are typically selected by farmers for better adaptation to specific environments, prolificacy, flowering behaviour, yield, nutritive value and resistance to biotic and abiotic stresses. A maize landrace is mostly defined by the farmer in terms of ear characteristics; the ear type is usually maintained by the farmers through conservative selection in spite of considerable gene flow (Louette et al. 1997; Louette and Smale 2000). In addition to farmers’ management of maize landraces (e.g. sample size, selection decisions), the biology of the species (e.g. cross-pollination in the case of maize) also plays a major role in structuring the maize populations (Pressoir and Berthaud 2004). Mutations could introduce novel variation; for example, Tuxpeño Sequía and Tuxpeño Crema are two different sub-populations derived from a Mexican landrace (Tuxpeño), with significant variation in maturity and other agronomic traits. Similarly, Olotillo Amarillo and Olotillo Blanco are two different versions (with yellow and white kernel colour, respectively) of the same landrace, with a mutation in the gene Y1 (= Psy1) that codes for phytoene synthase. Genetic drift could also affect neutral allele frequencies, especially in small populations, as revealed by an analysis of maize landraces in the central valleys of Oaxaca province in Mexico (Pressoir and Berthaud 2004).

3 Phenotypic variability and agronomic value of maize landraces

Several unique maize landraces are prevalent in different regions of the world, but particularly noteworthy are the landraces still grown by farmers in Mexico, most notably in Chiapas, Chihuahua, Durango, Guanajuato, Guerrero, Jalisco, Oaxaca and Puebla (Wellhausen et al. 1952). Some prominent examples of landraces that hold great relevance to maize farmers as well as to the scientific community are depicted in figure 1 and described below.

Figure 1

Diversity in some Mexican maize landraces conserved in the CIMMYT Gene Bank (Courtesy: Genetics Resources Program, CIMMYT).

The Tuxpeño maize, domesticated in the Oaxaca-Chiapas region (Kato 1988), is a highly productive lowland race that is well suited to fertile soils and has been widely used in maize improvement programmes. Tuxpeño Sequía is an early-maturing and drought-tolerant sub-population of the Tuxpeño landrace. Tuxpeño Crema is a different sub-population of Tuxpeño. Though relatively late maturing, it is resistant to tropical foliar diseases and has white kernels, excellent stalk strength and relatively short plant stature (Rodriguez et al. 1998).

Bolita, a landrace with drought tolerance and good tortilla making properties, is considered to have originated from the Tehuacan valley of Puebla. Olotillo is the most important local race in the Central Depression of Chiapas and shows good performance on poor or unfertilized soils (Benz 1987). Two varieties of this race are cultivated according to the kernel colour – Olotillo Amarillo and Olotillo Blanco, with yellow and white kernels, respectively.

The town of Jala, in the state of Nayarit in Mexico, has been traditionally known for its unique, giant maize landrace, called ‘Jala’ (Kempton 1924). Jala is extremely tall (up to 5 m) and bears very long ears (up to 45 cm). Jala has been the target of a campaign to promote on-farm conservation (Listman and Estrada 1992). The town’s August Feria de Elote (Corn-on-the-cob Festival) is well attended and well known for its giant ears of corn-on-the-cob. The unique alleles in the Jala populations have been consistently maintained for several decades, both in the genebank and in the farmers’ fields, although farmers in Jala today plant much smaller areas of the variety Jala than in the past (Rice 2004).

The Chalqueño landrace is prevalent in regions with better rainfall and longer seasons and is considered to be a high yielder, while Nal-tel is a widely distributed race in Chiapas that is particularly characterized by a short growing cycle (Ortega-Paczka 1973). Palomero Toluqueño, another prominent popcorn landrace, is well adapted to high elevations and low temperatures, and was found to have resistance to the maize weevil, Sitophilus zeamais (Arnason et al. 1994).

Mexican maize landraces with abiotic stress tolerance include: La Posta Sequia, Cónica, Cónica Norteña, Bolita, Breve de Padilla, Nal Tel, Tuxpeño (drought tolerant), Oloton (acid soil tolerant) and Chalqueño × Ancho de Tehuacán cross (alkalinity tolerant). Landraces that are particularly preferred for their tortilla quality are Pepitilla, Bolita, Azul, Tlacoya and Oaxaqueno. Landraces that are well known for their high-altitude adaptation are Palomero Toluqueño, Cónica, Cacahuacintle and Arrocillo.

It is important to note that maize landraces with some unique characteristics also exist outside Latin America. For example, Sikkim is the bedrock of maize diversity in India, with a unique collection of landraces that are still conserved and utilized by the farmers for diverse purposes (Prasanna and Sharma 2005; Prasanna 2010). These include Murli makai (Sikkim Primitive); Kaali makai with dark purplish black kernel type; Rathi makai with dark red kernels; Paheli makai with yellow/orange flint kernel type; Seti makai with white kernel type; Putali makai with transposon-induced pericarp variegation; Chaptey makai with white, dent type kernels; Gadbade makai with a mix of white and purple flint kernels; Bancharey makai, a high altitude maize with yellow, flint kernel type; Kukharey makai with short-statured plants; Kuchungdari with orange-coloured popcorn type kernels; and Kuchungtakmar with a mix of yellow, white, purple and red kernels (figure 2). These landraces were collected from Sikkim by the author under the ICAR National Fellow Project in 2005, and characterized at both phenotypic and molecular levels (Prasanna 2010; Singode and Prasanna 2010; Sharma et al. 2010) (figure 3). Of particular significance are the landraces with primitive characteristics (popcorn characters and high prolificacy). Dhawan (1964) christened such landraces as ‘Sikkim Primitives’, whose New World progenitors seem to have disappeared. The most important attributes of the ‘Sikkim Primitive’ maize are prolificacy (5–9 ears on a single stalk) with lack of apical dominance, tall plants with drooping tassels, uniformity in ear size and popcorn type kernels (Dhawan 1964; Singh 1977). This landrace, locally known as Murli makai, stays green after maturity, and thus serves well as fodder.

Figure 2

Diversity in some unique maize landraces from Sikkim in India. Top row (left to right): Paheli makai; Seti makai; Kaali makai; Rathi makai; Putali makai. Bottom row: Chaptey makai; Kuchungtakmar; Bancharey makai; Kuchungdari; Gadbade makai.

Figure 3

Expression of prolificacy in a ‘Sikkim Primitive’ accession in trials undertaken at (a) Bajaura (Himachal Pradesh); (b) Tadong (Sikkim).

Using Suwan-1, a popular OPV from Thailand, a composite ‘Parbhat’ has been developed at Punjab Agricultural University, Ludhiana (India), which shows resistance to multiple diseases, high yield and stability in performance (Dhillon et al. 2002). Improved germplasm that is well adapted to the hill areas has been derived at Vivekananda Parvatiya Krishi Anusandhan Sansthan (VPKAS), Almora, Uttarakhand (India), using landraces from the states of Jammu & Kashmir and Uttarakhand in India. The popular hybrids derived through this strategy include Him-129 (yellow, flint, 85–90 days maturity, highly tolerant to leaf blight), Him-128 and several ‘Vivek’ hybrids (Prasanna 2010). One of the popular baby corn cultivars in the Uttarakhand state in India, VL Baby Corn, a composite, has the prolific Murli makai (also locally called Muralia) in its parentage.

4 Teosinte and its continuing relationship with maize

Exploring the genetic architecture of teosinte (the progenitor of maize) and analysing the gene flow from teosinte to maize that happened in the past (and that continues to happen in Mexico) are important not only for understanding maize domestication and evolution but also for effective decisions on in situ conservation of teosinte species (Wilkes 1977) and exploiting the potential of teosinte for further genetic enhancement of maize.

Genetic studies have provided firm evidence that maize was domesticated from Balsas teosinte (Zea mays subspecies parviglumis), a wild relative that is endemic to the mid- to lowland regions of southwestern Mexico. However, maize cultivars that are closely related to Balsas teosinte are found mainly in the Mexican highlands where the subspecies parviglumis does not grow. Genetic data thus point to diffusion of domesticated maize from the highlands rather than from the originally suggested region of domestication in the valleys of Mexico. By using SNPs from a large number of accessions of both teosinte and maize, van Heerwaarden et al. (2011) showed that previous genetic evidence for an apparent highland origin of modern maize is best explained by gene flow from Z. mays ssp. mexicana (another teosinte subspecies) and demonstrated the ancestral position of lowland maize from western Mexico, a result that is consistent with archaeobotanical data and earlier studies on maize domestication.

A recent study reported how the insertion of a transposable element (Hopscotch) in the promoter region of an important teosinte gene (teosinte branched1, tb1) played an important functional role in causing alterations in gene expression that eventually impacted on maize evolution (Studer et al. 2011). Insertion of Hopscotch significantly enhanced the tb1 gene expression, helping the plant to produce larger ears with more kernels, and with less tillering; such plants were selected by the early farmers in Mexico leading to domestication of maize from teosinte. Ninety-five percent of modern maize appears to retain the tb1 mutant allele. In an earlier study, a team led by John Doebley also revealed that a single genetic mutation (in the teosinte glume architecture1, tga1) was responsible for removing the hard casing around teosinte’s kernels, exposing the soft grain, another significant step in the process of maize domestication.

Do the wild relatives of maize (e.g., teosintes) have a role to play in further genetic enhancement of maize? Wilkes (1977) documented three specific areas in Mexico and Guatemala where maize and teosinte hybridize; native farmers were reported to exploit the heterotic nature of maize resulting from this wide hybridization to improve their harvest. Despite the differences in ear and seed morphology between teosinte and maize, all species of teosinte can hybridize with maize under natural conditions. Crosses of maize with Z. mays ssp. mexicana and parviglumis are the most common and fertile, although there are a few crossing barriers to overcome (Ellstrand et al. 2007). Some of the Mexican maize landraces carry the alleles of the teosinte crossing barrier genes Gametophyte factor1 (Ga1) and possibly Teosinte crossing barrier1 (Tcb1) (Evans and Kermicle 2001). These genes may prevent maize pollen growth on teosinte, thus preserving the genetic identity of the teosinte populations, but generally do not stop teosinte pollen from hybridization, and further preventing the maize–teosinte hybrids from backcrossing to maize (Baltazar et al. 2005; Warburton et al. 2011).

The utility of wild relatives of maize (teosintes and Tripsacum dactyloides) for developing genetically improved maize was well illustrated by Rich and Ejeta (2008) in terms of resistance to the ‘witch weeds’ (Striga species), which are particularly prevalent in Africa. While there appears to be a paucity of Striga resistance genes among maize landraces in Africa (although some resistance sources have been identified; Kim et al. 1999), both the perennial teosinte (Zea diploperennis) and Tripsacum dactyloides showed relatively higher levels of resistance (Lane et al. 1997; Gurney et al. 2003). Through a long-term breeding effort, researchers from the International Institute of Tropical Agriculture (IITA) developed a Striga hermonthica–resistant inbred, ZD05; this inbred has in its pedigree a Zea diploperennis accession as well as tropical maize germplasm (Menkir et al. 2006; Amusan et al. 2008).

Thanks to the recent advances in maize genomics, it is now possible to undertake candidate-gene-based association genetic studies in teosinte. Weber et al. (2008) tested 123 markers in 52 candidate genes to find out their association with 31 traits in a population of 817 teosinte individuals, and revealed several new putative relationships between specific genes and trait variation in teosinte, for example, two ramosa genes (ra1 and ra2) with ear structure, and a MADS-box gene, zagl1, with ear shattering. The study clearly showed that candidate-gene-based association mapping could be a promising method for investigating the inheritance of complex traits in teosinte.

5 In situ and ex situ conservation of maize genetic diversity

Both in situ and ex situ conservation approaches are vital to preserve the enormous genetic diversity present in maize, as these approaches are complementary (Maxted et al. 1997). In situ approaches are best suited for conservation of landraces or traditional varieties that have high value to the farmers as well as high genetic diversity (Smale and Bellon 1999), and for those biodiversity-rich areas where farmers are less likely to substitute traditional varieties for improved ones for various socioeconomic, cultural or ethnic reasons (Smale et al. 2004).

5.1 In situ conservation

Maize farmers often make intensive efforts to maintain the genetic identity of their favourite local varieties or landraces for a variety of reasons. For example, farmers in Honduras grow hybrids in valleys and local varieties on hillsides; the purpose of growing the local varieties on the hillsides is to maintain their genetic purity (Almekinders et al. 1994). However, the local production systems can never be considered as static or closed. Gene exchange among the maize landraces is often encouraged (a process called ‘creolization’) in many traditional farming systems (especially in Mexico) by the common practice of growing different varieties on adjacent areas, and continually selecting seed of these varieties for replanting. Similarly, in Costa Rica and Honduras, Almekinders et al. (1994) found that hybridization between local and improved maize is highly valued by farmers.

Varieties derived through creolization (popularly referred to as Criollo varieties) provide smallholder and poor farmers with an opportunity to gain access to improved technology and to adapt the resultant varieties to their local conditions without the cost of buying seed every year (Bellon and Risopoulos 2001). In addition to the gains to the farmers, these varieties also provide a good case study for researchers to document evidence of gene flow, and perhaps even rates of gene flow, in a maize ecosystem. If the improved formal variety is a hybrid (which has a known genetic constitution), it is possible to analyse and/or predict the allelic profiles of the progeny. Deviations from the initial genetic or molecular marker profile of the hybrid can be attributed to gene flow. If the genes flowing in are from traditional varieties or landraces, these criollos could be an important, overlooked reservoir of genetic diversity from traditional varieties.

Traditional maize farmers also show an uncanny sense for retaining the broad genetic identity of the local varieties. For example, of the 26 maize varieties grown in the Cuzalapa Valley in the Mexican state of Jalisco, only 6 can be considered truly local. Yet, the Cuzalapa farmers demonstrated an impressive ability to manage the local varieties in ways that avoid the two undesirable extremes: too much gene flow between local varieties and those introduced from other regions in Mexico (which can lead to uniformity in subpopulations), and too little gene flow, which might lead to inbreeding (Louette and Smale 2000).

Although 21% to 54% of maize farmers surveyed earlier in Central America, Guatemala, Nicaragua, India and Malawi are growing Criollo varieties, there is very little published work on the genetic effects of farmers’ seed management (Morris et al. 1999). At the same time, in some situations, gene flow from the improved varieties to landraces could also be of concern, in light of the increasing cultivation of genetically engineered maize (Bellon and Berthaud 2004). Another important aspect that needs greater attention in the coming decades is how the diversity of landraces or farmers’ varieties will be affected in the future by the changing climate, and what strategies are needed to conserve the genetic diversity.

5.2 Ex situ conservation

In the Wellhausen-Anderson Maize Genetic Resources Center in CIMMYT, El Batan, Mexico, over 27,000 samples of maize seed, including the world’s largest collection of maize landraces (24,191), along with samples of wild relatives of maize (teosintes and Tripsacum), breeding lines, gene pools, populations and cultivars, are preserved. These samples were collected from 64 countries: 19 in Latin America, 19 in the Caribbean, 11 in Africa, 10 in Asia, 3 in Europe and 2 in Oceania, and represent nearly 90% of maize diversity in the Americas (Ortiz et al. 2010; Wen et al. 2011).

In addition, several national gene banks have also collected, conserved, studied, documented, used and distributed accessions of maize germplasm. For example, the gene banks at the Instituto Nacional de Investigaciones Forestales y Agropecuarias (INIFAP, Mexico), USDA-ARS and Universidad de Guadalajara (Mexico) hold major collections of teosinte accessions (Ortiz et al. 2010). The China National Gene Bank in Beijing holds a large collection of maize landraces (~14,000 samples). Similarly, about 7500 maize landrace accessions are available in the National Gene Bank at the National Bureau of Plant Genetic Resources (NBPGR), New Delhi, including several diverse landraces collected from the Northeastern Himalayan (NEH) region in India, comprising Arunachal Pradesh, Assam, Meghalaya, Mizoram, Manipur, Nagaland, Tripura, Sikkim and some areas in the northern region of West Bengal.

A large number of maize mutant stocks (>80,000 accessions) are conserved and annotated by the Maize Genetics Cooperation Stock Center (or USDA-ARS GSZE) in the Department of Crop Sciences, University of Illinois, USA. Descriptions of all the maize mutant stocks in this collection can be accessed at MaizeGDB, the Maize Genetics and Genomics Database (http://www.maizegdb.org).

Although the storage phase of gene bank conservation is considered very stable, genetic changes can occur during collection (often due to inadequate sample size) and seed regeneration. Differences in adaptability and problems in seed setting often impose additional challenges to proper regeneration (Taba and Twumasi-Afriyie 2008). However, seed multiplication and regeneration of the accessions are unavoidable in order to meet seed requests for further characterization/use by researchers. The accessions are usually considered for regeneration when the seed viability of the accession drops below 85% or if the number of seeds falls below 1500 (Taba et al. 2004).
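The regeneration criterion described above can be expressed as a simple check. The following is only an illustrative sketch: the function and parameter names are hypothetical and do not correspond to any gene bank software, and the default thresholds (85% viability, 1500 seeds) are simply the values cited from Taba et al. (2004).

```python
def needs_regeneration(viability_pct: float,
                       seed_count: int,
                       min_viability_pct: float = 85.0,
                       min_seed_count: int = 1500) -> bool:
    """Return True if an accession should be considered for regeneration.

    Hypothetical helper: an accession is flagged when its seed viability
    drops below the viability threshold OR its seed stock falls below the
    minimum seed count (thresholds as cited in the text).
    """
    if not 0.0 <= viability_pct <= 100.0:
        raise ValueError("viability_pct must be between 0 and 100")
    if seed_count < 0:
        raise ValueError("seed_count must be non-negative")
    return viability_pct < min_viability_pct or seed_count < min_seed_count


# Made-up accessions for illustration: (identifier, viability %, seeds in store)
accessions = [
    ("ACC-A", 92.0, 4000),   # healthy stock
    ("ACC-B", 80.0, 4000),   # viability too low
    ("ACC-C", 95.0, 900),    # too few seeds
]
to_regenerate = [name for name, v, n in accessions if needs_regeneration(v, n)]
```

Either condition alone is sufficient to flag an accession, which mirrors the "or" in the criterion as stated.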

Unless due precautions are taken, the regeneration process is a potential source of genetic change for accessions in the system, due to bottlenecks, inbreeding, random genetic drift and unintentional mixing or contamination (Crossa et al. 1994). Wen et al. (2011) demonstrated the utility of molecular markers for understanding the extent of changes in the genetic purity of maize accessions during regeneration by ex situ gene banks, and recommended best practices for maintaining the original genetic diversity of the gene bank accessions. They analysed 20 maize landrace accessions regenerated and conserved in five maize gene banks for genetic purity using 1150 Single Nucleotide Polymorphisms (SNPs) and 235 SNP haplotypes. Both SNP and haplotype analyses revealed dynamic changes in genetic purity during regeneration, in terms of loss of alleles from the original accession or presence of non-parental alleles.
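The two kinds of change mentioned above, alleles lost from the original accession and non-parental alleles appearing after regeneration, amount to per-locus set differences. The sketch below illustrates the idea with made-up data and hypothetical names; it is not the analysis pipeline of Wen et al. (2011), and it ignores allele frequencies and genotyping error.

```python
from typing import Dict, Set, Tuple

AlleleTable = Dict[str, Set[str]]  # locus name -> set of alleles observed


def compare_allele_sets(original: AlleleTable,
                        regenerated: AlleleTable
                        ) -> Tuple[AlleleTable, AlleleTable]:
    """Return (lost, non_parental) alleles per locus.

    lost:         alleles seen in the original sample but not after regeneration
    non_parental: alleles seen after regeneration but absent from the original
    Loci with no change are omitted from both results.
    """
    lost: AlleleTable = {}
    non_parental: AlleleTable = {}
    for locus in sorted(original.keys() | regenerated.keys()):
        before = original.get(locus, set())
        after = regenerated.get(locus, set())
        if before - after:
            lost[locus] = before - after
        if after - before:
            non_parental[locus] = after - before
    return lost, non_parental


# Made-up SNP calls for one accession (locus names are placeholders)
original = {"snp1": {"A", "G"}, "snp2": {"C"}, "snp3": {"T", "C"}}
regenerated = {"snp1": {"A"}, "snp2": {"C", "T"}, "snp3": {"T", "C"}}
lost, non_parental = compare_allele_sets(original, regenerated)
```

In this toy case, allele G at snp1 is lost (as might happen through a bottleneck or drift), while allele T at snp2 is non-parental (as might happen through contamination).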

6 Broadening the genetic base of cultivated maize

Despite the fact that maize has enormous genetic diversity, and extensive collections of maize are maintained by the international and national maize gene banks, breeders generally confine their research programs to germplasm having a relatively narrow genetic base. The narrow genetic base of the North American hybrid corn industry has been well documented (Goodman 2005). Although there are 250 to 300 maize races available (Goodman and Brown 1988), the Corn Belt Dent is by far the predominant source of commercial germplasm. Of the hundreds of open-pollinated varieties of Corn Belt Dent that were cultivated in the 1940s, only half a dozen or so can be considered as significant contributors to the development of current inbreds; the predominant donors are the Reid Yellow Dent and Lancaster Surecrop varieties (Goodman 1988). Goodman (2005) indicated that the parentage of virtually all commercial US hybrids involves six inbred lines or their close relatives, namely, the Lancaster-type inbreds C103, Mo17 and Oh43, and the Reid-type lines B37, B73 and A632.

The situation in China, the second-largest maize growing country in the world (with ~32 million hectares under maize), is not much different from that of the US. To date, over 5500 maize varieties have been approved for commercial cultivation in China (http://www.newcorn.com.cn); more than 2000 varieties have been given plant variety protection rights (http://www.cnpvp.cn), and over 1000 varieties are being inspected in national and regional trials each year. However, the genetic base of maize hybrids in China has been reported to be quite narrow, with only a few inbred lines having played a central role in hybrid development, such as Mo17, Huangzaosi, 330, E28, Dan340 and 478 (Li 1998; Yu et al. 2007).

The reasons for the above are straightforward: (i) in most crop breeding programs, maize being no exception, there is an unwillingness to ‘dilute’ the present-day elite stocks with unimproved germplasm, as development of elite inbreds has taken several generations of intensive breeding to bring them to their present level of agronomic performance (Kannenberg and Falk 1995); and (ii) even if there is willingness, most of the public sector maize breeding programmes, especially in the developing countries, do not have adequate resources or strategies to devote to systematic characterization and utilization of landraces or exotic maize germplasm.

In the above context, it is important to highlight the two most notable and successful examples of concerted institutional efforts or collaborative networks to broaden the genetic base of maize:

6.1 Latin American Maize Program (LAMP)

LAMP was the first internationally coordinated project (1987–1996) for evaluation of maize germplasm. This project generated information through evaluation of the maize germplasm in 11 Latin American countries (Argentina, Bolivia, Brazil, Colombia, Chile, Guatemala, Mexico, Paraguay, Peru, Uruguay and Venezuela) and the US, and enabled breeders to access this information and create superior varieties and hybrids.

Evaluation trials under LAMP were conducted in 34 regions, which covered most of the Americas’ maize-growing areas, from 41° latitude in the north to 34° latitude in the south, and from sea level to 3400 m above sea level. Besides yielding ability, the important agronomic traits evaluated in LAMP were standability (root-lodging and broken stalks), earliness, and plant and ear height. A LAMP core subset has been made available to encourage further use in broadening the maize genetic base (Taba et al. 1999). LAMP, thus, assessed the diversity in the national maize germplasm collections and facilitated the exchange of genetic resources across Latin America (Salhuana and Pollak 2006).

6.2 The US-Germplasm Enhancement of Maize (US-GEM) Project

The US-GEM Project was a collaborative research effort of the USDA-ARS, land grant universities, private industry, and international and non-governmental (NGO) organizations to broaden the germplasm base of maize. The primary purpose of the project was to introgress useful genetic diversity from Latin American maize races and other tropical maize donor sources (lines and hybrids) into US maize germplasm to broaden the genetic base of the corn belt hybrids (Balint-Kurti et al. 2006; Goodman 2005). The project used the Latin American landrace accessions selected by LAMP and crossed them with elite temperate maize lines provided by private companies in North America (Salhuana and Pollak 2006).

7 Molecular diversity in the global maize germplasm

Molecular characterization of maize landraces of the Americas and Europe (e.g. Warburton et al. 2011), and more recently of India (e.g. Prasanna et al. 2010; Sharma et al. 2010), led to significant insights with regard to the genetic diversity and population structure. Studies using molecular markers provided new insights into domestication events in maize (e.g. Matsuoka et al. 2002), understanding phylogenetic relationships and gene flow between maize landraces and the wild progenitor, teosinte (e.g. Warburton et al. 2011; van Heerwaarden et al. 2011), assessing the patterns of genetic diversity in the maize gene pool and tracking the migration routes of maize from the centers of origin (e.g. Rebourg et al. 2003; Vigouroux et al. 2008; Marilyn Warburton, personal communication), identifying genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication (e.g. Vigouroux et al. 2002), formulating sampling strategies for conserving maize diversity (e.g. Franco et al. 2005), and analysing the impact of farmers’ management on maize landraces especially in areas where maize was first domesticated (e.g. Pressoir and Berthaud 2004).

7.1 SSR markers-based diversity analysis

Microsatellite or Simple Sequence Repeat (SSR) markers have been used to characterize CIMMYT tropical, sub-tropical and temperate maize breeding materials (Reif et al. 2004) and CIMMYT highland and mid-altitude lines bred in Africa (Legesse et al. 2007), and to compare CIMMYT breeding populations and inbred lines with maize landraces from Mexico (Warburton et al. 2008). Other recent studies with SSR markers include characterization of indigenous landraces of Argentina (Bracco et al. 2009) and China (e.g. Qi-Lun et al. 2008), highland maize accessions of Ethiopia (Beyene et al. 2006), and maize germplasm of Portugal (Patto et al. 2004), Switzerland (Eschholz et al. 2006) and India (Prasanna et al. 2010; Sharma et al. 2010).

Characterization of genetically heterogeneous populations using molecular markers has, until recently, been very expensive and time consuming because variation tends to be partitioned within, rather than between, maize populations, and levels of variation can be very high. This means that at least 15 individuals must be characterized in order to adequately represent the allelic diversity present in a population (Dubreuil et al. 2006; Warburton et al. 2010). A new method for SSR analysis of pools of individuals from a population has proved to be much more efficient than genotyping multiple individuals per population, and much more accurate than genotyping only one individual per population (Warburton et al. 2002; Dubreuil et al. 2006). DNA fingerprinting (and thus distinguishing) of improved open-pollinated varieties (OPVs) or synthetics is possible with SSR markers, using a population bulk DNA fingerprinting technique developed at CIMMYT (Warburton et al. 2010).
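Why a single individual represents a heterogeneous population so poorly can be illustrated with a simple sampling argument. This argument is an assumption introduced here for illustration and is not taken from the studies cited above: if diploid individuals are drawn at random from a large population, an allele at frequency p is observed at least once among n individuals with probability 1 − (1 − p)^(2n). The function name below is hypothetical.

```python
def detection_probability(allele_freq: float,
                          n_individuals: int,
                          ploidy: int = 2) -> float:
    """Probability that an allele is sampled at least once.

    Assumes random sampling from a large population in which each of the
    ploidy * n_individuals gene copies independently carries the allele
    with probability allele_freq (an idealized model).
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    if n_individuals < 0 or ploidy < 1:
        raise ValueError("n_individuals must be >= 0 and ploidy >= 1")
    return 1.0 - (1.0 - allele_freq) ** (ploidy * n_individuals)


if __name__ == "__main__":
    # Compare one individual with the 15 individuals mentioned in the text
    for n in (1, 15):
        prob = detection_probability(0.10, n)
        print(f"n = {n:2d}: P(allele at 10% frequency detected) = {prob:.3f}")
```

Under these assumptions, an allele present at 10% frequency is detected with a probability of 0.19 in a single individual, but about 0.96 in a sample of 15 individuals.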

Using the population bulk fingerprinting strategy with a carefully selected set of SSR markers, ~800 global maize landraces/populations have been characterized recently under a Generation Challenge Program (GCP) project, which involved researchers from CIMMYT, INRA (France), IITA and the national programmes of China, India, Indonesia, Thailand, Vietnam and Kenya. The study provided the first assessment of genetic relationships among landraces/populations worldwide in comparison with those from the country of origin, Mexico, besides indicating the possible migration routes of maize from Mexico to diverse continents (Marilyn Warburton, personal communication).
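Relationships among populations fingerprinted as bulks are typically inferred from per-locus allele frequencies. The text does not state which distance measure the GCP project used; as an assumption, the sketch below uses the modified Rogers' distance, a measure widely applied to allele-frequency data. The function name, allele labels and frequencies are all hypothetical.

```python
import math


def modified_rogers_distance(freqs_a, freqs_b):
    """Modified Rogers' distance between two populations.

    Each argument is a list with one dict per locus, mapping an allele
    label to its estimated frequency in that population.
    """
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("both populations need the same, non-empty set of loci")
    total = 0.0
    for locus_a, locus_b in zip(freqs_a, freqs_b):
        # Alleles absent from one population count as frequency 0 there.
        alleles = set(locus_a) | set(locus_b)
        total += sum((locus_a.get(al, 0.0) - locus_b.get(al, 0.0)) ** 2 for al in alleles)
    # Divide by 2m (m = number of loci) so the distance ranges from 0 to 1.
    return math.sqrt(total / (2 * len(freqs_a)))


# Toy allele frequencies at two SSR loci (fragment sizes used as allele labels).
landrace_x = [{"150": 0.6, "154": 0.4}, {"201": 1.0}]
landrace_y = [{"150": 0.2, "154": 0.5, "158": 0.3}, {"201": 0.5, "205": 0.5}]

if __name__ == "__main__":
    print(round(modified_rogers_distance(landrace_x, landrace_y), 3))
```

A matrix of such pairwise distances could then feed a clustering or ordination step; the choice of measure and downstream method would depend on the study.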

7.2 SNP markers-based diversity analysis

Until recently, SSR markers were the markers of choice for DNA fingerprinting and genetic diversity analysis in maize. However, advances in high-density genotyping technologies, coupled with a drastic reduction in genotyping costs, have resulted in a shift toward SNPs, particularly in model plants with substantial genomic information and resources, such as maize. Some of the important comparative advantages of SNPs over SSRs, especially for diversity analysis in crops like maize, are as follows:

(a) Although an individual SNP may not be as informative as an SSR, SNPs can still be used for such studies because the high level of automation may enable analysis of 15–30 individuals per OPV at a time for the same cost, once the platform is set up. However, Yu et al. (2009) suggested that over 10 times more SNPs than SSRs should be used to estimate relative kinship, while Inghelandt et al. (2010) proposed that 7 to 11 times more SNPs than SSRs should be used to infer population structure in maize association analysis.

(b) Compared with SNPs, SSRs have a higher genotyping error rate and higher levels of missing data (Jones et al. 2007). In contrast, SNPs are bi-allelic and represent the smallest units of genetic variation in the genome; their allelic data are easily read, compared and integrated between different datasets, and they are amenable to high-density genotyping and automation. SNP genotyping can thus provide increased marker data quality and quantity compared with SSRs (Jones et al. 2007; Hamblin et al. 2007).

(c) An array-based SNP detection method was 100 times faster than a gel-based SSR detection method, and at the same time, the cost was 4–5 times lower for constructing linkage maps (Yan et al. 2010).
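The lower information content of a single SNP relative to an SSR, noted in point (a), follows from the number of alleles per locus. The sketch below is illustrative only and is not the method of any cited study. It computes expected heterozygosity (gene diversity), 1 − Σp², for a bi-allelic SNP and for a hypothetical ten-allele SSR; the allele frequencies are invented for the example.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity (gene diversity) at one locus: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# A bi-allelic SNP is most informative when both alleles are equally common.
snp_max = expected_heterozygosity([0.5, 0.5])

# Hypothetical SSR with ten equally frequent alleles.
ssr_ten_alleles = expected_heterozygosity([0.1] * 10)

if __name__ == "__main__":
    print(f"SNP (2 alleles):  {snp_max:.2f}")
    print(f"SSR (10 alleles): {ssr_ten_alleles:.2f}")
```

Because a bi-allelic locus cannot exceed a gene diversity of 0.5 while a multi-allelic locus can approach 1, several SNPs are generally needed to match the information of one highly polymorphic SSR, which is consistent in direction with the multiples suggested in the studies cited above.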

Molecular characterization of 770 maize inbred lines with 1034 SNP markers has been recently undertaken at CIMMYT, leading to the identification of 449 markers that were of high quality in terms of repeatability and showed no germplasm-specific biasing effects (Lu et al. 2009). Combined use of SNP haplotypes (information from several SNPs within the same gene or locus) may be far more powerful than using SNP alleles alone in diversity analyses (Hamblin et al. 2007; Yan et al. 2010).
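The gain from combining SNPs into haplotypes can be shown with a toy example. The sketch below assumes fully homozygous inbred lines with one allele call per SNP; the line names, genotype strings and the helper `haplotype_counts` are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def haplotype_counts(genotypes, snp_indices):
    """Count haplotypes formed by the SNPs at `snp_indices`.

    `genotypes` maps a line name to a string with one allele per SNP
    (inbred lines are assumed homozygous, so one character per SNP suffices).
    """
    return Counter(
        "".join(calls[i] for i in snp_indices) for calls in genotypes.values()
    )


# Toy data: six inbred lines scored at three SNPs within one gene.
inbred_lines = {
    "line1": "AGC",
    "line2": "AGT",
    "line3": "ATC",
    "line4": "CGC",
    "line5": "CTT",
    "line6": "AGC",
}

if __name__ == "__main__":
    # Each SNP alone separates the lines into at most two classes.
    for i in range(3):
        print(f"SNP {i}:", dict(haplotype_counts(inbred_lines, [i])))
    # The three SNPs taken together resolve more classes.
    print("haplotypes:", dict(haplotype_counts(inbred_lines, [0, 1, 2])))
```

In this example each SNP on its own distinguishes only two groups of lines, while the three-SNP haplotype behaves like a single multi-allelic marker.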

8 Next-generation sequencing and high-density genotyping

The genome sequencing of B73 (Schnable et al. 2009) and Palomero, a popcorn landrace in Mexico (Vielle-Calzada et al. 2010), are important landmarks in maize genome research, with significant implications for our understanding of maize genome organization and evolution, as well as for formulating strategies to utilize the genomic information in maize breeding. The Palomero genome is about 22% (140 Mb) smaller than that of B73, and shows a large number of hitherto unreported sequences, implying a large pool of unexplored alleles. Also, more than 12 genes related to heavy-metal detoxification and environmental stress tolerance were found to be conserved in B73 and Palomero, but absent from teosinte, suggesting that these genes were possibly involved in the domestication process (Vielle-Calzada et al. 2010).

Another important recent development is the availability of platforms for undertaking next-generation sequencing and high-density genotyping (Metzker 2010). Elshire et al. (2011) reported a procedure for constructing genotyping-by-sequencing (GBS) libraries based on reducing genome complexity with restriction enzymes (REs), which is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two- to three-fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species like maize with high levels of genomic diversity. The GBS procedure was demonstrated with maize (IBM; intermated B73 × Mo17) and barley (Oregon Wolfe Barley) recombinant inbred populations, in which about 200,000 and 25,000 sequence tags, respectively, were mapped (Elshire et al. 2011). Using the GBS system, large-scale high-density genotyping is being employed by the CIMMYT Global Maize Program for improvement of complex traits, and several billion data points have already been generated on the key germplasm.
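The complexity-reduction step of GBS can be pictured as an in silico restriction digest. The sketch below is a toy illustration, not the published pipeline. It assumes the recognition motif GCWGC (W = A or T) of ApeKI, the enzyme used for maize by Elshire et al. (2011), with the cut placed after the first base; the sequence and function names are hypothetical, and methylation sensitivity is not modelled.

```python
import re

# Lookahead so that overlapping recognition sites are all reported.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def cut_positions(seq, site=APEKI_SITE, cut_offset=1):
    """Return 0-based positions at which the sequence would be cut.

    `cut_offset` is the number of bases into the motif at which the cut
    falls (1 corresponds to G^CWGC under the assumption stated above).
    """
    return [m.start() + cut_offset for m in site.finditer(seq.upper())]


def digest(seq, site=APEKI_SITE, cut_offset=1):
    """Split `seq` into the fragments produced by a complete digest."""
    bounds = [0] + cut_positions(seq, site, cut_offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


toy_sequence = "TTGCAGCAAAGCTGCTT"

if __name__ == "__main__":
    print(cut_positions(toy_sequence))
    print(digest(toy_sequence))
```

In a real GBS workflow, only short reads adjacent to such cut sites are sequenced, which is what reduces the fraction of the genome that has to be aligned.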

The new genotyping/sequencing technologies and in silico tools now provide immense opportunities for the maize community to speed up research progress for large-scale diversity analysis, high-density linkage map construction, high-resolution QTL mapping, linkage disequilibrium (LD) analysis and genome-wide association studies. Because the genomic sequence of maize is publicly available (http://www.maizesequence.org/index.html), re-sequencing of selected maize landraces as well as inbred lines of importance is now feasible, which can provide a snapshot of the allelic state of every SNP in the genome, and provide opportunities for gene discovery.

In addition to powerful next-generation sequencing and genotyping systems, diverse mapping populations are available in maize as international maize genomic resources. For example, the maize ‘nested association mapping’ (NAM) population, comprising 5000 RILs (200 RILs from each of 25 founders), is an important genetic resource developed in recent years. The NAM population is a novel approach for mapping genes underlying complex traits, in which the statistical power of QTL (Quantitative Trait Loci) mapping is combined with the high resolution of association mapping (Yu et al. 2008). Global genetic diversity of maize has been captured in the NAM RILs, which will provide the maize research community with the opportunity to map genes associated with agronomic traits of interest.

With next-generation DNA sequencing technology (Shendure and Ji 2008), it will be possible to sequence the whole gene bank collection. Maize is the first plant species for which a haplotype map (HapMap) has been constructed. Gore et al. (2009) identified and genotyped several million sequence polymorphisms among 27 diverse maize inbred lines and discovered that the genome was characterized by highly divergent haplotypes. Haplotype-based mapping can be used to replace individual marker-based mapping to improve the mapping power and identify specific alleles in a gene or allele combinations at different loci that contribute to the same target trait (Xu et al. 2012).

9 Seeds of Discovery (SeeD): A bold new initiative

A new initiative of CIMMYT, titled ‘Seeds of Discovery’ (SeeD), aims to discover the extent of allelic variation in the genetic resources of maize and wheat through high-density genotyping, phenotyping for prioritized traits and novel bioinformatics tools, and to make the favourable alleles and haplotypes associated with important traits available to breeders in a usable form.

Some of the potential/expected outputs of SeeD are: (a) an understanding of the frequency and distribution of haplotypes among the Meso-American landraces and CIMMYT Maize Lines (CMLs) available in CIMMYT’s Maize Gene Bank; (b) identification of large-effect genes/QTLs for prioritized abiotic and biotic stress tolerance, and for nutritional and industrially important quality traits; (c) generation of improved source populations from under-sampled components of maize genetic diversity and identification/development of donor inbreds for use in breeding; and (d) design of practical delivery paths that enable targeted users to adopt novel and useful maize genetic diversity in their breeding programs.

10 Conclusions

The current revolution in DNA technologies offers tremendous opportunities to understand the genetic relationships, diversity and evolution of maize. Molecular-marker-based diversity assessment has provided valuable information on the extent and distribution of genetic diversity in global maize germplasm. Next-generation sequencing and high-density genotyping technologies, including GBS, will provide greater insights into the structure and organization of maize genome, and speed up the discovery and use of new and useful alleles for maize improvement.

Additionally, intensive and concerted efforts (e.g. LAMP and US-GEM) are needed for a better understanding of the breeding value of the maize genetic resources available worldwide. Such initiatives would lead to the development of new and improved varieties, with potential for more direct use by farmers and appropriate for specific agro-ecologies. ‘Seeds of Discovery’, CIMMYT’s new initiative, aims to discover the extent of allelic variation in the maize germplasm available or conserved in the Gene Bank, formulate core sets based on genotyping and phenotyping, and incorporate rare useful alleles into breeding programmes for developing improved cultivars.

The key challenges to the international maize scientific community are: (a) to generate high-quality phenotypic data of landraces, besides elite maize inbreds, and integrate these with high-density genotyping data for understanding and utilizing the enormous genetic diversity to broaden the genetic base of cultivated maize; (b) to better understand the effects of climate change on diversity of maize landraces in different regions; and (c) to effectively monitor the patterns of change both temporally and spatially (= meta-population dynamics), coupled with appropriate policies and actions at the farm level. As Walbot (2009) stated:

The overarching question now is how we can use the unprecedented genetic tool that the maize genome offers to improve corn productivity per unit of land while reducing inputs such as water and fertilizer so that we can sustain humanity’s food requirements, while also decreasing the negative impacts of agriculture on the Earth.