Definition of the Subject
In the last decade, private seed companies have benefitted immensely from molecular breeding (MB) [1]. A private sector-led “gene revolution” has boosted crop adaptation and productivity in developed countries by combining the latest advances in molecular biology with cutting-edge information and communication technologies and accurate plant phenotyping.
MB allows the stacking of favorable alleles, or genomic regions, for target traits in a desired genetic background thanks to the use of polymorphic molecular markers (MMs) that monitor differences in genomic composition among cultivars, or genotypes, at specific genomic regions, or genes, involved in the expression of those target traits. The use of MMs generally increases the genetic gain per crop cycle compared to selection based on plant phenotyping only, and therefore reduces the number of needed selection cycles, hastening the delivery of improved crop varieties to the farmers.
In contrast to the private sector, MB adoption is still limited in the public sector, and MB is hardly used at all in developing countries. This is the result of several factors, among which are the following: (1) scientists from the academic world are more interested in discovering and publishing new genes or quantitative trait loci (QTLs) than in applied biology; (2) until recently access to genomic resources was limited in the public sector, especially for less-studied crops; (3) large-scale genotyping facilities were not easily accessible to the public sector; and (4) although a broad set of stand-alone tools is available to conduct the multiple types of analyses required by MB, no single analytical pipeline is available today in the public sector that allows integrated analysis in a user-friendly mode.
The situation is even more critical in developing countries, where additional limitations include a shortage of well-trained personnel, inadequate laboratory and field infrastructure, a lack of information systems (ISs) with applicable and flexible analysis tools, and insufficient funding; simply put, these are resource-limited breeding programs. As a result, the developing world has yet to benefit from the MB revolution, and most of these countries lack the fundamental prerequisites for a move to informatics-powered breeding.
Under those circumstances, developing and deploying a sustainable web-based Molecular Breeding Platform (MBP) as a one-stop shop for information, analytical tools, and related services to help design and conduct marker-assisted breeding experiments in the most efficient way will alleviate many of the bottlenecks mentioned earlier. Such a platform will enable breeding programs in the public and private sectors in developing countries to accelerate variety development using marker technologies for different breeding purposes: major genes or transgene introgression via marker-assisted backcrossing (MABC), gene pyramiding via marker-assisted selection (MAS), marker-assisted recurrent selection (MARS) and, in a not too distant future, genome-wide selection (GWS).
Introduction
Since the dawn of agriculture, mankind has sought to improve crops by selecting individual plants with the most desirable characteristics or traits. Agricultural productivity has been progressively enhanced by constant innovation, including improved crop varieties to increase production in specific environments [2]. The major objective of crop improvement is to identify within heterogeneous materials those individuals for which favorable alleles are present at the highest proportion of loci involved in the expression of key traits [3]. The classical plant breeding method is based on increasing the probability of selecting such individuals from populations generated from sexual matings. Selection has traditionally been carried out at the whole-plant level (i.e., phenotype), which represents the net result of genotype and environment (and their interactions). Phenotypic selection has delivered tremendous genetic gains in most cultivated crop species, but is severely limited when faced with traits that are heavily modulated by the environment [4]. In addition, the nature of some traits can make the phenotypic testing procedure itself complex, unreliable, or expensive (or a combination of these).
The recent remarkable development of molecular genetics and associated technologies represents a quantum leap in our understanding of the underlying genetics of important traits for crop improvement. The ongoing revolutions in molecular biology and information technology offer tremendous and unprecedented opportunities for enhancing the effectiveness and efficiency of MB programs. Indirect selection, based on genetic markers, presents an efficient complementary breeding tool to phenotypic selection. Individual genes or QTLs having an impact upon target traits can be identified and linked with one or more markers, and then the marker loci can be used as a surrogate for the trait, resulting in greatly enhanced breeding efficiency [5–8].
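The idea of using marker loci as a surrogate for the trait can be illustrated with a minimal sketch. The marker names, favorable alleles, and plant identifiers below are hypothetical and chosen only for illustration; a real program would take them from a validated marker-trait association.

```python
from typing import Dict, List

# Hypothetical markers linked to a target trait, with the allele assumed favorable.
FAVORABLE = {"M1": "A", "M2": "G"}


def favorable_allele_count(genotype: Dict[str, str]) -> int:
    """Count favorable alleles over the target markers (0 to 2 per diploid marker)."""
    return sum(genotype[marker].count(allele) for marker, allele in FAVORABLE.items())


def select_by_markers(population: Dict[str, Dict[str, str]], keep: int) -> List[str]:
    """Rank plants by favorable-allele count (ties broken by ID) and keep the best."""
    ranked = sorted(
        population,
        key=lambda pid: (-favorable_allele_count(population[pid]), pid),
    )
    return ranked[:keep]


# Toy segregating population: each genotype is a two-letter diploid call per marker.
population = {
    "P1": {"M1": "AA", "M2": "GG"},
    "P2": {"M1": "AT", "M2": "GC"},
    "P3": {"M1": "TT", "M2": "CC"},
    "P4": {"M1": "AA", "M2": "GC"},
}

selected = select_by_markers(population, keep=2)
print(selected)
```

The selection step here never looks at a phenotype, which is what makes marker-based selection independent of environment and plant growth stage.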
Molecular techniques can have an impact upon every stage of the breeding process from parental selection and cross prediction [9], to introgression of known genes [10] and population enhancement. Selection of beneficial alleles of known genes can be done through marker-assisted selection (MAS) – the selection of specific alleles for traits conditioned by a few loci [10] – or through marker-assisted backcrossing (MABC) – transferring specific alleles of a limited number of loci from one genetic background to another, including transgenes [11, 12]. For marker-assisted population improvement, individuals selected from a segregating population based on their marker genotype are inter-mated at random to produce the following generation, at which point the same process can be repeated a number of times [13]. A second approach aims at direct recombination between selected individuals as part of a breeding scheme, seeking to generate an ideal genotype or ideotype [14]. The ideotype is predefined on the basis of QTL mapping within the segregating population, combined with the use of multi-trait selection indices that can also consider historical QTL data. This variety development approach is commonly referred to as marker-assisted recurrent selection (MARS) [15–17], or genotype construction. An alternative is to infer a predictive function using all available markers jointly, without significance testing and without identifying a priori a subset of markers associated with the traits of interest. This more recent approach, named genome-wide selection (GWS), originated in genomic medicine [18, 19], was then applied successfully in animal breeding [20], and also appears to be quite promising in crop improvement [7].
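To make the GWS idea of a predictive function built from all markers jointly more concrete, the sketch below uses ridge regression, which is one common choice and is assumed here rather than taken from the cited work. The marker coding (-1, 0, 1), the toy data, and the penalty value are all illustrative assumptions.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_marker_effects(X: List[List[float]], y: List[float], lam: float) -> List[float]:
    """Estimate all marker effects at once from (X'X + lam * I) beta = X'y."""
    p = len(X[0])
    xtx = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0) for j in range(p)]
        for i in range(p)
    ]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(X: List[List[float]], beta: List[float], mean: float) -> List[float]:
    """Predicted genetic value = training mean + sum of marker effects."""
    return [mean + sum(x * b for x, b in zip(row, beta)) for row in X]


# Toy training set: 6 phenotyped lines, 4 markers coded -1/0/1 (hypothetical data).
X_train = [
    [1, 0, 1, -1],
    [1, 1, -1, 0],
    [0, -1, 1, 1],
    [-1, 1, 0, 1],
    [-1, 0, -1, -1],
    [0, -1, 0, 0],
]
y_train = [11.0, 13.0, 9.0, 8.0, 9.0, 10.0]
mean = sum(y_train) / len(y_train)
y_centered = [v - mean for v in y_train]

beta = ridge_marker_effects(X_train, y_centered, lam=1.0)

# Candidates that were genotyped but never phenotyped.
candidates = [[1, 0, -1, 0], [-1, 0, 1, 0]]
gebv = predict(candidates, beta, mean)
print(beta, gebv)
```

No marker is tested for significance and none is discarded; the penalty `lam` simply shrinks all effects toward zero, which is what allows the model to be fitted even when markers outnumber phenotyped lines.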
As marker technologies have become increasingly “data rich,” the amount of data produced by plant breeding programs has increased dramatically in recent years. Increasingly, the critical factor determining the rate of progress in plant breeding programs is their capacity to manage large amounts of data efficiently and to extract meaningful information from those data in time for use in selection decisions. Although genotyping itself has become less of an issue, the efficient management of genotyping data in a broad sense, including sequence information, is becoming a major challenge in modern plant breeding. This was recognized early on in the private sector, where the key to successful adoption of MB has been the establishment of platforms or pipelines that integrate field and laboratory processes with powerful data management systems (DMS). These systems merge and analyze the data collected at every step and guide the process of crop improvement toward the release of improved cultivars.
A few initiatives have taken place in the public sector to establish efficient data management systems or ISs [21, 22]. One of these has been led by several centers of the Consultative Group on International Agricultural Research (CGIAR) which have worked over the past decade, along with advanced research institutes (ARIs) and national agricultural research systems (NARS) in developing countries, to develop an open-source generic IS, the International Crop Information System (ICIS), to handle pedigree information, genetic resource, and crop improvement information [23]. Based on some elements of ICIS, the CGIAR Generation Challenge Programme (GCP, http://www.generationcp.org) has invested in integrating crop information with genomic and genetic information and in using existing or developing new public decision-support tools to access and analyze information resources in an integrated and user-friendly way [24]. Another initiative has been led by Primary Industries and Fisheries (PI&F) of the Queensland Government Department of Employment, Economic Development and Innovation in Australia, which recognized that effective data management is an essential element in obtaining maximum benefit from their investment in plant breeding. In conjunction with the New South Wales Department of Primary Industries (NSW DPI) and more recently DArT Pty. Ltd. (http://www.diversityarrays.com/) they are in the process of developing a linked IS for plant breeding (Katmandoo) that includes applications for capturing field data using hand-held computers, barcode-based seed management systems, and databases to store and link field trial data, laboratory data, genealogical data, and marker data [25].
Although an IS involves far more than a database, the development and implementation of a suitable database system alone remains a real challenge because of the fast turnover in technologies, the need to manage and integrate increasingly diverse and complex data types, and the exponential increase in data volume. Previous solutions, such as central databases, journal-based publication, and manually intensive data curation, are now being enhanced with new systems for federated databases, database publication, and more automated management of data flows and quality control. Along with emerging technologies that enhance connectivity and data retrieval, these advances should help create a powerful knowledge environment for genotype–phenotype information [26].
In addition to efficient data management, advances in statistical methodology [27–29], graphical visualization tools, and simulation modeling [9, 30–32] have greatly enhanced these ISs. The availability of molecular data linked to computable pedigrees [33] and phenotypic evaluation now makes genotype–phenotype analysis a practical reality [34].
In order to realize the full potential of marker technologies and bioinformatics in plant breeding, tools for molecular characterization, accurate phenotyping, efficient ISs, and effective data analysis must be integrated with breeding workflows managing pedigree, phenotypic, genotypic, and adaptation data. The goals of this integration of technologies are to (1) create genotype–phenotype trait knowledge for breeding objectives, and (2) use that knowledge in product development and deployment [4].
This entry generally explores the pace of innovation in world agriculture and the rise of MB. It particularly illustrates the accelerating application of information and communication technologies to the information management challenges of MB and, as a result, the emergence of virtual molecular breeding platforms (MBPs) as a vital tool for accelerating genetic gains and rapidly developing more resilient and more productive cultivars.
This entry reviews the rationale for access to MB technology and services and the status of existing public analytical pipelines and ISs for MB, and offers a detailed case study for the CGIAR GCP Integrated Breeding Platform (IBP) – the pioneer public sector MBP specifically targeting developing country breeding programs. It explores the gaps between countries and between crops in the application of informatics-powered MB approaches, and the potential for adopting MBPs to close these gaps; and it reviews institutional, governmental, and public support for these approaches. The entry discusses the challenges and opportunities inherent in MBPs, and the potential economic impact of MB. Finally, the entry explores the future directions and perspectives of MBPs.
Marker Technologies and Service Laboratories
Markers are “characters” whose pattern of inheritance can be followed at the morphological (e.g., flower color), biochemical (e.g., proteins and/or isozymes), or molecular (DNA) levels. They are so called because they can be used to elicit, albeit indirectly, information concerning the inheritance of “real” traits. The major advantages of molecular over other classes of markers are that their number is potentially unlimited, their dispersion across the genome is complete, their expression is unaffected by the environment and their assessment is independent of the stage of plant development [35]. During the past two decades, DNA technology has been exploited to advance the identification, mapping, and isolation of genes in a wide range of crop species. The first generation of DNA markers, restriction fragment length polymorphisms (RFLPs), was used to construct the earliest genome-wide linkage maps [36] and identify the first QTLs [37, 38]. During the 1990s, emphasis switched to assays based on the polymerase chain reaction (PCR), which are much easier to use and potentially automatable [39]. The development of simple sequence repeats (SSRs) [40], amplified fragment length polymorphisms (AFLPs) [41], and single nucleotide polymorphisms (SNPs) [42] opened the door for large-scale deployment of marker technology in genomics and progeny screening.
SNPs are amenable to very high throughput and a wide range of detection techniques has been developed for them, from singleplex systems to high-density arrays. They can be used in fully integrated robotic systems going from automated DNA extraction to automated scoring in high-throughput detection platforms. The combination of increase in throughput and lowering in costs makes SNPs highly suitable to intensive marker applications in plant breeding such as MARS and the emerging approach of GWS. Based on SNP technology, production of molecular marker (MM) data expanded more than 40-fold between 2000 and 2006 at Monsanto, while cost per data point decreased to one sixth of the original cost [43].
With the transition from SSRs to SNPs and the concomitant large increase in the demand for genotyping as markers get more and more widely used in a broad range of applications from medicine to plant breeding, marker genotyping laboratories have evolved from relatively low-tech operations to highly automated, high-throughput laboratories using an array of sophisticated equipment (pipetting robots, high-density PCR, high-throughput SNP detection machines, high-level informatics). Although large private seed companies have had the need and the resources to put in place large-scale genotyping laboratories for their own uses, smaller programs, especially in the public sector, have typically not had the resources or the justification to establish such large operations to respond to their increasing need for SNP genotyping data. In response to this need, a few private marker service laboratories have sprung up over the past few years, which can provide complete genotyping services for their customers, from DNA extraction to generation of large numbers of SNP or other datapoints. Due to their broad customer base (from medical research laboratories to animal and plant breeding operations, both public and private), these laboratories can have a large volume of datapoint production which may lead to low costs for the customer and high throughput. They are able to invest in the most advanced equipment to keep up with the constant evolution of genotyping technologies and are able to pass on the resulting benefits to their customers. Processes have now been put in place for rapid shipment of leaf samples from any location (field or laboratory) around the world without any restrictions. Examples of such companies that can service breeding programs from around the world are DNA LandMarks, Inc. of Saint-Jean-sur-Richelieu, Quebec, Canada (http://www.dnalandmarks.ca/english/) and KBioscience Ltd. of Hoddesdon Herts, UK (http://www.kbioscience.co.uk/). 
For many public breeding programs and small companies, especially in developing countries, it is now more efficient to use those types of contract genotyping services than to try to support their growing MB needs through the establishment of an in-house laboratory. Functional and reliable SNP laboratories are especially difficult to establish in many developing countries due to the unreliability of the power supply, difficulties in shipping and storing and a low level of resources for the purchase and maintenance of sophisticated equipment. The GCP is facilitating the linkage between users and service laboratories through its marker services, a component of the breeding services offered through the GCP’s IBP.
Analytical Tools, Software, and Pipelines
One of the achievements of the plant biotechnology revolution of the last two decades has been the development of molecular genetics and associated technologies, which has led to an improved understanding of the basis of inheritance of agronomic traits. The genomic segments or QTLs involved in the determination of phenotype can be identified from the analysis of phenotypic data in conjunction with allelic segregation at loci distributed throughout the genome. From such analyses, the mode of inheritance, as well as the gene action underlying the QTL, can be deduced [44]. As with the improvement in marker technologies, the statistical tools needed for QTL mapping have evolved from a rudimentary to a very sophisticated level [45]. Previous approaches based on multiple regression methods, using least squares or generalized least squares estimation methods [46, 47], have evolved to composite interval mapping [9], mixed model approaches using maximum likelihood or restricted maximum likelihood (REML) [48], and Markov Chain Monte Carlo (MCMC) algorithms [49, 50], which use Bayesian statistics to estimate posterior probabilities by sampling from the data. In parallel, with progress in the characterization of genetic effects at QTLs and refinement of QTL peak position through meta-analysis [51], advances have also been made in understanding the impact of the environment on plant phenotype. The mapping of QTLs for multiple traits has allowed the quantification of QTL by environment interaction (QEI) [52] and, more recently, approaches using factorial regression mixed models have been applied to model both genotype by environment interaction [53] and QEI [48, 54, 55]. Recent approaches are now implemented to evaluate gene networking [56] and epistasis, based on Bayesian approaches [57, 58] or through stepwise regression by considering all marker information simultaneously [59, 60].
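The simplest form of the regression-based approaches mentioned above is a single-marker scan, in which the phenotype is regressed on the genotype score at each marker in turn by ordinary least squares. The sketch below is a deliberately reduced illustration of that idea, not an implementation of any package discussed in this entry; the marker names, the 0/2 coding for homozygous lines, and the phenotype values are hypothetical.

```python
from typing import Dict, List, Tuple


def marker_regression(genotypes: List[float], phenotypes: List[float]) -> Tuple[float, float]:
    """Least-squares regression of phenotype on marker score; returns (slope, R^2)."""
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    p_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((p - p_mean) ** 2 for p in phenotypes)
    sxy = sum((g - g_mean) * (p - p_mean) for g, p in zip(genotypes, phenotypes))
    slope = sxy / sxx                 # estimated effect per allele copy
    r2 = (sxy * sxy) / (sxx * syy)    # share of phenotypic variance explained
    return slope, r2


# Toy data: 8 inbred lines scored at 3 markers (0 = parent A allele, 2 = parent B allele).
phenotypes = [5.1, 4.9, 7.2, 6.8, 5.0, 7.1, 6.9, 5.2]
markers: Dict[str, List[float]] = {
    "m1": [0, 2, 0, 2, 0, 2, 0, 2],
    "m2": [0, 0, 2, 2, 0, 2, 2, 0],
    "m3": [2, 2, 0, 0, 0, 2, 0, 2],
}

scan = {name: marker_regression(scores, phenotypes) for name, scores in markers.items()}
best = max(scan, key=lambda name: scan[name][1])
for name, (slope, r2) in scan.items():
    print(f"{name}: slope={slope:.3f} R2={r2:.3f}")
print("strongest association:", best)
```

Interval mapping, mixed models, and Bayesian methods refine this basic scheme by modeling positions between markers, background effects, and uncertainty, but the underlying question of how much phenotypic variation a genomic position explains is the same.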
Epistasis and balanced polymorphism influence complex trait variation [61, 62], and classical generation means analyses, estimates of variance components, and QTL mapping indicated an important role of digenic and/or higher-order epistatic effects for all biomass-related traits in model plants [63] and in crops [64–66]. It will be critical to implement the most efficient MB strategies in order to evaluate and include these genetic effects in breeding schemes [60].
All tools necessary to run MB projects, from the simplest to the most complicated approaches, are available today in the public domain. They are based on different algorithms and statistical approaches, from the very simple to the more complex. One challenge is the diversity of tools available for a given analytical function or along the different steps of an analytical pathway, making the choice of the “right” tool difficult and the move from one analytical step to the next very tedious due to the complete lack of common standards and formatting across tools. The number of applications available for QTL analysis illustrates well the multiplicity and diversity of tools that are available for a given analysis. The following software packages have been developed over the past 20 years:
For most of these applications, the first versions were already available 15 years ago and the multiplicity and possible duplication generated by the independent development of these tools were already identified at the Gordon Research Conference on Quantitative Genetics and Biotechnology held in February 1997 in Ventura, California. A main objective of that workshop was to survey participants on the attributes of several software packages for QTL mapping and to define their analytical needs which were not presently met by the existing software packages. The workshop covered software for QTL mapping in inbred and outcrossed populations and the conclusions are available at: http://www.stat.wisc.edu/~yandell/statgen/software/biosci/qtl.html. In those conclusions one can read that “[a] consensus was reached that there is considerable overlap in the kinds of matings handled and statistics produced by the various QTL mapping software packages,” clearly identifying the need for better coordinated efforts. Such coordination never took place, as is often the case in public research. As a result, most of those QTL packages are still available today, although in more sophisticated versions. They are all suitable for QTL mapping but use different statistical algorithms, present a different user interface, and necessitate different input and output file formats.
Some specialists in the field realized that the public software packages are usually too specialized and too technical in statistics to permit a thorough understanding by the many experimental geneticists and molecular biologists who would want to use them. In addition, the fast methodological advances, coupled with a range of stand-alone software, make it difficult for expert as well as non-expert users to decide on the best tools when designing and analyzing their genetic studies. Based on this rationale, a few commercial analytical pipelines emerged about a decade ago that include some of the QTL packages mentioned above. Two of them are Kyazma and GenStat®. These applications assist plant scientists by providing easy access to statistical packages for phenotypic and genotypic data. Kyazma was founded in the spring of 2003 (http://www.kyazma.nl/), and offers powerful methods for genetic linkage mapping and QTL analysis. Since 2003 Kyazma has taken over the development of the software packages JoinMap® and MapQTL® from Biometris of Plant Research International. Kyazma handles the distribution and support of JoinMap and MapQTL and, in collaboration with the statistical geneticists of Biometris, Kyazma provides introductory courses on genetic linkage mapping and QTL analysis in order to make the use of the software even more accessible. GenStat encompasses statistical data analysis software for biological and life science markets worldwide. GenStat includes the ASReml algorithm (average information algorithm for REML) to undertake very efficient meta-analyses of data with linear mixed models. The development of GenStat at Rothamsted began in 1968, when John Nelder took over from Frank Yates as Head of Statistics. Roger Payne took over leadership of the GenStat activity when John Nelder retired in 1985 (http://www.vsni.co.uk/). 
An important feature of GenStat is that it has been developed in (and now in collaboration with) a Statistics Department whose members have been responsible for many of the most widely used methods in applied statistics. Examples include analysis of variance, design of experiments, maximum likelihood, generalized linear models, canonical variates analysis, and recent developments in the analysis of mixed models by REML.
These commercial analytical pipelines offer a set of quality tools to researchers in plant science. However, they cover only a part of the configurable workflow system that is required for integrated breeding activities. In addition, there is a need for tools and analytical pipelines that are freely available and, if possible, based on open source code. This avoids dependence on private companies that might discontinue support, and it ensures access to the tools even with limited financial resources, a critical constraint in research for development, in which breeding programs of developing countries are key partners. It is important to underline that a version of GenStat that does not include the most advanced version of the different tools but allows users to run most basic analyses is available for breeding programs in developing countries. The web site for the GenStat Discovery Edition is http://www.vsni.co.uk/software/genstat-discovery/, but this version of the pipeline does not include QTL selection based on the mixed model approach, which is available in the commercial version.
The issue of open source code is an important one as, even for freely-available tools, the lack of availability of the source code limits the further expansion and customization of the tools. It also reduces the opportunity of researchers in developing countries to participate in methodology development. Over the last decade, R, a programming language and software environment for statistical computing and graphics, has become the reference in open source code for a broad range of biological applications, including genetic analysis (http://www.r-project.org/). Its source code is freely available under the GNU General Public License (http://en.wikipedia.org/wiki/GNU_General_Public_License). The R language has become a de facto standard among statisticians for the development of statistical software. It compiles and runs on a wide variety of UNIX, Windows, and MacOS platforms. R is similar to other programming languages, such as C, Java, and Perl, in that it helps people perform a wide variety of computing tasks by giving them access to various commands. For statisticians, however, R is particularly useful because it contains a number of built-in modules for organizing data, running calculations on the information, and creating graphical representations of the data sets. R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, etc.) [29] and graphical techniques, and is highly extensible. Close to 1,600 different packages reside on just one of the many web sites devoted to R, and the number of packages has grown exponentially. However, R is difficult to use directly and procedures based on R must be wrapped in user-friendly menu systems if field biologists are to use them.
Information Systems
A functional IS involves far more than an analytical pipeline; it is a complete system that should include:
- A project planning module
- A germplasm management module
- A robust relational database
- Analytical standards
- Data collection and cleaning tools
- Analytical and decision support tools
- Query tools
- A cyber infrastructure (CI) that links the different tools in a cohesive and user-friendly way
Key elements of an IS are obviously the CI and the DMS as described in the following section. The value of an IS does not only reside in the quality of the individual tools or modules that are part of it, but rather in the CI or middleware that ensures cohesion across tools and efficient communication with databases.
There are not many examples of breeding ISs in the public domain. One example is the ICIS (http://www.icis.cgiar.org, [23]). ICIS is an open source IS for managing genetic resource and breeding information for any crop species. It has been developed over the last 10 years through collaboration between centers of the CGIAR, some NARS, and private companies. The ICIS system is Windows-based, and distributable on CD-ROM or via the Internet. It contains a genealogy management system (GMS, [33]) to capture and process historical genealogies as well as to maintain evolving pedigrees and to provide the basis for unique identification using internationally accepted nomenclature conventions for each crop; a seed inventory management system (IMS); a DMS [75] for genetic, phenotypic, and environmental data generated through evaluation and testing, as well as for providing links to genomic maps; links to geographic ISs that can manipulate all data associated with latitude and longitude (e.g., international, regional, and national testing programs); applications for maintaining, updating, and correcting genealogy records and tracking changes and updates; applications for producing field books and managing sets of breeding material, and for diagnostics such as coefficients of parentage and genetic profiles for planning crosses; tools to add new breeding methods, new data fields, and new traits; and tools for submitting data to crop curators and for distributing data updates via CD-ROM and electronic networks. The community of ICIS collaborators communicates via the ICIS Wiki (http://www.icis.cgiar.org), where all design and development decisions are documented. Feature requests and bug reports are made through the ICIS Communications project and the source code is published through various other ICIS projects on CropForge (http://cropforge.org). A commercial company, Phenome-Networks, has implemented a Web-based IS based on ICIS (http://phnserver.phenome-networks.com/).
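Diagnostics such as the coefficient of parentage can be derived directly from stored genealogies. The following is a minimal sketch of that calculation and is not ICIS code: it uses the standard recursive kinship definition and assumes unrelated, non-inbred founders, whereas breeding programs working with fully inbred lines often adopt other conventions. The pedigree itself is hypothetical.

```python
from functools import lru_cache
from typing import Dict, Optional, Tuple

# Hypothetical pedigree: entry -> (parent 1, parent 2); founders have unknown parents.
# Entries are listed so that parents always appear before their offspring.
PEDIGREE: Dict[str, Tuple[Optional[str], Optional[str]]] = {
    "A": (None, None),
    "B": (None, None),
    "C": (None, None),
    "D": ("A", "B"),
    "E": ("A", "C"),
    "F": ("D", "E"),
}
ORDER = {name: index for index, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def coefficient_of_parentage(x: Optional[str], y: Optional[str]) -> float:
    """Probability that alleles drawn at random from x and y are identical by descent."""
    if x is None or y is None:
        return 0.0  # unknown parents are assumed unrelated to everything
    if x == y:
        p1, p2 = PEDIGREE[x]
        return 0.5 * (1.0 + coefficient_of_parentage(p1, p2))
    # Always expand the younger entry so the recursion walks back toward founders.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    p1, p2 = PEDIGREE[x]
    return 0.5 * (coefficient_of_parentage(p1, y) + coefficient_of_parentage(p2, y))


print(coefficient_of_parentage("D", "E"))  # D and E share one parent (A)
print(coefficient_of_parentage("A", "D"))  # parent and offspring
print(coefficient_of_parentage("F", "F"))  # F has related parents
```

When planning crosses, a breeder could use such values to avoid mating closely related parents or to estimate how much diversity a set of candidate parents represents.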
Another system available is the Katmandoo Biosciences Data Management System (http://www.katmandoo.org/, [25]), which is a freely available, open source DMS for plant breeders developed by PI&F, NSW DPI, and DArT Pty. Ltd. It comprises linked ISs for plant breeding including applications for capturing field data using hand-held computers, barcode-based seed management systems, and databases to store and link field trial data, laboratory data, genealogical data, and marker data. A particular focus is on the use of whole-genome MM information to create graphical genotypes, track the ancestral origin of chromosomal regions, validate pedigrees, and infer missing data. It includes the applications of the Pedigree-Based Marker-Assisted Selection System (PBMASS) developed by PI&F as well as a seed management system, a digital field book for hand-held computers, and a system for directly recording weights of barcoded samples.
Both ISs struggle with the problem of integrating the different components into a single configurable system which matches the workflows of different breeding projects. Such a workflow should provide the user all tools and analytical means required to run a crop cycle: from germplasm preparation and planting, through the collection of phenotypic and the production of the genotypic data and their analysis, to the identification of genotypes to be crossed or the selection of suitable genotypes to be planted in the next cycle (Fig. 1).
In order to do this effectively, a CI is required which allows syntactic linkage between different data resources and applications.
Cyberinfrastructure and Data Management
We have referred to the revolution in information and communication technology (ICT) and the opportunities it presents for improving the efficiency of plant breeding. However, plant breeding is not the only area of biology being affected by this revolution and, in fact, the successful deployment of MB depends on other fields of information-intensive biology delivering knowledge (markers and methodology) to plant breeding. Even more is expected of the ICT revolution in the developing world, as it offers an opportunity for scientists there to overcome some of the constraints of isolation, the “brain drain,” and the lack of infrastructure which have prevented them from fully participating in science for development in the past [76].
It is generally recognized that upstream biology is increasingly reliant on networks of integrated information and on applications for analyzing and visualizing that information. Discipline-specific (sequence and protein databases) and model organism ISs such as GrainGenes (http://wheat.pw.usda.gov/GG2/index.shtml), Gramene (http://www.gramene.org/), MaizeGDB (http://www.maizegdb.org/), and SoyBase (http://www.soybase.org/) have been developed to facilitate exchanges in molecular biology and functional genomics. As noted above, plant breeding depends on these upstream sciences of molecular biology, functional genomics, and comparative biology to deliver the knowledge needed to deploy MB. The bottleneck in the overall network has been the technology needed to integrate diverse and distributed information resources, and many information scientists have been working on this problem [24, 26, 77].
One constraint to integration of scientific information is the necessity to have a standard terminology for biological concepts across species and disciplines. A successful example of such standardization is the Gene Ontology (GO) initiative (http://www.geneontology.org, [78]). Another more specialized ontology initiative, especially pertinent to agriculture, is the Plant Ontology Consortium (POC: http://www.plantontology.org, [79–81]). However, these formal descriptions remain somewhat limited to biology of model plants and controlled environments. A key challenge will be to extend such standards to describe characteristics of plants growing in the unique, stress-prone environments found within the developing world to ensure a wider impact of such standards on international agriculture. The GCP has been working with POC to expand these ontologies to economic traits and farming environments so that they can be used in the field of plant breeding [82].
Another constraint to the efficient utilization of genomic information is the sheer volume of sequence data that can now be generated very cheaply across numerous genotypes. ISs to handle this volume of information are struggling to keep up. In plant biology, some examples of systems aiming to handle these torrents of data are the Germinate database ([83], http://bioinf.scri.ac.uk/public/?page_id=159) and the Genomic Diversity and Phenotype Connection (GDPC, http://www.maizegenetics.net/gdpc/). The primary goal of Germinate is to develop a robust database which may be used for the storage and retrieval of a wide variety of data types for a broad range of plant species. Germinate focuses on genotypic, phenotypic, and passport data, but has been designed to potentially handle a much wider range of data including, but not limited to, ecogeographic, genetic diversity, pedigree, and trait data, and will permit users to query across these different types of data. The developers have aimed to provide a versatile database structure, which can be simple, requires little maintenance, may be run on a desktop computer, and yet has the potential to be scaled to a large, well-curated database running on a server. The design of Germinate provides a generic database framework from which interfaces ranging from simple to complex may be used as a gateway to the data. The data tables are structured in a way that they are able to hold information ranging from simple data associated with a single accession or plant, to complex data sets, images, and detailed text information. Features of the Germinate database structure include its ability to access any information associated with a group of accessions and to relate different types of information through their association with an accession. 
The GDPC database was designed as a research database to support association genetics applications such as Tassel (http://www.maizegenetics.net/index.php?option=com_content&task=view&id=89&Itemid=119) and is being extended to handle higher and higher densities of genotyping and sequence data. The second version of Germinate seems quite similar to GDPC and if new databases are developed to handle the large data files to be generated soon through high-throughput sequencing, some conversion tools should be easily developed to migrate data from one system to another.
Finally, the problem of integrating all these diverse and widely-distributed information resources is a major informatics challenge, which is being tackled on several fronts at several levels of complexity. The BioMOBY project ([84], http://www.biomoby.org, [85]) and the Semantic Web seek to define standards that will allow computer programs to interpret requests for information or services, find informatics resources capable of fulfilling those requests, and return the results without the authors of the interacting software having specifically collaborated. In the private sector, solutions have been more pragmatic and Enterprise Software solutions have been developed to link data resources and applications with specific services. The iPlant Collaborative (http://www.iplantcollaborative.org/) is a National Science Foundation (NSF)-funded initiative designed to bring these Enterprise Software solutions to the biological sciences in the form of CI which can support any biological data resource and analytical application. iPlant and the GCP are collaborating on integrating plant breeding information resources and applications into the infrastructure. This will automatically link these resources to upstream biological applications using the same infrastructure such as that used by the Systems Biology Knowledgebase initiative (http://genomicscience.energy.gov/compbio/#page=news) of the US Department of Energy which will be producing knowledge needed for crop improvement.
With all the progress achieved in marker technology, software development, analytical pipelines, and DMS, it is time to provide an IS, available through a public platform, that will offer breeding programs in developed and developing countries access to modern breeding technologies, in an integrated and configurable way, to boost crop quality and productivity.
Case Study: GCP’s Integrated Breeding Platform
To fill this gap in the public sector and in particular in the arena of research for development, the GCP has been coordinating the development of the IBP (www.generationcp.org/ibp) in collaboration with scientists from ARIs, CGIAR centers, and national research programs since mid-2009. In a first phase the IBP aims at serving the needs of a set of 14 pioneer “user cases” – MB projects for eight crops in 16 developing countries in Africa and Asia. Leading scientists of those user cases help in testing the prototypes developed for the different tools of the analytical pipeline and contribute to the monitoring and evaluation of the platform development. This ensures that IBP development is driven by real breeding needs and its interface is user-friendly.
Objective of the IBP
The overall objective of the IBP project is to provide access to modern breeding technologies, breeding material, and related information and services in a centralized and functional manner to improve plant breeding efficiency in developing countries and hence facilitate the adoption of MB approaches. The short-term objective of the project (the initial phase) is to establish – through a client-centered approach – a minimum set of tools, data management infrastructure, and services to meet the needs and enhance the efficiency of the 14 user cases.
To achieve the overall objective, GCP is developing and deploying a sustainable IBP as a one-stop shop for information, analytical tools, and related services to design, implement, and analyze MB experiments. This platform should enable breeding programs in the public and private sectors to accelerate variety development for developing countries using marker technologies – from simple gene or transgene introgression to gene pyramiding and complex MARS and GWS projects. Hence IBP aims at bringing cutting-edge breeding technologies to breeding programs that are too resource-restricted to invest in the requisite genotyping and data management infrastructure and capacity on their own.
The IBP Partnerships
The primary stakeholders of the platform are plant scientists – at this time specifically breeders leading the selected MB projects of the 14 pioneer user cases. These pioneer user cases are all recently initiated marker-assisted breeding projects with specific budgets, objectives, and work plans. The needs of the projects are defining the user requirements, and hence the design and development prioritization of the different elements of the platform. In selecting the user cases, crop diversity was a primary consideration, since the platform is supposed to address the needs of a broad variety of crops. The platform’s reciprocal contribution to these breeding projects is in helping them overcome bottlenecks that would compromise final product delivery and in enhancing their overall efficiency and chances of success by providing appropriate tools and support.
The developmental phase of the IBP brings together highly regarded public research teams – institutes and individuals who have been working on the challenges of crop information management and analysis, biometrics, and quantitative genetics. This team of bioinformaticians, statisticians, and developers aims to design and develop the different elements of the platform, based on needs and priorities defined by the user cases.
A continuous dialogue between users, developers, and service providers ensures a healthy balance between having a user-driven platform on the one hand, with a reasonable degree of “technology push” on the other hand, to ensure that users are kept abreast of technological solutions they may not be aware of but that would facilitate and accelerate breeding work.
The private sector has led the application of MB approaches and utilization of MBPs. The IBP is the first public sector effort of this magnitude aimed at developing and deploying an MBP. Given that MB for complex polygenic traits, and more so MARS, is still in its infancy in the public sector, it is recognized that efficient partnerships with the major private sector transnational seed companies are a strong prerequisite for the success of the IBP project. Consultations are ongoing with leaders in MB at Limagrain, Monsanto, Pioneer-DuPont, and Syngenta. Partnership with the private sector mainly includes some technology transfer, especially for stand-alone tools, and access to human resources to advise on the development of the platform and contribute to developing new tools or implementing data management. The users, tools and services, and partnership of the platform are presented in Fig. 2.
The Platform
The IBP has three broad components (see Fig. 3): a Web-based portal and helpdesk, an open-source IS incorporating an adaptable breeding workflow system, and breeding and support services.
The stepwise development of the breeding workflow includes: (1) access to existing tools, (2) development of new stand-alone tools or adapted versions of existing tools to address the needs of the user cases, and (3) the integration of those tools into a CI (in collaboration with the iPlant initiative) or through a thin middleware linking with local databases to form a user-friendly configurable workflow system (CWS). A first version of the CWS, including an adequate set of tools, should be available by mid-2012, with full unfettered access scheduled for 2014.
Component 1: The Integrated Breeding Portal and Helpdesk
Inaugurated by mid-2011, the portal is the online gateway through which users access all the tools and services of the IBP. Through the portal, users will select and download tools and instructions, order materials, and procure laboratory services.
The portal’s helpdesk facilitates its use and ensures access for users who cannot efficiently use the Web interface by providing the elements they need via email, compact disc, and other offline media.
Through their user-friendly networking components, the Portal and Helpdesk will stimulate the development of collaborative crop-based and discipline-based communities of practice (CoPs). The CoPs are expected to promote the application of MB techniques and the utilization of facilitative information management technologies, enhance data and germplasm sharing, and generally advance modern breeding capacity by linking CGIAR Centers and ARIs with developing-country breeding programs and research organizations. There is a strong hope that CoPs will facilitate and accelerate a paradigm shift to a more collaborative, outward-looking, technology-enhanced approach to breeding.
Component 2: The Information System
The IBP IS is structured as a CWS, with access to both local databases and distributed resources, such as central crop databases, molecular databases from GCP partner sites and from public initiatives such as Gramene and GrainGenes.
The Configurable Workflow System
This CWS is the operational representation of the IS and will be implemented by assembling informatics tools into applications configured to match specific breeding workflows (e.g., for MAS, MABC, or MARS; Fig. 4). The tools are organized in a series of functional modules comprising the Integrated Breeding Workbench, which is really the background structure that implements the CWS.
The IBP CWS drives the users through the different practical steps or activities of an MB project. The setup of the experiment and the germplasm management are the first steps of any project, to be followed by a set of activities that can be repeated during subsequent crop cycles, depending on the breeding objective of the experiment:
- Germplasm evaluation
- Genetic analysis
- Data management
- Data analysis
- Breeding decisions
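As a rough illustration of how such activities could be assembled into a configured workflow, the following Python sketch represents the setup steps and the repeatable per-cycle activities listed above as plain data. The function name, the data layout, and the choice to run every activity in every cycle are assumptions made for illustration only; they do not describe the actual IBP software.

```python
# Hypothetical sketch: a breeding workflow as one-time setup steps followed
# by activities repeated in each crop cycle. All names are illustrative.

SETUP_STEPS = ["Experiment setup", "Germplasm management"]

CYCLE_ACTIVITIES = [
    "Germplasm evaluation",
    "Genetic analysis",
    "Data management",
    "Data analysis",
    "Breeding decisions",
]


def build_workflow(scheme, n_cycles, activities=None):
    """Return a dict holding the ordered (cycle, step) pairs for a project.

    scheme     -- label such as "MAS", "MABC", or "MARS" (used only as a tag)
    n_cycles   -- number of crop cycles to plan
    activities -- per-cycle activities; defaults to the full list above
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    per_cycle = list(activities) if activities is not None else CYCLE_ACTIVITIES
    plan = [(0, step) for step in SETUP_STEPS]  # cycle 0 = project setup
    for cycle in range(1, n_cycles + 1):
        plan.extend((cycle, step) for step in per_cycle)
    return {"scheme": scheme, "steps": plan}


workflow = build_workflow("MARS", n_cycles=3)
for cycle, step in workflow["steps"]:
    print(f"cycle {cycle}: {step}")
```

Passing a shorter `activities` list would model a project whose breeding objective does not require every activity in each cycle.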
The Integrated Breeding Workbench
The workbench starts as a blank slate and the first task for the user is to open or create a project. A project manages a breeding workflow for a particular crop and a specified user. The initial sets of tools which should be available are grouped in seven modules: Administration Tools, Configuration Tools, Query Tools, and Workflow Initialization Tools (genealogy, data management, analysis, and decision support; Fig. 5).
The administration module of the workbench specifies the crop, which identifies the central (public) data resources that will be accessible to the project. This includes a central genealogy database, a central phenotype database, a public gene management database, and a central genotype database. Each installation provides access to local (private) data resources. These data resources include a private or local database for the above data types as well as a seed inventory management system. Each installation has at least one user with administrative privileges. Users are identified by authentication codes (username and password) for access to specific private data resources. (“Private” simply means “requiring authentication for access” and several users may have access to the same private data.)
The first functionality of the workbench asks the user to open a project by selecting from a list of available project configuration “files.” Once the configuration is selected, the availability of the public data resources should be checked, the user authentication codes verified, and the local data resources checked. Next, the list of modules should be reviewed and checked for availability and, depending on the state of the workflow, icons or menus should be made available for modules and tools.
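The order of checks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workbench implementation: the function `open_project`, the layout of the configuration dictionary, and the plain-text credential lookup (a stand-in for real authentication) are all hypothetical.

```python
# Hypothetical sketch of the project-opening checks; not the IBP code.

def open_project(config, reachable, credentials, installed_tools):
    """Run the checks in order and return the modules to display.

    config          -- dict with "public", "local", "user", and "modules" keys
    reachable       -- set of data resource names that currently respond
    credentials     -- dict mapping username -> password (stand-in for real auth)
    installed_tools -- set of tool names installed on this machine
    """
    # 1. Public (central) data resources must be available.
    missing = [r for r in config["public"] if r not in reachable]
    if missing:
        raise RuntimeError(f"public resources unavailable: {missing}")

    # 2. Verify the user's authentication codes.
    name, password = config["user"]
    if credentials.get(name) != password:
        raise PermissionError(f"authentication failed for {name}")

    # 3. Local (private) data resources must be available.
    missing = [r for r in config["local"] if r not in reachable]
    if missing:
        raise RuntimeError(f"local resources unavailable: {missing}")

    # 4. Offer only the modules whose tools are all installed.
    return [module for module, tools in config["modules"].items()
            if all(tool in installed_tools for tool in tools)]


config = {
    "public": ["central genealogy", "central phenotype"],
    "local": ["local genealogy", "seed inventory"],
    "user": ("breeder1", "secret"),
    "modules": {
        "Query Tools": ["study browser"],
        "Analysis": ["qtl mapper"],
    },
}
available = open_project(
    config,
    reachable={"central genealogy", "central phenotype",
               "local genealogy", "seed inventory"},
    credentials={"breeder1": "secret"},
    installed_tools={"study browser"},
)
print(available)
```

In this example only the module whose tools are installed is offered to the user; a failed resource or authentication check stops the process before any menu is shown.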
The configuration tools allow users to:
- Select or specify naming conventions for germplasm, germplasm lists, studies, etc.
- Use and update ontologies such as germplasm methods and the trait dictionary
- Update breeding, testing, or collection locations
- Create and modify study templates
The query tools will depend on the data resources specified in the project configuration, and examples are:
- A germplasm and pedigree viewer
- A study browser to view phenotype or genotype data
- A data miner for identifying data patterns
- A cross-study query builder for linking different data sets
- A gene catalog viewer for viewing genetic diversity
- A genotype and trait viewer for visualizing graphical genotypes and trait markers
The workflow initialization tools comprise a set of modules (genealogy, data management, analysis, and decision support tools) that provide the user with a choice of different tools to achieve precise breeding objectives. Users might construct different breeding workflows to match their project activities. The user will only see the workbench tools and settings for those tools required to execute the steps in a particular breeding workflow, and at the appropriate step in that workflow.
The development of each tool is overseen by a team of IBP researchers, developers, and users who design, mock up, and prototype the tools of the breeding application and pass the specifications to a software engineering team. They will then monitor the development and test and support the application. For each application, the team develops a description of the application, functional specifications of all the tools, workflow specifications for the application, and an interface mockup. A workflow for a MARS project is shown in Fig. 6.
Component 3: IBP Services
The Services component comprises two modules. The first module, Breeding Services, provides services to conduct MB projects. The second module, Support Services, deals with training and capacity-building, aiming to provide support and improve capacity of NARS breeders to deliver improved germplasm through marker approaches – essential for the adoption of MB approaches and the MBP.
Breeding Services
These services provide access to specific germplasm, and assist with contracting a service laboratory to conduct the marker work or to quantify specific traits, such as metabolite profiles or grain quality parameters. The module has three elements (Fig. 7):
Genetic Resource Support Service: Access to suitable germplasm and related information from the different partners is a critical element of the portal. To address this, a Genetic Resource Support Service (GRSS) plans to tap into the CGIAR System-wide Genetic Resources Program (SGRP), a collaborative effort between GCP and existing gene banks in the CGIAR and NARS. The GRSS should ensure quality control, maintenance, and distribution of genetic resources, including reference sets and segregating populations acquired or generated through projects supported by GCP, and material generated from other sources and deposited with the GRSS (e.g., maize introgression lines from Syngenta).
Marker Service: The portal provides a set of online options for users to access different high-throughput marker service laboratories in the public and private sectors with clear contractual conditions. Service Laboratories have been selected on the basis of competitive cost, compliance with quality control requirements, and expeditious delivery, but are currently accessible by offline processes pending deployment of the IBP portal.
Trait and Metabolite Service: The portal provides a set of options for users to access laboratories specialized in the evaluation and analysis of specific traits, such as quality traits, pathology screening, or metabolite quantification. Analyses of certain secondary traits and metabolites that are indicative of plant stress tolerance can potentially provide valuable information to be used in breeding. Such analyses are generally prohibitively expensive if done locally, as it is difficult to maintain assay quality and devote the necessary resources for expertise, quality control, and specialized facilities.
Capacity Development and Support Services
Capacity development is an integral part of the project, encompassing training and support in using MB techniques and markers, designing breeding strategies, quality data management, information analysis and decision modeling, phenotyping protocols, and protection of intellectual property (IP).
The main objective of this set of services is therefore to provide backstopping and training in a broad set of disciplines, to complement the elements of the breeding services and address specific technical and logistical bottlenecks. Such expert assistance is essential for the adoption and proper use of new technologies. Services that will be available include:
Breeding plan development: It is essential to develop a breeding plan with a cost–benefit analysis before conducting a multi-cycle MB project. Depending on the nature of the experiment, such a plan may be quite simple or very elaborate, from the transfer of a single region (e.g., transgene) to complex selection that can consider the simultaneous transfer of dozens of regions. The critical factor is that the plan must detail all the activities over time, and the costs and benefits of the project to determine if it is worthwhile conducting the experiment. The platform provides templates and associated cost calculation sheets for different breeding schemes.
Information management: Under this service, assistance is provided in installing and parameterizing the platform IS for use by specific breeding projects.
Data curation: This service assists with capturing and curating current data for particular breeding projects, and in entering them into the integrated IS. This step is absolutely critical for quality control and further sharing of the information, and a contact person for each of the pioneer user cases has been identified to ensure good communication between the platform and the users.
Design and analysis: This service provides support on statistics, bioinformatics, quantitative genetics, and molecular biology. It includes training in data generation, handling, processing, and interpretation, as well as experimental design from field planting to MAS and MABC schemes. It provides assistance with the “translation” of the molecular context to the breeding context, and it will ensure that the methodology developed for the design and analysis of breeding trials is rapidly available to the users.
Phenotyping sites and screening protocols: Through this service, users can access information on phenotyping sites, protocols, and potential collaborators to ensure that selection is carried out under appropriate biotic and abiotic stresses and that the adaptation of germplasm is well characterized. Characterization of phenotypic sites includes geographical information, meteorological historical data, soil composition, and field infrastructure.
Genotyping Support Service (GSS): The GSS aims to facilitate access by developing country national agricultural research institutes to genotyping technologies, and bridge the gap between lab and field research. This service provides financial and technical support for NARS breeders to access cost-efficient genotyping services worldwide and supports training activities in experimental design and data analysis for MB projects.
Intellectual property (IP) and policy: This service provides support on IP rights and freedom to operate in the arena of biotechnology and germplasm use. The service is currently being provided on an experimental basis through a virtual IP Helpdesk hosted by the GCP web site at http://www.generationcp.org/iphelpdesk.php.
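To make the cost–benefit step of breeding plan development more concrete, the following Python sketch totals planned activity costs per crop cycle and compares them with an expected benefit. It is only an illustration under simple assumptions: the function `summarize_plan`, the activity names, and every figure are invented placeholders, no discounting over time is applied, and nothing here reproduces the platform's own cost calculation sheets.

```python
# Hypothetical cost-benefit sketch for a multi-cycle breeding plan.
# All figures are invented placeholders, not data from any project.

def summarize_plan(activities, expected_benefit):
    """Total the planned costs per cycle and compare them with the benefit.

    activities       -- list of (cycle, activity_name, cost) tuples
    expected_benefit -- estimated value of the project outcome, same currency
    """
    cost_by_cycle = {}
    for cycle, _name, cost in activities:
        cost_by_cycle[cycle] = cost_by_cycle.get(cycle, 0.0) + cost
    total_cost = sum(cost_by_cycle.values())
    return {
        "cost_by_cycle": cost_by_cycle,
        "total_cost": total_cost,
        "net_benefit": expected_benefit - total_cost,
        "worthwhile": expected_benefit > total_cost,
    }


plan = [
    (1, "genotyping", 4000.0),
    (1, "field evaluation", 6000.0),
    (2, "genotyping", 2500.0),
    (2, "crossing", 1500.0),
]
summary = summarize_plan(plan, expected_benefit=20000.0)
print(summary)
```

A real plan would likely weigh costs and benefits that fall in different years differently and would treat the benefit estimate as uncertain; the sketch only shows the bookkeeping of listing activities over time against their costs.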
Integrated Breeding Hubs
While few today question the usefulness of basic local laboratories, it is also generally accepted that large-scale genotyping activities are best outsourced to cost-effective, high-throughput service laboratories, irrespective of location. Following that rationale, the IBP provides access to marker service laboratories as the main avenue to generate the large amount of genotyping data that will be necessary to support the extensive MABC programs of the future, starting with the user cases. The GCP nevertheless also recognizes the need to provide breeders in developing countries with access to some regional hubs. At the beginning of the project four regional hubs are envisioned, covering the needs of the Americas – Centro Internacional de Agricultura Tropical (CIAT, www.ciat.cgiar.org); Africa – BioSciences eastern and central Africa (BecA, http://hub.africabiosciences.org); South Asia – International Crops Research Institute for the Semi-Arid Tropics (ICRISAT, www.icrisat.org); and South East Asia – International Rice Research Institute (IRRI, www.irri.org).
These regional hubs are expected to provide the following services:
- In-house hands-on training (different formats are possible from short- to medium-length periods), with the objective of exposing scientists to new technologies and their applications to breeding.
- Training courses for selected groups of researchers, targeting basic knowledge of marker technologies and their applications, as well as data analysis. These courses can be used for the testing and validation of learning materials, which will then be continuously upgraded.
- Facilitation of small genomic and genotyping projects led by national programs, academia, and small and medium enterprises (SMEs).
- Marker services for “small” and “orphan” crops that do not have mass demand from breeding programs and would therefore not benefit from large service providers, due to the lack of availability of SNP markers and the need to use lower-throughput SSR or other markers that can more easily be handled in lower-tech laboratories.
The Genomics and Molecular Breeding Hubs should help raise the visibility of the IBP and thus help promote the adoption of MB. Collaboration between the IBP and the regional hubs is anticipated to occur through sharing information, guiding users to apply for the appropriate service, organizing training events, and planning other developments of common interest.
Scope and Potential for Molecular Breeding Platforms
Gaps Across Countries and Crops
The application of MB approaches is now routine in developed countries, as is the integration of facilitative information and communication technologies, which are critical given the immense volumes of data necessary for, and generated by, these breeding processes. However, the situation is very different in developing countries, where MB is still far from routine in its application in breeding programs, particularly in Africa. This is especially critical due to the monumental and urgent imperative to rapidly achieve food security and improve livelihoods for a rapidly growing population through breeding for biotic stresses (including weeds, pests, and diseases) and abiotic stresses (including physical soil degradation, nitrogen deficiency, drought, heat, cold, and salinity) – conditions that make accurate phenotyping challenging. Fortunately, the history of modern breeding in developing countries is comparatively short, allowing a larger potential for crop improvement relative to the genetic gains that can be obtained at this time in developed countries, in which extensive breeding has been applied to crops for a longer time.
The capacity of national research institutions to address these issues, in terms of funds, infrastructure, and expertise, is directly related to the strength of their national economies [86]. This is reflected in the sharp differences in the capacity to conduct and apply biotechnology research as observed across developing countries (FAOBioDeC, http://www.fao.org/biotech/inventory_admin/dep/default.asp), and by the same token in their capacity to establish and/or utilize MBPs. The result is a three-tier typology of developing countries, directly attributable to the level of each country’s investment in agricultural R&D [87].
Tier-1 countries, comprising newly industrialized countries (NICs) such as Brazil, China, India, Mexico, South Africa, and Thailand, substantially invest in technology and R&D and are self-reliant in most aspects of marker technologies [88, 89]. These countries have the simultaneous potential to effectively adopt, adapt, and apply information and communication technologies to enhance research efficiency and outputs. They are therefore naturally at the vanguard in adopting MBPs.
Mid-level developing world economies (tier-2) such as Colombia, Indonesia, Kenya, Morocco, Uruguay, and Vietnam are well aware of MB’s importance, and some effectively apply marker technologies for germplasm characterization [90–93] and selection of major genes [94–99]. These countries have a matching potential for a limited utilization of MBPs, a potential that can be enhanced fairly rapidly in the medium to long term.
Low-level developing world economies (tier-3 countries) are struggling to sustain even basic conventional breeding. They have very limited or no application of MB approaches and are unlikely to adopt MBPs except in the long term.
Resource-limited breeding programs in many developing countries, especially tier-3 countries, are severely hampered by a shortage of well-trained personnel, low levels of research funding, inadequate access to high-throughput genotyping capacity, poor and inadequate phenotyping infrastructure, lack of ISs and appropriate analysis tools, and by the logistical difficulty of integrating new approaches with traditional breeding methodologies – including problems that arise when scaling up from small to large breeding programs.
Until recently, the scarcity of available genomic resources for clonally propagated crops, for some neglected cereals such as millet, and for less-studied crops such as most tropical legumes, which are all very important crops in developing countries, represented a further constraint to agricultural research for development [100], thereby limiting the application of molecular approaches and hence the potential for MBPs. However, the recent emergence of affordable large-scale marker technologies (e.g., DArT [101]), the sharp decline of sequencing costs boosting marker development based on sequence information [102], and the explicit efforts of national agricultural research programs (e.g., India [103]) and international initiatives (e.g., GCP [104]) have all resulted in a significant increase in the number of genomic resources available for less-studied crops. As a result, most key crops in developing countries now have adequate genomic resources for meaningful genetic studies and most MB applications.
Similarly, international efforts such as GCP’s IBP are designed to help overcome the challenges of developing-country breeders – exploiting economies of scale by making available convenient and cost-effective collective access to cutting-edge breeding technologies and informatics hitherto unavailable to them, including genomic resources, advanced laboratory services, and robust analytical and data management tools. Together, this increasing availability of genomic resources and tools for previously neglected but important crops and the access to initiatives targeting the resource-challenged NARS of the developing world will hasten the adoption of MBPs for these countries.
Institutional, Governmental, and Public Support
While corporate and other proprietary MBPs need only meet the specific requirements of a particular corporation or of specific paying clients, the development of platforms targeted at breeding programs in the developing world requires a broad consensus among the parties that would use them, as well as support for them from multiple overseeing organizations. This is because these platforms are built on the premise of minimizing costs and maximizing benefits through economies of scale generated through collective access by multiple partners.
The public-access MBPs would therefore be critically dependent on well-structured MB programs, which may not be a reality in many developing countries. A good structure would entail compliance with common or compatible:
- Good field infrastructure, including a meteorological station
- Good agronomical practices at experimental stations
- Crop ontology information system
- Data collection, management, and analysis protocols
- Breeding plan design
- Information and communication technology infrastructure
- Informatics tools for analysis, decision support purposes, and eventually modeling and simulation
Traditionally, developing world breeding programs have largely been poorly funded and poorly supported, and have been primarily driven by donor organizations [105, 106]. The lack of in-country support has often limited the dependent breeding activities to no more than a basic level. Under such circumstances, it was unrealistic to anticipate the adoption of new biotechnologies – including the utilization of MBPs. Fortunately, this scenario is changing. In 2003, through the Comprehensive Africa Agriculture Development Programme (CAADP, http://www.caadp.net/implementingcaadp-agenda.php), African governments committed to invest more in food security and in agriculture-led growth. Since then, many countries in Africa and elsewhere have developed comprehensive agricultural development strategies.
There is also a growing participation by foundations and nongovernmental organizations, and more recently the emergence of public–private sector partnerships (e.g., US Global Food Security Plan, http://www.state.gov/s/globalfoodsecurity/129952.htm). This governmental and institutional commitment is critical for the adoption of biotechnologies in general [8, 107] and for MB adoption in tier-2 countries in particular, with the attendant establishment and utilization of MBPs.
Challenges, Risks, and Opportunities
Challenges hampering the potential of MBPs in developing countries include both factors applicable generally to MB and those specific to MBPs. These factors encompass infrastructure capacity, human resource, and operational and policy issues. But amidst the challenges there are also actual and potential opportunities.
Human Capacity
Human capacity for MB technologies in developing countries is a challenge, and limitations include substandard agriculture programs at universities; difficulties in keeping up to date with relevant developments, including failures by others; poor technical skills in core disciplines; isolation as a result of insufficient peer critical mass in the workplace; and poor incentives to attract and retain scientists, resulting in brain drain and staff turnover [108].
To partially offset the undesirable trend of losing the “champions” and to “generate” more “champions,” novel international initiatives like the Alliance for a Green Revolution in Africa (AGRA) support high-quality education in the South. Examples include the African Centre for Crop Improvement (ACCI, http://www.acci.org.za/) based at the University of KwaZulu–Natal in South Africa and the University of Ghana-based West African Centre for Crop Improvement (WACCI, http://www.wacci.edu.gh/). Both institutes offer doctorate degrees in modern breeding to African students, with the fieldwork component being carried out in the students’ home countries.
While obtaining their Ph.D. in plant breeding, these scientists study the principles of marker technologies, equipping them to undertake MB activities. To retain this much-needed expertise in Africa, the WACCI and ACCI programs also provide post-Ph.D. funds for these scientists to conduct research in their home countries and, in some cases, provide matching funds for their career advancement.
Precise Phenotyping
There can be no successful MB program without precise phenotyping of the target traits. Reliable phenotypic data is a must for good genetic studies [109] and most developing countries lack suitable field infrastructure for good trials and collection of accurate phenotypic data. As part of the services of a good MBP, guidelines on best practice must be provided on how to design and run a trial and conduct precise phenotyping for genetic studies under different target environments. Improving access to homogeneous field areas, and paying attention to good soil preparation and homogeneous sowing are critical. The development of new geographic IS tools [102, 110], experimental designs, phenotyping methodologies [111, 112], and advanced statistical methods [113] will facilitate the understanding of the genetic basis of complex traits [114] and of genotype-by-environment (G×E) interactions [48, 115]. Improving phenotyping infrastructure in developing countries must thus be a top priority to promote modern breeding and utilization of MBPs [106].
Laboratories for Marker Services
Genotyping can be expensive when it is performed in small laboratories using labor-intensive and low-throughput markers such as SSRs. This has traditionally limited the use of MMs in developing countries beyond the fingerprinting of germplasm with a small number of markers or the use of MAS for a few key traits. Operational efficiency is also vital, because fundamental timelines must be respected to ensure that no crop cycle is lost. Indeed, at every selection cycle, a service laboratory may have only a few weeks (time between DNA being extracted from leaves harvested on plantlets and the flowering time) to conduct the analysis and return the data to the breeders to enable them to conduct appropriate crosses among selected genotypes.
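The timing constraint described above can be made concrete with a small calculation. The following Python sketch is only an illustration: the function names, the shipping and reporting allowances, and the dates in the usage note are hypothetical assumptions, not values from any actual service laboratory.

```python
from datetime import date


def genotyping_window_days(sampling: date, flowering: date,
                           shipping_days: int = 0,
                           reporting_days: int = 0) -> int:
    """Days available to the laboratory between leaf sampling and flowering.

    Shipping of samples and return of data are subtracted because they
    consume part of the same window (both allowances are assumptions).
    """
    total_days = (flowering - sampling).days
    return total_days - shipping_days - reporting_days


def fits_in_cycle(sampling: date, flowering: date, lab_turnaround_days: int,
                  shipping_days: int = 0, reporting_days: int = 0) -> bool:
    """True if the quoted laboratory turnaround fits inside the window."""
    window = genotyping_window_days(sampling, flowering,
                                    shipping_days, reporting_days)
    return lab_turnaround_days <= window
```

For instance, with leaves sampled on 1 January and flowering expected on 12 February of the same year, one week of shipping and three days of reporting leave the laboratory 32 days; a quoted turnaround of 35 days would then cost the breeder a crop cycle.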
There is general agreement today that basic local laboratories at national and regional levels can be useful at least to service small local needs such as fingerprinting of a limited number of accessions, GMO detection or MAS for specific traits, or for teaching and training purposes. It is also generally accepted that large-scale genotyping activities are best outsourced to advanced, modern, cost-effective high-throughput service laboratories, irrespective of the original location of the needs. This outsourcing is driven by the evolution in marker technologies. The advent of SNP genotyping led the shift from the low-throughput, primarily manual world of SSRs to high-throughput platforms powered by robotics and automated scoring, better handled by dedicated service laboratories [102, 116, 117]. As a result, genotyping costs have decreased by up to tenfold while data throughput has increased by the same magnitude. An example for MARS is provided in Fig. 6. SNP markers are increasingly available for most mainstream crops and for several less-studied crops [118, 119], which are important in developing countries.
A particular effort will be needed to ensure an easy and reliable way to track samples from the field to the laboratory, and back to the field – it will hence be vital to carefully identify DNA samples from material collected in the field. Such documentation should optimally be through bar-coding, and all information pertaining to management of field trials or experiments should be recorded in electronic field books. Marker work would of necessity be subcontracted to a service lab with a good and preferably platform-compatible laboratory information management system (LIMS).
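As a rough illustration of the field-to-laboratory-to-field tracking described above, the following Python sketch links each leaf sample to its plot through a barcode and reports any laboratory result whose barcode is unknown. Everything here is hypothetical: the record layout, the barcode format, and the function names are assumptions for illustration and do not come from any specific LIMS or electronic field book.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PlotSample:
    """One leaf sample taken from one field plot (hypothetical layout)."""
    trial: str
    plot: int
    genotype: str
    marker_calls: Dict[str, str] = field(default_factory=dict)


def make_barcode(trial: str, plot: int) -> str:
    """Build a barcode string from trial and plot; the format is assumed."""
    return f"{trial}-P{plot:04d}"


def register_samples(samples: List[PlotSample]) -> Dict[str, PlotSample]:
    """Index samples by barcode and refuse duplicate labels."""
    registry: Dict[str, PlotSample] = {}
    for sample in samples:
        code = make_barcode(sample.trial, sample.plot)
        if code in registry:
            raise ValueError(f"duplicate barcode {code}")
        registry[code] = sample
    return registry


def attach_results(registry: Dict[str, PlotSample],
                   lab_rows: List[Tuple[str, str, str]]) -> List[str]:
    """Attach (barcode, marker, call) rows from the lab to their samples.

    Returns the barcodes that could not be matched to any registered
    sample, so that tracking errors surface before crosses are planned.
    """
    unmatched: List[str] = []
    for barcode, marker, call in lab_rows:
        sample = registry.get(barcode)
        if sample is None:
            unmatched.append(barcode)
        else:
            sample.marker_calls[marker] = call
    return unmatched
```

Keying every record on the barcode alone, rather than on free-text plot or genotype names, is the design choice that lets marker data be mapped back to the right plant in the field.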
Data Management
For breeders to efficiently access relevant information generated by themselves and by other researchers, reliable data management (including sample tracking, data collection and storage, and modern analytical methodologies and tools for accurate decision making, among others) is critical both within a given MB program and across programs. In view of this, it is essential that breeders manage pedigree, phenotypic, and genotypic information through common or mutually compatible crop databases, in keeping with the collective access principle of a public MBP. The format of databases would need to be user-friendly and compatible with field data collection devices and applications to encourage both adoption and compliance. Ultimately, data collection and management processes would need to link seamlessly with a platform-resident workbench for analysis, modeling, simulation, and decision support for full utility of the breeding platform.
Paradigm Shift: Collaborative Work and Data Sharing
Access to information and products generated by fellow users is a potentially critical incentive for breeders to use the platform and share their own data with other users. However, this would require a fundamental paradigm shift from the present data-hoarding, inward-looking approach to research common to breeders. Such a shift may only be achievable if it is a clear requirement in the terms of engagement for membership of a “platform community,” or if distinct financial and other incentives are offered for such sharing.
Technology-Push Versus Demand-Driven
An MBP is by nature a high-level technological solution. It carries with it the inherent risk of failing to address fundamental practical problems of developing-world breeding programs, which will often by nature be technology-deficient. Such platforms therefore face the challenge of ensuring that they meet targeted user objectives and address practical constraints.
However, with this challenge comes an opportunity to introduce advanced MB methodologies to developing world breeders, by encouraging change that will enable them to take advantage of the efficiencies and economies of scale offered by the MBP. This opportunity would be particularly reachable with bottom-up platform design and development that actively engages and involves the breeders – including elements of human resource capacity development and support in usage.
Adoption and Use by Breeders
An MBP would only make a difference if it is adopted and widely used by the breeders. The most important element influencing this would be credibility – a function of the quality of the technology, the awareness of potential users, the ease of access, and initial incentives. There is a need for successful public sector developing-country examples to demonstrate that the platform can effectively enhance the efficiency of breeders through the use of modern approaches – a clear demonstration of the added value of using the platform.
Sustainability of the Platform
Sustainability would be a challenge for MBPs targeting developing world breeding programs, given their resource limitations. These programs may not be able to meet the full cost of platform usage, and the cost of maintaining and updating the different elements of the platform on a regular basis – particularly tools and facilities that must keep abreast with evolving information and communication technologies.
Of course, platform sustainability is directly linked to its adoption by breeders, and sustainability strategies must be adapted to the diversity and financial resources of the potential clients, from developing-world national agricultural research institutes with limited resources to SMEs. Service costs might also be adjusted if clients are willing to share data and release germplasm through the platform.
Platform managers may also have to consider other innovative options like on-platform advertising by agriculture-related commercial enterprises. However, ongoing donor support would most likely still be required in the medium to long term.
Communities of Practice
The development of platform-based MB communities of practice is an additional potential avenue for platform adoption and sustainability. Such communities connect groups of crop researchers, mainly breeders, who are willing to share experiences and information on modern breeding methods, best field practices, and development of improved varieties, and to practice peer-to-peer mentoring. They also provide a means to resolve recurring breeding problems quickly and efficiently. Partnerships between developed and developing-country institutions, and between the private and public sectors, are also an opportunity for realizing the full potential of MB [87, 108].
Many other hurdles limit successful public sector utilization of MB opportunities [120, 121]. However, the potential of virtual MBPs made possible by the revolution in information and communication technologies provides opportunities to counter and overcome many of those shortcomings.
Potential Economic Impact of Molecular Breeding Platforms
By its nature, MB improves the efficiency of crop breeding – progressively increasing genetic gains by selecting and stacking favorable alleles at target loci. The utilization of MBPs accelerates and amplifies the advantages of MB by introducing significant efficiencies in resource and time usage. Predictive or designer breeding, which would be the ultimate result of information-rich MB, attainable through the use of MBPs by numerous different breeding programs that freely share data and germplasm, would particularly bring about these savings in resources and time.
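As a minimal sketch of the selection step behind "stacking favorable alleles at target loci," the following Python fragment counts favorable alleles per candidate line and ranks the lines accordingly. The locus names, the two-letter genotype encoding, and the function names are illustrative assumptions and are not taken from any particular MBP tool; real programs would also weigh allele effects, genetic background, and phenotypic data.

```python
from typing import Dict, List


def favorable_allele_count(genotype: Dict[str, str],
                           favorable: Dict[str, str]) -> int:
    """Count copies of the favorable allele across all target loci.

    `genotype` maps locus name to a two-letter diploid call such as "AG";
    `favorable` maps locus name to the single favorable allele letter.
    Loci missing from `genotype` contribute zero.
    """
    return sum(genotype.get(locus, "").count(allele)
               for locus, allele in favorable.items())


def select_for_crossing(population: Dict[str, Dict[str, str]],
                        favorable: Dict[str, str],
                        n: int) -> List[str]:
    """Return the n line names carrying the most favorable alleles.

    Ties are broken alphabetically by line name so the result is
    reproducible from run to run.
    """
    ranked = sorted(
        population,
        key=lambda line: (-favorable_allele_count(population[line], favorable),
                          line),
    )
    return ranked[:n]
```

In this toy encoding, a line homozygous for the favorable allele at two loci scores 4, a double heterozygote scores 2, and a line carrying neither allele scores 0, so successive cycles of crossing the top-ranked lines progressively concentrate the favorable alleles.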
However, a direct comparison of the cost-effectiveness of MB with phenotypic selection is not straightforward. Firstly, factors other than cost – such as trade-offs between time and money – play an important role in determining the selection method. Secondly, this choice is further complicated by the fact that the two methods are rarely mutually exclusive or direct substitutes for each other [122]. On the contrary, under most breeding schemes, they are in fact complementary. Where operating capital is not a limitation, MB maximizes the net present value, especially when strengthened through MBPs [123]. With the increasing ease of accessing marker service laboratories and the declining cost per marker data point, MB costs are shrinking, making it extremely attractive from a purely economic perspective.
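The net-present-value argument can be illustrated with a short calculation. The sketch below is an assumption-laden toy model, not the method used in the cited studies: it treats a breeding scheme as a stream of annual costs until variety release followed by a stream of annual benefits, and all function names and figures are hypothetical.

```python
from typing import List


def npv(rate: float, cash_flows: List[float]) -> float:
    """Net present value of yearly cash flows, with year 0 undiscounted."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def scheme_cash_flows(years_to_release: int, annual_cost: float,
                      annual_benefit: float, horizon: int) -> List[float]:
    """Costs during breeding, then benefits until the end of the horizon."""
    breeding = [-annual_cost] * years_to_release
    adoption = [annual_benefit] * (horizon - years_to_release)
    return breeding + adoption


# Hypothetical comparison: MB costs more per year but releases 2 years earlier.
conventional = scheme_cash_flows(years_to_release=8, annual_cost=1.0,
                                 annual_benefit=10.0, horizon=20)
molecular = scheme_cash_flows(years_to_release=6, annual_cost=1.5,
                              annual_benefit=10.0, horizon=20)
npv_conventional = npv(0.05, conventional)
npv_molecular = npv(0.05, molecular)
```

Under these made-up figures the earlier release outweighs the higher annual cost, which is the trade-off between time and money referred to above; with different costs, benefits, or discount rates the ranking could change.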
However, once the technological hurdles are overcome, the ultimate impact of new technologies (such as MBPs) is often limited by the lack of, or ineffective, seed distribution systems or by distant markets. SMEs are critical in promoting access to, and distribution of, improved seeds, thus helping alleviate a major bottleneck to the impact of improved breeding on smallholder farmers [124, 125].
Few economic analyses have been conducted to objectively assess the potential impacts of MB in the public sector, and none for MBPs that are just now emerging as a tool for breeding in the public sector.
Of the few analyses done to date, one evaluates the economic benefits of MABC using preexisting MMs in developing rice varieties tolerant to salinity and P-deficiency [126] in Bangladesh, India, Indonesia, and the Philippines. Encompassing a broad set of economic parameters, the study concluded that MABC saves an estimated minimum of 2–3 years, resulting in significant incremental benefits in the range of USD 300–800 million depending on the country, the extent of abiotic stress encountered, and the lag for conventional breeding [127].
Future studies are likely to confirm the positive economic benefits of MB and, given that MBPs amplify the benefits of MB, it can be reasonably inferred that the emerging platforms would indeed further enhance those economic benefits.
Future Directions
MBPs will inevitably have a significant impact on crop breeding in developing countries in the medium to long term because of:
- The needs-driven demand for improved crop varieties to counter the global food crisis
- The exponential development of genomic resources
- The ever-declining cost of marker technologies
- The increasing occurrence of public–private partnerships, where the public sector can learn from private companies about best practices for integrating MB into their breeding programs
- The need for innovative solutions to the challenges of resource and operational limitations
The first challenge of MBPs will be to meet the immediate needs of the breeders in developing-country public and private programs. The first step will be to provide them with the tools for enhancement of their current breeding programs, through the implementation of field books, pedigree management, and basic statistical analytical tools necessary to optimally conduct their current breeding efforts. In close succession with these first applications, tools will need to be made available to facilitate the integration of MB into their breeding programs. Databases will need to be developed for storing genotypic and phenotypic data, integrated analytical tools will need to be made available to breeders for analysis of this accumulated data and for the identification of important simple trait loci or QTLs to monitor and recombine in their breeding programs, and decision support tools will need to be developed to help breeders decide on the next steps to engage in based on the data they generated from their MB activities.
In the near future, more complex tools will need to be developed for the storage and analysis of the large amounts of genotypic data that will be generated by new next-generation sequencing technologies and for their application in GWS. A tight linkage will also have to be established with the wealth of information that is being generated and will continue to be generated even faster in the genomics area, leading to the dissection of the genome and to the discovery of the location and function of major genes having an impact upon the performance of crops in environments relevant to developing-country programs.
Eventually, the accumulation of large amounts of genetic information linked to specific haplotypes will lead to the increasing use of predictive breeding in combination with traditional MB usage and appropriate tools will also need to be developed to support those efforts.
Although it is critical for a platform to anticipate all the new possible features of MB, ensuring that new technologies and ISs will find their way into a flexible infrastructure, it is also quite probable that most breeding programs in developing countries will, in the short and medium term, work mainly with simple MB approaches, as they will never reach the critical number of crosses and scale of germplasm evaluation required to make the most of complex approaches.
Conclusion and Prospective Scenarios
Through international initiatives like the ones coordinated by the CGIAR centers and programs, several notable developing-world MB successes have already been reported.
A well-known example is the development of submergence-tolerant rice cultivars through MABC led by IRRI [128]. The introgression of the Sub1 gene from FR13A (the world’s most flood-tolerant variety) into widely grown varieties like Swarna improved yields in more than 15 million hectares of rain-fed lowland rice in South and Southeast Asia.
MB in general and the use of MBPs in particular have definitely been shown to be an efficient approach for reducing the number of required selection cycles and for increasing the genetic gain per crop cycle to a point where the required human and operational resources can be kept to a minimum.
However, for sustainable adoption, the use of modern breeding strategies requires a breeder-led bottom-up approach. As a start, simple MB approaches adapted to local environments should be tested first by individual breeders to evaluate their success and impact under those breeders’ conditions. Once proven, these approaches can then be implemented more widely or integrated into an MBP for enhanced efficiency. Where individual breeders succeed, their adoption of MB should be quite straightforward.
It is clear that the extent, speed, and scope of adoption of MB approaches and of utilization of MBPs will vary across tier-1, tier-2, and tier-3 countries, depending on the local priorities and on the resources available in given breeding programs. It is unrealistic to expect that large-scale MB activities, including utilization of MBPs, will be widely implemented across the board in developing countries in the near term. However, the prospects are bright for individual breeders in these countries (particularly in tiers 1 and 2) to access germplasm, data, tools, and methodology that will allow them to conduct efficient MB projects by taking advantage of large international initiatives specifically targeting developing-country breeding programs. This will, however, happen in different ways and on different timelines for each tier.
For tier-1 countries, the impact would be evident in the shorter term – say in 3–6 years. These countries will benefit from new tools and platforms by increasing the rate of MB adoption. The biggest change is likely to occur in tier-2 countries, as these countries would be starting MB from scratch, but the impact would realistically be measurable only in the medium term, meaning in about a decade from now. For countries currently in tier-3 to advance to tier-2, basic breeding programs must first be established, which is highly dependent on governmental priorities and on subsequent resource allocation.
All in all, implementing MB (and catalyzing and accelerating its impact through MBPs) will boost crop production, which will translate into higher farm productivity per unit of land, better nutrition, higher incomes, poverty alleviation, and ultimately improved livelihoods in developing countries (Fig. 8). These gains will be amplified by sustained use, by continuously improving expertise, and by growth and development of homegrown capacity for the application of advanced breeding approaches.
Abbreviations
- Analytical pipeline: A sequence of data management and statistical analysis algorithms which can be applied to one or more data sets to produce a result which can be interpreted and applied in decision making.
- Capacity building: Assistance that is provided to entities, usually institutions in developing countries, which have a need to develop a certain skill or competence, or for general upgrading of capability.
- Cyberinfrastructure (CI): Computer-based research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet. In scientific usage, CI is a technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge.
- Gene: Segment of DNA specifying a unit of genetic information; an ordered sequence of nucleotide base pairs that produce a certain product that has a specific function.
- Information system (IS): An integrated set of computing components and human activities for collecting, storing, processing, and communicating information.
- Integrated breeding platform (IBP): Term to describe a Molecular Breeding Platform (see below) in a broader sense including the availability of tools and services suitable for conventional breeding based on phenotypic selection only.
- Molecular breeding (MB): Identification, evaluation, and stacking of useful alleles for agronomic traits of importance using molecular markers (MMs) in breeding programs. MB encompasses several modern breeding strategies, such as marker-assisted selection (MAS), marker-assisted backcrossing (MABC), marker-assisted recurrent selection (MARS), and genome-wide selection (GWS).
- Molecular breeding platform (MBP): A term that has come to indicate a virtual platform driven by modern information and communication technologies through which MB programs can access genomic resources, advanced laboratory services, and analytical and data management tools to accelerate variety development using marker technologies.
- Plant breeding: The science of improving the genetic makeup of plants in order to increase their value. Increased crop yield is the primary aim of most plant breeding programs; benefits of the hybrids and new varieties developed include adaptation to new agricultural areas, greater resistance to disease and insects, greater yield of useful parts, better nutritional content of edible parts, and greater physiological efficiency especially under abiotic stress conditions.
- Quantitative trait locus (QTL): A region of the genome that contains genes affecting a quantitative trait. Though not necessarily genes themselves, QTLs are stretches of DNA that are closely linked to the genes that underlie the corresponding trait.
Bibliography
Crosbie TM, Eathington SR, Johnson GR, Edwards M, Reiter R, Stark S, Mohanty RG, Oyervides M, Buehler RE, Walker AK, Dobert R, Delannay X, Pershing JC, Hall MA, Lamkey KR (2006) Plant breeding: past, present, and future. In: Lamkey KR, Lee M (eds) Plant breeding: the Arnel R. Hallauer international symposium. Blackwell, Ames, pp 3–50
Falck-Zepeda J, Zambrano P, Cohen JI, Borges O, Guimarães EP, Hautea D, Kengue J, Songa J (2008) Plant genetic resources for agriculture, plant breeding, and biotechnology. EPTD Discussion Paper 00762. International Food Policy Research Institute, Washington, DC
Goodman RM, Hauptli H, Crossway A, Knauf VC (1987) Gene transfer in crop improvement. Science 236:48–54
Cooper M, Smith OS, Merrill RE, Arthur L, Polich DW, Loffler CM (2006) Integrating breeding tools to generate information for efficient breeding: past, present, and future. In: Lamkey KR, Lee M (eds) Plant breeding: the Arnel R. Hallauer international symposium. Blackwell, Ames, pp 141–154
Tanksley SD, Young ND, Paterson AH, Bonierbale MW (1989) RFLP mapping in plant breeding: new tools for an old science. Biotechnology 7:257–264
Ribaut J-M, Hoisington DA (1998) Marker-assisted selection: new tools and strategies. Trends Plant Sci 3:236–239
Bernardo R (2008) Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Sci 48:1649–1664
Moose SP, Mumm RH (2008) Molecular plant breeding as the foundation for 21st century crop improvement. Plant Physiol 147:969–977
Wang S, Basten CJ, Zeng Z-B (2005) Windows QTL Cartographer 2.5. Department of Statistics, North Carolina State University, Raleigh
Collard BCY, Mackill DJ (2008) Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philos Trans R Soc B 363:557–572
Ribaut J-M, Jiang C, Hoisington D (2002) Efficiency of a gene introgression experiment by backcrossing. Crop Sci 42:557–565
Mumm RH (2007) Backcross versus forward breeding in the development of transgenic maize hybrids: theory and practice. Crop Sci 47(S3):S164–S171
Hospital F, Charcosset A (1997) Marker-assisted introgression of quantitative trait loci. Genetics 147:1469–1485
Stam P (1995) Marker-assisted breeding. In: Van Ooijen JW, Jansen J (eds) Biometrics in plant breeding: applications of molecular markers. Proceedings of the ninth meeting of the EUCARPIA section biometrics in plant breeding, CPRO-DLO, Wageningen, pp 32–44
Peleman JD, Van Der Voort JR (2003) Breeding by design. Trends Plant Sci 8:330–334
Johnson R (2004) Marker-assisted selection. Plant Breed Rev 24:293–309
Bernardo R, Charcosset A (2006) Usefulness of gene information in marker-assisted recurrent selection: a simulation appraisal. Crop Sci 46:614–621
Guttmacher AE, Collins FS (2002) Genomic medicine – a primer. N Engl J Med 347:1512–1520
de los Campos G, Gianola D, Allison DB (2010) Predicting genetic predisposition in humans: the promise of whole-genome markers. Nat Rev Genet 11:880–886. doi:10.1038/nrg2898
Goddard ME, Hayes BJ (2007) Genomic selection. J Anim Breed Genet 124:323–330
Tinker NA, Yan W (2006) Information systems for crop performance data. Can J Plant Sci 86:647–662
Yan W, Tinker NA (2007) DUDE: a user-friendly crop information system. Agron J 99:1029–1033
McLaren CG, Bruskiewich RM, Portugal AM, Cosico B (2005) The international rice information system. A platform for meta-analysis of rice crop data. Plant Physiol 139:637–642
Bruskiewich R, Senger M, Davenport G, Ruiz M, Rouard M, Hazekamp T, Takeya M, Doi K, Satoh K, Costa M, Simon R, Balaji J, Akintunde A, Mauleon R, Wanchana S, Shah T, Anacleto M, Portugal A, Ulat VJ, Thongjuea S, Braak K, Ritter S, Dereeper A, Skofic M, Rojas E, Martins N, Pappas G, Alamban R, Almodiel R, Barboza LH, Detras J, Manansala K, Mendoza MJ, Morales J, Peralta B, Valerio R, Zhang Y, Gregorio S, Hermocilla J, Echavez M, Yap JM, Farmer SA, Gary, Lee J, Casstevens T, Jaiswal P, Meintjes A, Wilkinson M, Good B, Wagner J, Morris J, Marshall D, Collins A, Kikuchi S, Metz T, McLaren G, van Hintum T (2008) The Generation Challenge Programme platform: semantic standards and workbench for crop science. J Plant Genom 2008, Article ID 369601, 6 p. doi: 10.1155/2008/369601
Rodgers D, Jordan D (2009) Information management systems for plant breeders. Primary Industries and Fisheries (PI&F) of the Queensland Government, Department of Employment, Economic Development and Innovation in Australia, Queensland, Australia
Thorisson GA, Muilu J, Brookes AJ (2009) Genotype–phenotype databases: challenges and solutions for the post-genomic era. Nat Rev Genet 10:9–18
Smith A, Cullis B, Thompson R (2001) Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics 57:1138–1147
Burgueño J, Crossa J, Cornelius PL, Trethowan R, McLaren G, Krishnamachari A (2007) Modeling additive × environment and additive × additive × environment using genetic covariances of relatives of wheat genotypes. Crop Sci 47:311–320
Butler D, Cullis BR, Gilmour AR, Gogel BJ (2007) ASReml reference manual, release 2.00. VSN, Hemel Hempstead
Hammer G, Cooper M, Tardieu F, Welch S, Walsh B, van Eeuwijk F, Chapman S, Podlich D (2006) Models for navigating biological complexity in breeding improved crop plants. Trends Plant Sci 11:587–593
Wang J, Chapman SC, Bonnett DG, Rebetzke GJ, Crouch J (2007) Application of population genetic theory and simulation models to efficiently pyramid multiple genes via marker-assisted selection. Crop Sci 47:580–588
Chapman S (2008) Use of crop models to understand genotype by environment interactions for drought in real-world and simulated plant breeding trials. Euphytica 161:195–208
DeLacy IH, Fox PN, McLaren G, Trethowan R, White JW (2009) A conceptual model for describing processes of crop improvement in database structures. Crop Sci 49:2100–2112
Crossa J, Burgueño J, Dreisigacker S, Vargas M, Herrera S, Lillemo M, Singh RP, Trethowan R, Franco J, Warburton M, Reynolds M, Crouch JH, Ortiz R (2007) Association analysis of historical bread wheat germplasm using additive genetic covariance of relatives and population structure. Genetics 177:1889–1913
Lee M (1995) DNA markers and plant breeding programs. Adv Agron 55:265–344
Helentjaris T, Slocum M, Wright S, Schaefer A, Nienhuis J (1986) Construction of genetic linkage maps in maize and tomato using restriction fragment length polymorphisms. Theor Appl Genet 72:761–769
Edwards MD, Stuber CW, Wendel JF (1987) Molecular-marker-facilitated investigations of quantitative-trait loci in maize. I. Numbers, genomic distribution and types of gene action. Genetics 116:113–125
Paterson AH, Lander ES, Hewitt JD, Peterson S, Lincoln SE, Tanksley SD (1988) Resolution of quantitative traits into Mendelian factors by using a complete linkage map of restriction fragment length polymorphisms. Nature 335:721–726
Mullis K (1990) The unusual origin of the polymerase chain reaction. Sci Am 262:56–65
Senior ML, Heun M (1993) Mapping maize microsatellites and polymerase chain reaction confirmation of the target repeats using a CT primer. Genome 36:884–889
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res 23:4407–4414
Gilles PN, Wu DJ, Foster CB, Dillon PJ, Chanock SJ (1999) Single nucleotide polymorphic discrimination by an electronic dot blot assay on semiconductor microchips. Nat Biotechnol 17:365–370
Eathington SR, Crosbie TM, Edwards MD, Reiter RS, Bull JK (2007) Molecular markers in commercial breeding. Crop Sci 47:154–163
Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185–199
Borevitz J (2004) Genomic approaches to identifying quantitative trait loci: lessons from Arabidopsis thaliana. In: Cronk QCB, Whitton J, Ree RH, Taylor IEP (eds) Molecular genetics and ecology of plant adaptation. Proceedings of an international workshop, December 2002, Vancouver, NRC Research Press, Ottawa, pp 53–60
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324
Martinez O, Curnow RN (1992) Estimating the locations and the sizes of the effects of quantitative trait loci using flanking markers. Theor Appl Genet 85:480–488
Malosetti M, Ribaut J-M, Vargas M, Crossa J, van Eeuwijk FA (2008) A multi-trait, multi-environment QTL mixed model with an application to drought and nitrogen trials in maize (Zea mays L.). Euphytica 161:241–257
Bink MCAM, Janss LLG, Quaas RL (2000) Markov chain Monte Carlo for mapping a quantitative trait locus in outbred populations. Genet Res 75:231–241
Bink MCAM, Boer MP, ter Braak CJF, Jansen J, Voorrips RE, van de Weg WE (2007) Bayesian analysis of complex traits in pedigreed plant populations. Euphytica 161:85–96. doi:10.1007/s10681-007-9516-1
Chardon F, Virlon B, Moreau L, Falque M, Joets J, Decousset L, Murigneux A, Charcosset A (2004) Genetic architecture of flowering time in maize as inferred from quantitative trait loci meta-analysis and synteny conservation with the rice genome. Genetics 168:2169–2185
Jiang C, Zeng ZB (1995) Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140:1111–1127
van Eeuwijk FA, Malosetti M, Boer MP (2007) Modelling the genetic basis of response curves underlying genotype x environment interaction. In: Spiertz JHJ, Struik PC, van Laar HH (eds) Scale and complexity in plant systems research: gene-plant-crop relations. Springer, Dordrecht, pp 115–126
Boer MP, Wright D, Feng L, Podlich DW, Luo L, Cooper M, van Eeuwijk FA (2007) A mixed-model quantitative trait loci (QTL) analysis for multiple-environment trial data using environmental covariables for QTL-by-environment interactions, with an example in maize. Genetics 177:1801–1813
Malosetti M, Ribaut J-M, van Eeuwijk FA (2011) The statistical analysis of multienvironment data: modelling genotype-by-environment interaction and its genetic basis. In: Monneveux P, Ribaut J-M (eds) Drought phenotyping in crops: from theory to practice. CGIAR Generation Challenge Programme, Texcoco, Mexico. In press
Zhang F, Zhai H-Q, Paterson AH, Xu J-L, Gao Y-M et al (2011) Dissecting genetic networks underlying complex phenotypes: the theoretical framework. PLoS ONE 6(1):e14541. doi:10.1371/journal.pone.0014541
Yi N, Yandell BS, Churchill GA, Allison DB, Eisen EJ, Pomp D (2005) Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics 170:1333–1344
Xu S, Jia Z (2007) Genome wide analysis of epistatic effects for quantitative traits in barley. Genetics 176:611–623
Li H, Ye G, Wang J (2007) A modified algorithm for the improvement of composite interval mapping. Genetics 175:361–374
Li H, Ribaut J-M, Li Z, Wang J (2008) Inclusive composite interval mapping (ICIM) for digenic epistasis of quantitative traits in biparental populations. Theor Appl Genet 116:243–260
Kroymann J, Mitchell-Olds T (2005) Epistasis and balanced polymorphism influencing complex trait variation. Nature 435:95–98
Zeng Z-B (2005) Modeling quantitative trait loci and interpretation of models. Genetics 169:1711–1725
Kusterer B, Muminovic J, Utz HF, Piepho H-P, Barth S, Heckenberger M, Meyer RC, Altmann T, Melchinger AE (2007) Analysis of a triple testcross design with recombinant inbred lines reveals a significant role of epistasis in heterosis for biomass-related traits in Arabidopsis. Genetics 175:2009–2017
Frascaroli E, Canè MA, Landi P, Pea G, Gianfranceschi L, Villa M, Morgante M, Pè ME (2007) Classical genetic and quantitative trait loci analyses of heterosis in a maize hybrid between two elite inbred lines. Genetics 176:625–644
Gu X-Y, Foley ME (2007) Epistatic interactions of three loci regulate flowering time under short and long daylengths in a backcross population of rice. Theor Appl Genet 114:745–754
Melchinger AE, Piepho H-P, Utz HF, Muminović J, Wegenast T, Törjék O, Altmann T, Kusterer B (2007) Genetic basis of heterosis for growth-related traits in Arabidopsis investigated by testcross progenies of near-isogenic lines reveals a significant role of epistasis. Genetics 177:1827–1837
Lander ES, Green P, Abrahamson J, Barlow A, Daly MJ, Lincoln SE, Newburg L (1987) Mapmaker: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1:174–181
Jansen RC (1993) Interval mapping of multiple quantitative trait loci. Genetics 135:205–211
Van Ooijen JW (2004) MapQTL® 5, Software for the mapping of quantitative trait loci in experimental populations. Kyazma BV, Wageningen
Zeng Z-B (1994) Precision mapping of quantitative trait loci. Genetics 136:1457–1468
Utz HF, Melchinger AE (1996) PLABQTL: a program for composite interval mapping of QTL. J Agric Genom 2:1–5. http://probe.nalusda.gov:8000/otherdocs/jqtl/jqtl1996-01/utz.html (verified 10 September 1999)
Nelson JC (1997) QGene: software for marker-based genomic analysis and breeding. Mol Breed 3:229–235
Joehanes R, Nelson JC (2008) QGene 4.0, extensible Java QTL-analysis platform. Bioinformatics 24:2788–2789
Manly KF, Olson JM (1999) Overview of QTL mapping software and introduction to Map Manager QT. Mamm Genome 10:327–334
Portugal A, Balachandra R, Metz T, Bruskiewich R, McLaren G (2007) International crop information system for germplasm data management. In: Plant bioinformatics: methods and protocols. Humana, Totowa, pp 459–471, Chapter 22
McLaren CG, Metz T, van den Berg M, Bruskiewich R, Magor NP, Shires D (2009) Informatics in agricultural research for development. Adv Agron 102:135–157
Parkhill J, Birney E, Kersey P (2010) Genomic information infrastructure after the deluge. Genome Biol 11:402
Gene Ontology Consortium (2008) The Gene Ontology project in 2008. Nucleic Acids Res 36(Database issue):D440–D444
Avraham S, Tung CW, Ilic K, Jaiswal P, Kellogg EA, McCouch S, Pujar A, Reiser L, Rhee SY, Sachs MM, Schaeffer M, Stein L et al (2008) The plant ontology database: a community resource for plant structure and developmental stages controlled vocabulary and annotations. Nucleic Acids Res 36(Database issue):D449–D454
Ilic K, Kellogg EA, Jaiswal P, Zapata F, Stevens PF, Vincent LP, Avraham S, Reiser L, Pujar A, Sachs MM, Whitman NT, McCouch SR et al (2007) The plant structure ontology, a unified vocabulary of anatomy and morphology of a flowering plant. Plant Physiol 143(2):587–599
Plant Ontology Consortium (2002) The Plant Ontology Consortium and plant ontologies. Comp Funct Genomics 3:137–142
Bruskiewich R, Davenport G, Hazenkamp T, Metz T, Ruiz M, Simon R, Takeya M, Lee J, Senger M, McLaren G, van Hintum T (2006) The Generation Challenge Programme (GCP)—Standards for crop data. OMICS 10:215–219
Lee JM, Davenport GF, Marshall D, Ellis TH, Ambrose MJ, Dicks J, van Hintum TJ, Flavell AJ (2005) GERMINATE. A generic database for integrating genotypic and phenotypic information for plant genetic resource collections. Plant Physiol 139(2):619–631
BioMoby Consortium (2008) Interoperability with Moby 1.0—It’s better than sharing your toothbrush! Brief Bioinform 9(3):220–231. doi:10.1093/bib/bbn003
Wilkinson M, Schoof H, Ernst R, Haase D (2005) BioMOBY successfully integrates distributed heterogeneous bioinformatics web services. The PlaNet exemplar case. Plant Physiol 138:1–13
Ribaut J-M, Monneveux P, Glaszmann JC, Leung H, Van Hintum T, de Vicente C (2008) International programs and the use of modern biotechnologies for crop improvement. In: Moore P, Ming R (eds) Genomics of tropical crop plants. Springer, New York, pp 21–63
Sonnino A, Carena MJ, Guimarães EP, Baumung R, Pilling D, Rischkowsky B (2007) An assessment of the use of molecular markers in developing countries. In: Guimarães EP, Ruane J, Scherf BD, Sonnino A, Dargie JD (eds) Marker-assisted selection: Current status and future perspectives in crops, livestock, forestry and fish. FAO, Rome, pp 15–26
Huang J, Rozelle S, Pray C, Wang Q (2002) Plant biotechnology in China. Science 295:674–677
Suresh P, Devi SV, Choudhary UN (2008) Resources and priorities for plant biotechnology research in India. Curr Sci 95:1400–1402
Ghneim Herrera T, Posso Duque D, Pérez Almeida I, Torrealba Nuñez G, Pieters AJ, Martínez CP, Tohme JM (2008) Assessment of genetic diversity in Venezuelan rice cultivars using simple sequence repeats markers. Electron J Biotechnol. doi:10.2225/vol11-issue5-fulltext-6
Khadari B, Oukabli A, Ater M, Mamouni A, Roger JP, Kjellberg F (2004) Molecular characterization of Moroccan fig germplasm using intersimple sequence repeat and simple sequence repeat markers to establish a reference collection. Hortic Sci 40:29–32
Onguso JM, Kahangi EM, Ndiritu DW, Mizutani F (2004) Genetic characterization of cultivated bananas and plantains in Kenya by RAPD markers. Sci Hortic 99:9–20
Paredes M, Becerra V, González MI (2008) Low genetic diversity among garlic (Allium sativum L.) accessions detected using random amplified polymorphic DNA (RAPD). Chil J Agric Res 68:3–12
Abalo G, Tongoona P, Derera J, Edema R (2009) A comparative analysis of conventional and marker-assisted selection methods in breeding maize streak virus resistance in maize. Crop Sci 49:509–520
Danson JW, Mbogori M, Kimani M, Lagat M, Kuria A, Diallo A (2006) Marker-assisted introgression of opaque2 gene into herbicide-resistant elite maize inbred lines. Afr J Biotechnol 5:2417–2422
Okogbenin E, Porto MCM, Egesi C, Mba C, Espinosa E, Santos LG, Ospina C, Marin J, Barrera E, Gutierrez J et al (2007) Marker-assisted introgression of resistance to cassava mosaic disease into Latin American germplasm for the genetic improvement of cassava in Africa. Crop Sci 47:1895–1904
Leung H, Wu J, Liu B, Bustaman M, Sridhar R, Singh K, Redona E, Quang VD, Zheng K, Bernardo M et al (2004) Sustainable disease resistance in rice: current and future strategies. In: New directions for a diverse planet. Proceedings of the 4th international crop science congress, 26 September–1 October, Brisbane
Sagredo B, Mathias M, Barrientos C, Acuña I, Kalazich J, Santos Rojas J (2009) Evaluation of a SCAR RYSC3 marker of the RYadg gene to select resistant genotypes to potato virus Y (PVY) in the INIA potato breeding program. Chil J Agric Res 69:305–315
Stevens R (2008) Prospects for using marker-assisted breeding to improve maize production in Africa. J Sci Food Agric. doi:10.1002/jsfa.3154
Hartwich F, Tola J, Engler A, González C, Ghezan G, Vázquez-Alvarado JMP, Silva JA, Espinoza JJ, Gottret MV (2007) Building public–private partnerships for agricultural innovation, Food security in practice technical guide series. International Food Policy Research Institute, Washington, DC
Jaccoud D, Peng K, Feinstein D, Kilian A (2001) Diversity arrays: a solid state technology for sequence information independent genotyping. Nucleic Acids Res 29:e25
Ganal MW, Altmann T, Roder M (2009) SNP identification in crop plants. Curr Opin Plant Biol 12:211–217
Varshney RK, Penmetsa RV, Dutta S, Kulwal PL, Saxena RK, Datta S, Sharma TR, Rosen B, Carrasquilla-Garcia N, Farmer A et al (2009) Pigeonpea genomics initiative (PGI): an international effort to improve crop productivity of pigeonpea (Cajanus cajan L.). Mol Breed 26:393–408. doi:10.1007/s11032-009-9327-2
Varshney RK, Close TJ, Singh NK, Hoisington DA, Cook DR (2009) Orphan legume crops enter the genomics era! Curr Opin Plant Biol 12:1–9
Ajani EN, Madukwe MC, Agwu AE, Onwubuya EA (2009) Assessment of technology generating institutions in biotechnology innovation system of South-Eastern Nigeria. Afr J Biotechnol 8:2258–2264
O’Toole JC, Toenniessen GH, Murashige T, Harris RR, Herdt RW (2001) The Rockefeller Foundation’s international program on rice biotechnology. In: Khush GS, Brar DS, Hardy B (eds) Rice genetics IV. Proceedings of the 4th international rice genetics symposium, Los Baños. International Rice Research Institute, pp 39–59
Kelemu S, Mahuku G, Fregene M, Pachico D, Johnson N, Calvert L, Rao I, Buruchara R, Amede T, Kimani P et al (2003) Harmonizing the agricultural biotechnology debate for the benefit of African farmers. Afr J Biotechnol 2:394–416
Morris M, Edmeades G, Pehu E (2006) The global need for plant breeding capacity: what roles for the public and private sectors? Hortic Sci 41:30–39
Salekdeh GH, Reynolds M, Bennett J, Boyer J (2009) Conceptual framework for drought phenotyping during molecular breeding. Trends Plant Sci 14:488–496
Hyman G, Fujisaka S, Jones P, Wood S, de Vicente C, Dixon J (2008) Strategic approaches to targeting technology generation: assessing the coincidence of poverty and drought-prone crop production. Agric Syst 98:50–61
Hammer G, Cooper M, Tardieu F, Welch S, Walsh B, van Eeuwijk F, Chapman S, Podlich D (2006) Models for navigating biological complexity in breeding improved crop plants. Trends Plant Sci 11:587–593
Ribaut J-M, Betran J, Monneveux P, Setter T (2008) Drought tolerance in maize. In: Bennetzen J, Hake S (eds) Maize handbook, vol 1. Springer, New York, pp 311–344
Cooper M, van Eeuwijk F, Hammer GL, Podlich DW, Messina C (2009) Modeling QTL for complex traits: detection and context for plant breeding. Curr Opin Plant Biol 12:231–240
Mackay TFC, Stone EA, Ayroles JF (2009) The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. doi:10.1038/nrg2612
Cooper M, van Eeuwijk FA, Chapman SC, Podlich DW, Löffler C (2006) Genotype-by-environment interactions under water-limited conditions. In: Ribaut JM (ed) Drought adaptation in cereals. Haworth, Binghamton, pp 51–95
Chagné D, Batley J, Edwards D, Forster JW (2007) Single nucleotide polymorphism genotyping in plants. In: Oraguzie NC, Rikkerink EHA, Gardiner SE, de Silva HN (eds) Association mapping in plants. Springer, New York, pp 77–94
Angaji SA (2009) Single nucleotide polymorphism genotyping and its application on mapping and marker-assisted plant breeding. Afr J Biotechnol 8:908–914
Muchero W, Diop NN, Bhat PR, Fenton RD, Wanamaker S, Pottorff M, Hearne S, Cisse N, Fatokun C, Ehlers JD et al (2009) A consensus genetic map of cowpea [Vigna unguiculata (L) Walp.] and synteny based on EST-derived SNPs. Proc Natl Acad Sci USA 106:18159–18164
Kawuki RS, Ferguson M, Labuschagne M, Herselman L, Kim DJ (2009) Identification, characterisation and application of single nucleotide polymorphisms for diversity assessment in cassava (Manihot esculenta Crantz). Mol Breed 23:669–684
Dwivedi SL, Crouch JH, Mackill DJ, Xu Y, Blair MW, Ragot M, Upadhyaya HD, Ortiz R (2007) The molecularization of public sector crop breeding: progress, problems, and prospects. Adv Agron 95:163–318. doi:10.1016/S0065-2113(07)95003-8
Xu Y, Crouch JH (2008) Marker-assisted selection in plant breeding: from publications to practice. Crop Sci 48:391–407
Dreher K, Khairallah M, Ribaut J-M, Morris M (2003) Money matters (I): costs of field and laboratory procedures associated with conventional and marker-assisted maize breeding at CIMMYT. Mol Breed 11:221–234
Morris M, Dreher K, Ribaut J-M, Khairallah M (2003) Money matters (II): costs of maize inbred line conversion schemes at CIMMYT using conventional and marker-assisted selection. Mol Breed 11:235–247
Delmer DP (2005) Agriculture in the developing world: connecting innovations in plant research to downstream applications. Proc Natl Acad Sci USA 102:15739–15746
Guimarães EP, Kueneman E, Carena MJ (2006) Assessment of national plant breeding and biotechnology capacity in Africa and recommendations for future capacity building. Hortic Sci 41:50–52
Ismail AM, Heuer S, Thomson MJ, Wissuwa M (2007) Genetic and genomic approaches to develop rice germplasm for problem soils. Plant Mol Biol 65:547–570
Alpuerto VE, Norton GW, Alwang J, Ismail AM (2009) Economic impact analysis of marker-assisted breeding for tolerance to salinity and phosphorous deficiency in rice. Rev Agr Econ 31:779–792
Septiningsih EM, Pamplona AM, Sanchez DL, Neeraja CN, Vergara GV, Heuer S, Ismail AM, Mackill DJ (2009) Development of submergence-tolerant rice cultivars: the Sub1 locus and beyond. Ann Bot 103:151–160
© 2012 Springer Science+Business Media, LLC
About this entry
Cite this entry
Ribaut, JM., Delannay, X., McLaren, G., Okono, F. (2012). Molecular Breeding Platforms in World Agriculture. In: Meyers, R.A. (eds) Encyclopedia of Sustainability Science and Technology. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0851-3_237
DOI: https://doi.org/10.1007/978-1-4419-0851-3_237
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-89469-0
Online ISBN: 978-1-4419-0851-3
eBook Packages: Earth and Environmental Science; Reference Module Physical and Materials Science; Reference Module Earth and Environmental Sciences