Abstract
This study aimed to assess the predictive ability of genomic selection (GS) for biomass yield of alfalfa and pea, considering different data sets, GS models, and thresholds for genotype missing data. An additional aim was to outline how GS could be incorporated into the breeding schemes of these crops. For alfalfa, the predictive ability of the best GS models ranged from r = 0.18 to r = 0.36 across three data sets. The lowest value (observed in a data set with higher experimental error or lower genetic variation than the other data sets) may still be of practical interest, given the long selection cycle and the low narrow-sense heritability of biomass yield in this crop. For pea biomass yield, the predictive ability of the best GS models averaged r = 0.45 across three recombinant inbred line (RIL) populations. Predictions were less accurate for this trait than for pea straw or grain yield. GS is a promising approach, but its adoption implies important modifications of alfalfa and pea breeding schemes. We identified five stages of GS-based selection schemes, whose implementation depends largely on the reproductive system of the target species.
1 Introduction
Socio-economic and environmental factors point to the pivotal role of legume-based crops in future crop-livestock systems of southern Europe (Annicchiarico 2017). Alfalfa (alias lucerne, Medicago sativa L.) is the most widely grown perennial forage in this region. Its genetic variation includes germplasm with outstanding drought tolerance (Annicchiarico et al. 2011), which can be exploited to improve crop adaptation to the predicted adverse effects of climate change. Regional production of hay or silage may also rely on annual legumes, particularly in severely drought-prone environments where perennials may lack sufficient persistence. Recent findings have highlighted the value of field pea (Pisum sativum L.) over vetch species (Vicia spp.) in this respect, both as a pure stand crop and for intercropping with cereals (Annicchiarico et al. 2017b). Maximizing the aerial biomass of semi-dwarf pea germplasm is crucial not only for pure stands but also for pea-cereal intercropping, where it ensures sufficient legume content and competitive ability against cereals (Annicchiarico et al. 2013).
Genomic selection (GS) combines phenotyping and genotyping data of a genotype sample representing a target genetic base (reference population) into a model that estimates breeding values for future plant selection (Heffner et al. 2009). GS has gained impetus from the development of genotyping-by-sequencing (GBS) (Elshire et al. 2011), which can produce thousands of genome-wide markers at a lower cost than SNP array platforms (albeit with large amounts of missing data). While predicting pure line performance is the obvious aim of GS in inbred species (such as pea), predicting the breeding value of candidate parent genotypes for synthetic varieties of outbred species (such as alfalfa) can be pursued by genotyping a set of parent genotypes and phenotyping their half-sib progenies (Annicchiarico et al. 2015a). First results for GS prediction of alfalfa forage yield or pea grain yield were promising (Annicchiarico et al. 2015b, 2017a; Li et al. 2015). Positive results also emerged for prediction of some grain yield components of pea (Burstin et al. 2015).
This study pooled results for different plant material and/or cropping conditions to assess the predictive ability of GS for biomass yield of alfalfa and pea. An additional aim was to outline how GS could be incorporated into the breeding schemes of these crops.
2 Material and Methods
Alfalfa genotypes were phenotyped for dry biomass yield of their half-sib progenies under dense-stand conditions in three experiments, hereafter termed data sets. Data set 1 comprised 154 genotypes from a broadly-based population of Mediterranean germplasm, phenotyped under water-favourable conditions in a managed environment (750 mm of water over March–October) over four harvests of one year. Data set 2 included the same material, phenotyped under moderate drought stress in a managed environment (on average, 455 mm of water over March–October) over seven harvests across two years and the following spring. Data set 3 included 124 parent genotypes from a broadly-based population of landrace and variety germplasm from the Po Valley, phenotyped in Lodi (northern Italy) under field conditions and moderate drought stress (on average, 454 mm of rainfall plus irrigation water over March–October) over 12 harvests across two years and the following spring. Annicchiarico et al. (2015b) described the procedures for GBS and SNP data calling for these data sets, as well as the phenotyping procedures and the results generated by seven GS models for the first and the third data set. This study adds original results for the second data set, whose experiment used the same procedures (for plot size, experimental design, etc.) as the first data set but a different drought stress level and experiment duration. For this data set, we exploited SNP data from Annicchiarico et al. (2015b) and the two GS models that proved most predictive for yield in the other two data sets, namely, Ridge Regression BLUP (rrBLUP) and Support Vector Regression using Linear Kernel (SVR-lin).
For pea, an earlier study (Annicchiarico et al. 2017a) reported the predictive ability of four GS models for grain yield under severe managed drought stress (120 mm of water over the period March–May) of three recombinant inbred line (RIL) populations, each including 105 lines. Here, we added information on GS predictive ability for dry biomass and straw yield assessed in the same experiment. The RILs were derived from connected crosses between three semi-dwarf cultivars (Attika; Isard; Kaspa) that exhibited high and stable grain yield across climatically-contrasting Italian sites (Annicchiarico 2005; Annicchiarico and Iannucci 2008). Attika and Kaspa displayed high biomass yield too, and proved suitable for forage production in mixed cropping with cereals (Annicchiarico et al. 2013, 2017b). We used GBS-based SNP data from Annicchiarico et al. (2017a) with a conservative minimum read depth of six for SNP genotype calling (given some heterozygosity expected in the genotyped F6 generation), and adopted the two GS models that were most predictive for grain yield in that study, i.e., Bayesian Lasso (BL) and rrBLUP.
GS predictive ability (i.e., the correlation between GS-predicted values and observed values) was assessed across genotype SNP missing data thresholds for marker retention in the range 10–50%, using missing data imputation and cross-validation procedures described earlier (Annicchiarico et al. 2015b, 2017a). Pea GS models were trained over the three RIL populations without imputing genetic structure information, assessing their predictive ability on the single populations.
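The assessment described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure used in this study: marker filtering follows the missing data threshold idea, but column-mean imputation stands in for the imputation methods of the cited papers, a ridge regression with a fixed penalty `lam` stands in for rrBLUP (which estimates the shrinkage from the data), and all function names and the simulated data are hypothetical.

```python
import numpy as np

def filter_and_impute(G, max_missing=0.3):
    """Keep markers with missing rate <= max_missing, then mean-impute.

    G is a genotypes x markers array with np.nan for missing calls.
    Mean imputation is an assumption made for brevity.
    """
    miss_rate = np.isnan(G).mean(axis=0)
    keep = miss_rate <= max_missing
    G = G[:, keep].copy()
    col_mean = np.nanmean(G, axis=0)
    rows, cols = np.where(np.isnan(G))
    G[rows, cols] = col_mean[cols]
    return G, keep

def ridge_fit_predict(X_train, y_train, X_test, lam):
    """Ridge regression on centred markers (fixed penalty, assumed)."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    # Dual form: an n x n system, convenient when markers >> genotypes.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu_y)
    return mu_y + (X_test - mu_x) @ (Xc.T @ alpha)

def cv_predictive_ability(G, y, lam=100.0, k=10, seed=0):
    """Correlation between cross-validated predictions and observations."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        pred[test] = ridge_fit_predict(G[train], y[train], G[test], lam)
    return np.corrcoef(pred, y)[0, 1]

if __name__ == "__main__":
    # Simulated data only; not the alfalfa or pea data of this study.
    rng = np.random.default_rng(1)
    n, m = 120, 400
    G = rng.integers(0, 3, size=(n, m)).astype(float)
    y = G @ rng.normal(0.0, 0.1, m) + rng.normal(0.0, 1.0, n)
    rates = rng.uniform(0.0, 0.6, m)          # per-marker missing rates
    G[rng.random((n, m)) < rates] = np.nan
    for threshold in (0.1, 0.3, 0.5):
        Gf, keep = filter_and_impute(G, threshold)
        r = cv_predictive_ability(Gf, y)
        print(f"threshold={threshold:.0%} markers={keep.sum()} r={r:.2f}")
```

Looping over thresholds, as in the demo, mirrors the comparison of marker retention thresholds reported here: a stricter threshold retains fewer but more completely genotyped markers, and the trade-off is judged by the resulting cross-validated correlation.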
3 Results and Discussion
For the previously unpublished alfalfa results (second data set), the best GS configuration for predicting biomass yield was the SVR-lin model with a genotype missing data threshold of 40%, whose predictive ability reached r = 0.18 using 10911 polymorphic SNP markers. The rrBLUP model performed nearly as well (r = 0.17). The best predictions for this data set were distinctly lower than those observed for biomass yield of the same material under favourable cropping conditions (which featured a distinctly lower experimental error CV, i.e., 14.1 vs 19.8%), or for yield of a different reference population under moderate drought stress (which featured a distinctly higher genetic variance CV, i.e., 22.8 vs 14.0%) (Table 1). It should be noted that even r = 0.18, although fairly unsatisfactory, could still provide a sizable advantage for GS over half-sib progeny-based phenotypic selection in terms of predicted yield gains per unit time (Annicchiarico et al. 2015a, b). This follows from assuming one year for each GS selection cycle and five years for each progeny-based phenotypic selection cycle, with narrow-sense heritability in the range 0.15–0.30 for biomass yield as indicated by various studies (Annicchiarico 2015).
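The arithmetic behind this per-unit-time argument can be illustrated with a deliberately simplified sketch. It is not the formula used in the cited papers: it assumes the generic expression gain per year = intensity × accuracy × additive standard deviation / cycle length, equal selection intensity and additive variation for both methods, the predictive ability of GS taken as its accuracy, and the accuracy of phenotypic selection approximated by the square root of narrow-sense heritability (an approximation for selection on single phenotypes, not the specific half-sib progeny formula). Function and parameter names are hypothetical.

```python
import math

def gain_per_year(accuracy, cycle_years, intensity=1.0, sigma_a=1.0):
    """Generic expected genetic gain per year (simplifying assumption)."""
    return intensity * accuracy * sigma_a / cycle_years

def gs_to_ps_ratio(r_gs, h2, gs_years=1.0, ps_years=5.0):
    """Ratio of GS to phenotypic selection gains per year.

    Assumes phenotypic selection accuracy = sqrt(h2), equal selection
    intensity, and equal additive standard deviation for both methods.
    """
    r_ps = math.sqrt(h2)
    return gain_per_year(r_gs, gs_years) / gain_per_year(r_ps, ps_years)

if __name__ == "__main__":
    for h2 in (0.15, 0.30):
        ratio = gs_to_ps_ratio(0.18, h2)
        print(f"h2={h2:.2f}  GS/PS gain per year = {ratio:.2f}")
```

Under these assumptions, r = 0.18 with one-year cycles gives roughly 2.3 times (h² = 0.15) to 1.6 times (h² = 0.30) the yearly gain of five-year phenotypic cycles. Equivalently, GS would remain ahead as long as the phenotypic selection accuracy stays below 0.18 × 5 = 0.90.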
According to pea GS results averaged across the three RIL populations, the best predictions of aerial biomass were provided by the BL model with a genotype missing rate of 30%. This configuration achieved moderately high predictive ability, i.e., r = 0.45, using 1537 polymorphic SNP markers over the three populations. However, predictive ability values were nearly identical in the range 20–50% of genotype missing data for the two GS models (data not shown). The best predictions for the single RIL populations ranged from 0.29 to 0.60. For straw yield, the best predictions were provided by the BL model with a 20% genotype missing rate, which displayed an average predictive ability of r = 0.57. Prediction for grain yield displayed the highest accuracy, averaging r = 0.71 (using BL with 20% missing genotype data).
We expected worse GS predictions for alfalfa than for pea RILs, owing to much shorter linkage disequilibrium and the impossibility of exploiting non-additive genetic variation in half-sib progeny-based selection of alfalfa parents. However, GS provides a greater opportunity to shorten selection cycles in a perennial such as alfalfa, justifying our interest even in low predictive ability values in this species.
Our results suggest that GS may already be convenient for breeding programs of alfalfa and pea. However, its incorporation would require important modifications of their selection schemes. These are summarized in Fig. 1, which identifies five basic selection stages whose implementation depends on the reproductive system of the target species (outbred or inbred). The inclusion of GS implies (i) the construction of one or more reference populations, (ii) the definition of one GS model for each target trait in each population using a genotype sample, (iii) the application of the model(s) to a wide set of genotypes from each population, and (iv) the final field test of a reduced set of GS-selected lines (inbreds) or the GS-selected synthetic variety (outbreds). For inbreds, the reference population may conveniently include a set of RILs with partly common ancestors (as here) to facilitate the definition of a common GS model, or one MAGIC population. Key scientific questions remain, including the ability of GS models built on one population to predict phenotypes of other populations, and the verification of actual yield gains obtained via GS.
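Steps (ii) and (iii) above, i.e., defining a model on a genotyped and phenotyped sample and applying it to a wider set of genotyped-only candidates, could look roughly as follows. This is a hypothetical sketch on simulated data: the fixed-penalty ridge model, the function names, and the numbers of markers and candidates are assumptions for illustration, not the configuration adopted in this work.

```python
import numpy as np

def train_marker_effects(X_ref, y_ref, lam=100.0):
    """Step (ii): estimate marker effects on the reference sample.

    Ridge regression with a fixed penalty (assumed; rrBLUP would
    estimate the shrinkage from the data).
    """
    mu_x = X_ref.mean(axis=0)
    mu_y = y_ref.mean()
    Xc = X_ref - mu_x
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_ref)), y_ref - mu_y)
    return mu_x, mu_y, Xc.T @ alpha

def select_candidates(X_cand, model, n_select):
    """Step (iii): predict candidates and keep the top n_select."""
    mu_x, mu_y, effects = model
    gebv = mu_y + (X_cand - mu_x) @ effects
    order = np.argsort(gebv)[::-1]      # highest predicted value first
    return order[:n_select], gebv

if __name__ == "__main__":
    # Simulated markers and phenotypes, for illustration only.
    rng = np.random.default_rng(7)
    n_ref, n_cand, m = 150, 1000, 300
    beta = rng.normal(0.0, 0.1, m)
    X_ref = rng.integers(0, 3, size=(n_ref, m)).astype(float)
    X_cand = rng.integers(0, 3, size=(n_cand, m)).astype(float)
    y_ref = X_ref @ beta + rng.normal(0.0, 1.0, n_ref)
    model = train_marker_effects(X_ref, y_ref)
    chosen, gebv = select_candidates(X_cand, model, n_select=20)
    true_cand = X_cand @ beta
    print("mean true value, all candidates:", round(true_cand.mean(), 2))
    print("mean true value, selected:      ", round(true_cand[chosen].mean(), 2))
```

The selected genotypes would then proceed to the final field test of step (iv); only simulated data allow the comparison with true values printed here.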
References
Annicchiarico P (2005) Scelta varietale per pisello e favino rispetto all’ambiente e all’utilizzo. Informatore Agrario 61(49):47–52
Annicchiarico P (2015) Alfalfa forage yield and leaf/stem ratio: narrow-sense heritability, genetic correlation, and parent selection procedures. Euphytica 205:409–420
Annicchiarico P (2017) Feed legumes for truly sustainable crop-animal systems. Italian J Agron 12(2). https://doi.org/10.4081/ija.2017.880
Annicchiarico P, Iannucci A (2008) Adaptation strategy, germplasm type and adaptive traits for field pea improvement in Italy based on variety responses across climatically contrasting environments. Field Crops Res 108:133–142
Annicchiarico P, Barrett B, Brummer EC, Julier B, Marshall AH (2015a) Achievements and challenges in improving temperate perennial forage legumes. Crit Rev Plant Sci 34:327–380
Annicchiarico P, Nazzicari N, Li X, Wei Y, Pecetti L, Brummer EC (2015b) Accuracy of genomic selection for alfalfa biomass yield in different reference populations. BMC Genom 16:1020
Annicchiarico P, Nazzicari N, Pecetti L, Romani M, Ferrari B, Wei Y, Brummer EC (2017a) GBS-based genomic selection for pea grain yield under severe terminal drought. The Plant Genome 10(2). https://doi.org/10.3835/plantgenome2016.07.0072
Annicchiarico P, Pecetti L, Abdelguerfi A, Bouizgaren A, Carroni AM, Hayek T, Bouzina M, Mezni M (2011) Adaptation of landrace and variety germplasm and selection strategies for lucerne in the Mediterranean basin. Field Crops Res 120:283–291
Annicchiarico P, Ruda P, Sulas C, Pitzalis M, Salis M, Romani M, Carroni AM (2013) Optimal plant type of pea for mixed cropping with cereals. In: Barth S, Milbourne D (eds), Breeding Strategies for Sustainable Forage and Turf Grass Improvement. Springer, Dordrecht, pp 341–346
Annicchiarico P, Thami Alami I, Abbas K, Pecetti L, Melis RAM, Porqueddu C (2017b) Performance of legume-based annual forage crops in three semi-arid Mediterranean environments. Crop Pasture Sci 68(10–11):932–941. https://doi.org/10.1071/CP17068
Burstin J, Salloignon P, Chabert-Martinello M, Magnin-Robert J-B, Siol M, Jacquin F, Chauveau A, Pont C, Aubert G, Delaitre C, Truntzer C, Duc G (2015) Genetic diversity and trait genomic prediction in a pea diversity panel. BMC Genom 16:105
Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6(5):e19379
Heffner EL, Sorrells ME, Jannink J-L (2009) Genomic selection for crop improvement. Crop Sci 49:1–12
Li X, Wei Y, Acharya A, Hansen JL, Crawford JL, Viands DR, Michaud R, Claessens A, Brummer EC (2015) Genomic prediction of biomass yield in two selection cycles of a tetraploid alfalfa breeding population. The Plant Genome 8(2)
Acknowledgments
This work is part of the EraNet-ARIMNet project ‘Resilient, water- and energy-efficient forage and feed crops for Mediterranean agricultural systems (REFORMA)’ funded for Italy by the Italian Ministry of Agriculture, Food and Forestry Policy. We are grateful to E.C. Brummer and B. Ferrari for scientific support, and S. Proietti, A. Passerini and P. Gaudenzi for valuable technical assistance.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Annicchiarico, P., Nazzicari, N., Pecetti, L., Romani, M. (2018). Genomic Selection for Biomass Yield of Perennial and Annual Legumes. In: Brazauskas, G., Statkevičiūtė, G., Jonavičienė, K. (eds) Breeding Grasses and Protein Crops in the Era of Genomics. Springer, Cham. https://doi.org/10.1007/978-3-319-89578-9_47
DOI: https://doi.org/10.1007/978-3-319-89578-9_47
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89577-2
Online ISBN: 978-3-319-89578-9
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)