Abstract
Genome-wide association studies (GWAS) have benefited from advances in sequencing methods that generate high-density genomic data. By linking genotype to phenotype, GWAS have associated several genes with traits of agricultural interest. Despite this, a gap remains between genotyping and phenotyping because of the large difference in throughput between the two disciplines. Although cutting-edge phenomics technologies are available to the community, their cost is still prohibitive for small laboratories. Semiautomated methods provide a valid alternative for generating large-scale phenotypic data that characterize different plant organs in depth. Beyond automation, phenomics data management is another major constraint: while bioinformatics pipelines for producing high-quality genomic data are well established, less effort has been devoted to phenotypic information. This chapter provides a guide for generating large-scale data on the size and shape of fruits, leaves, seeds, and roots, and for the downstream curation and preparation of clean datasets through outlier removal and primary statistical analysis. The steps, carried out in the R environment, show how to assemble appropriate input data for GWAS while avoiding possible sources of bias.
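As a rough illustration of the curation step described above, the following minimal R sketch flags outliers within each genotype and then computes per-genotype means that could serve as GWAS input. The abstract does not say which outlier rule the chapter uses, so Tukey's 1.5 × IQR fences are an assumption here. The data frame `pheno`, its columns, and the helper `is_outlier` are hypothetical names invented for this example.

```r
# Hypothetical example data: 3 genotypes x 5 replicate fruits each.
pheno <- data.frame(
  genotype = rep(c("G1", "G2", "G3"), each = 5),
  fruit_length_mm = c(50, 52, 51, 49, 120,  # 120 is an implausible measurement
                      30, 31, 29, 32, 30,
                      70, 68, 71, 69, 72)
)

# Assumed rule (not taken from the chapter): flag values outside
# Tukey's fences, i.e. below Q1 - k*IQR or above Q3 + k*IQR.
is_outlier <- function(x, k = 1.5) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Apply the rule within each genotype; ave() returns 0/1 for the logical result.
pheno$outlier <- ave(pheno$fruit_length_mm, pheno$genotype,
                     FUN = is_outlier) == 1

# Mask flagged values instead of deleting rows, so the raw data stay traceable.
pheno$fruit_length_clean <- ifelse(pheno$outlier, NA, pheno$fruit_length_mm)

# Primary statistics: per-genotype means of the cleaned trait
# (the formula interface of aggregate() drops NA rows by default).
geno_means <- aggregate(fruit_length_clean ~ genotype, data = pheno, FUN = mean)
print(geno_means)
```

Masking flagged values with `NA` rather than dropping rows keeps the original measurements available for review. The fence multiplier `k` can be adjusted if the default proves too strict or too lenient for a given trait.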
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
Cite this protocol
Tripodi, P. (2022). Development, Preparation, and Curation of High-Throughput Phenotypic Data for Genome-Wide Association Studies: A Sample Pipeline in R. In: Torkamaneh, D., Belzile, F. (eds) Genome-Wide Association Studies. Methods in Molecular Biology, vol 2481. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2237-7_7
DOI: https://doi.org/10.1007/978-1-0716-2237-7_7
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2236-0
Online ISBN: 978-1-0716-2237-7
eBook Packages: Springer Protocols