Detecting Structural Variants and Associated Gene Presence–Absence Variation Phenomena in the Genomes of Marine Organisms

Sollitto, Marco; Kenny, Nathan J.; Greco, Samuele; Tucci, Carmen Federica; Calcino, Andrew D.; Gerdol, Marco

doi:10.1007/978-1-0716-2313-8_4

Part of the book series: Methods in Molecular Biology (MIMB, volume 2498)

Abstract

As complete genomes become easier to attain, even from previously difficult-to-sequence species, and as genomic resequencing becomes more routine, it is becoming obvious that genomic structural variation is more widespread than originally thought and plays an important role in maintaining genetic variation in populations. Structural variants (SVs) and associated gene presence–absence variation (PAV) can be important players in local adaptation, allowing the maintenance of genetic variation and taking part in other evolutionarily relevant phenomena. While recent studies have highlighted the importance of structural variation in Mollusca, the prevalence of this phenomenon in the broader context of marine organisms remains to be fully investigated.

Here, we describe a straightforward and broadly applicable method for identifying SVs in fully assembled diploid genomes, leveraging the same reads used for assembly. We also describe a gene PAV analysis protocol that can be applied to any species with a fully sequenced reference genome. Although the strength of these approaches has been tested and proven in marine invertebrates, which tend to have high levels of heterozygosity, possibly due to their lifestyle traits, they are also applicable to other species across the tree of life, providing a ready means to begin investigating this potentially widespread phenomenon.
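
As a rough illustration of what a read-depth-based gene presence–absence call can look like, the sketch below classifies genes from their mean sequencing depth relative to the median depth across all genes. This is an assumption made for illustration only: the abstract does not specify the criteria used in the protocol, and the function name `call_pav`, the input layout, and the `absent_below` cutoff are all hypothetical.

```python
from statistics import median


def call_pav(gene_coverage, absent_below=0.25):
    """Classify genes as 'present' or 'absent' from mean read depth.

    gene_coverage: dict mapping gene ID -> mean read depth over that gene
                   (hypothetical input; e.g. derived from a per-gene
                   coverage summary of resequencing reads).
    absent_below:  hypothetical cutoff, expressed as a fraction of the
                   median per-gene depth; not taken from the protocol.
    """
    baseline = median(gene_coverage.values())
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    calls = {}
    for gene, depth in gene_coverage.items():
        # Normalise each gene's depth by the median so that the cutoff
        # is independent of the overall sequencing depth of the sample.
        ratio = depth / baseline
        calls[gene] = "absent" if ratio < absent_below else "present"
    return calls


# Toy depths for five made-up genes in one resequenced individual.
example = {
    "geneA": 31.0,
    "geneB": 29.5,
    "geneC": 0.4,
    "geneD": 30.2,
    "geneE": 14.8,
}
calls = call_pav(example)
```

In this toy input, `geneC` has almost no read support and falls below the cutoff, while `geneE`, at roughly half the median depth, is still called present; in a diploid genome such half-depth genes might reflect a hemizygous state, which a real analysis would likely treat as a separate class.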

Author information

Nathan J. Kenny
Present address: Department of Biochemistry, University of Otago, Dunedin, New Zealand

Authors and Affiliations

Department of Life Sciences, Università degli Studi di Trieste, Trieste, Italy
Marco Sollitto, Samuele Greco, Carmen Federica Tucci & Marco Gerdol
Faculty of Health and Life Sciences, Oxford Brookes, Oxford, UK
Nathan J. Kenny
Department of Evolutionary Biology, Integrative Zoology, University of Vienna, Vienna, Austria
Andrew D. Calcino

Corresponding author

Correspondence to Marco Gerdol.

Editor information

Editors and Affiliations

Institute of Biosciences and BioResources (IBBR), National Research Council, Naples, Italy
Cinzia Verde
Institute of Biosciences and BioResources (IBBR), National Research Council, Naples, Italy
Daniela Giordano

About this protocol

Cite this protocol

Sollitto, M., Kenny, N.J., Greco, S., Tucci, C.F., Calcino, A.D., Gerdol, M. (2022). Detecting Structural Variants and Associated Gene Presence–Absence Variation Phenomena in the Genomes of Marine Organisms. In: Verde, C., Giordano, D. (eds) Marine Genomics. Methods in Molecular Biology, vol 2498. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2313-8_4

DOI: https://doi.org/10.1007/978-1-0716-2313-8_4
Published: 22 June 2022
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2312-1
Online ISBN: 978-1-0716-2313-8
eBook Packages: Springer Protocols
