Metagenome sequencing and 107 microbial genomes from seamount sediments along the Yap and Mariana trenches

Zhang, Yue; Jing, Hongmei

doi:10.1038/s41597-024-03762-7

Data Descriptor
Open access
Published: 15 August 2024

Volume 11, article number 887, (2024)

Scientific Data

Abstract

Microbes in sediments from a series of seamounts along the island arc of the Yap and Mariana trenches were investigated by metagenomic sequencing. In this study, we reconstructed 107 metagenome-assembled genomes (MAGs), comprising 100 bacterial and 7 archaeal genomes. All MAGs exhibited >75% completeness and <10% contamination: 26 were classified as ‘nearly complete’ (completeness >90%), 50 fell within the 80–90% range, and 31 were 75–80% complete. Phylogenomic analysis revealed that 86% (n = 92) of these MAGs represented new taxa at various taxonomic levels. The taxonomic composition of these MAGs was largely consistent with previous reports, with the most abundant phyla being Proteobacteria (n = 39), Methylomirabilota (n = 27), and Nitrospirota (n = 7). These draft genomes provide novel data on species diversity and function in the seamount microbial community and will serve as reference data for comparative genomic studies across crucial phylogenetic groups worldwide.

Background & Summary

Seamounts are defined as structures rising abruptly from the seafloor, with a height greater than 100 m, that remain below the sea surface¹, and more than 170,000 seamounts are distributed across the global seafloor². Seamounts represent unique marine environments: their specific topographic characteristics and complex hydrodynamics directly or indirectly enrich the concentrations of inorganic nutrients and particulate organic matter, and they have been proposed as ‘oases’ that generally harbor higher biomass than the surrounding waters³. The hydrological dynamics produced by seamounts can significantly disturb the surrounding water bodies, thereby affecting the metabolic functions, taxonomic diversity and population distributions of microbes⁴. Therefore, comprehensive insight into the diversity and distribution patterns of microbial communities around deep-sea seamounts is crucial.

Its typical oligotrophic conditions, complex hydrology and large number of seamounts make the western Pacific Ocean an ideal region for studying the effect of seamounts on microbes⁵. The Yap and Mariana trenches were formed by the collision of plates⁶. The Yap-Mariana Junction cuts across the Mariana Ridge and Yap Ridge and is located just to the west of the Mariana Trench. A series of seamounts, formed by volcanic magmatic activity associated with plate subduction and compression, are located on the island arc of the two trenches⁷. In recent years, the microbial diversity of seamounts has been investigated largely by 16S rDNA amplicon sequencing⁷,⁸,⁹,¹⁰,¹¹,¹². However, amplicon analysis, which focuses on one or a few gene regions, often fails to distinguish closely related species when assessing community diversity. Metagenomics, in contrast, provides abundant gene information about microbes through high-throughput sequencing, and the assembly of these sequences can identify a large number of uncultured microbes¹³. In this study, we further explored the potential microbial diversity by retrieving metagenomic sequences and assembling them into near-complete microbial genomes, because metagenome-assembled genomes (MAGs) can provide more accurate information about microbial species and their communities¹⁴,¹⁵.

We reconstructed 107 MAGs from sediment samples collected at various locations along the two trenches. These locations included the summit, flank, and base of seamounts, with the deepest point of the Challenger Deep of the Mariana Trench as a control (Table 1; Fig. 1a–c). All of these MAGs have a completeness of >75% and a contamination of <10%; in other words, all 107 MAGs meet the medium-quality criteria of the MIMAG standards¹⁶. Of these MAGs, 26 (24%) were ‘near complete’ (completeness >90%), 50 (47%) were 80–90% complete, and 31 (29%) were 75–80% complete (Table S1). In addition, 60 (56%) MAGs had <5% contamination, and 2 (2%) MAGs had no detectable contamination (Table S1). A total of 40 (37%) MAGs had an N50 length greater than 10,000 bp (Table S1), indicating good assembly contiguity. The genome size, calculated from MAG completeness using CheckM v1.2.2¹⁷, ranged from 1.00 to 7.62 Mbp, with an average of 2.35 Mbp (Table S1). Overviews of the MAGs are presented in Fig. 2. At the phylum level, Thermoplasmatota had the highest GC content (average 64.06%), whereas Bacteroidota had the lowest (41.25%; Tables S1, S2; Fig. 2e). There was no significant correlation between genome size and N50 length (Fig. 2c). Across all MAGs, completeness and contamination were not correlated, although MAGs with lower completeness (<80%) usually had lower contamination (Table S1; Fig. 2d). According to the Genome Taxonomy Database (GTDB)¹⁸, these draft genomes were classified into 100 bacteria and 7 archaea (Fig. 1c). A total of 15 phyla were identified; the most abundant were Proteobacteria (n = 39), Methylomirabilota (n = 27) and Nitrospirota (n = 7) (Figs. 2a, 3). Notably, 92 (86%) MAGs could not be assigned to any named entry in GTDB, indicating that most of these MAGs represent novel taxa (Table S2). In total, 2 classes, 3 orders, 21 families, 32 genera, and 34 species (57 bacteria and 6 archaea) were novel taxa (Table S2; Fig. 2b). The abundance of these MAGs varied among samples; in general, B02 contained more MAGs than the other samples (Fig. 4). This repertoire of microbial genomes from seamounts can facilitate understanding of the species diversity, structure and function of these microbial communities and will provide reference data for comparative genomic studies across crucial phylogenetic groups worldwide.
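The completeness and contamination bands described above can be expressed as a small classifier. The following is a minimal sketch, assuming the CheckM estimates have already been parsed into (name, completeness, contamination) tuples. The function name, the tier labels and the example values are hypothetical illustrations and are not taken from Table S1.

```python
from collections import Counter


def quality_tier(completeness: float, contamination: float) -> str:
    """Assign a MAG to one of the completeness bands used in this descriptor."""
    # MAGs outside the >75% completeness / <10% contamination cut-off are not reported.
    if contamination >= 10 or completeness <= 75:
        return "excluded"
    if completeness > 90:
        return "near complete"
    if completeness > 80:
        return "80-90%"
    return "75-80%"


# Hypothetical example values, for illustration only.
example_mags = [
    ("bin.1", 95.2, 1.1),
    ("bin.2", 84.0, 4.9),
    ("bin.3", 76.5, 0.0),
    ("bin.4", 70.0, 2.0),
]

tier_counts = Counter(quality_tier(comp, cont) for _, comp, cont in example_mags)
print(dict(tier_counts))
```

Applied to the full CheckM table, the same tally would be expected to reproduce the 26 / 50 / 31 split reported above.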

Table 1 The environmental variables and sequence information of sediments collected from seamounts along the Yap and Mariana trenches.

Methods

Sample collection and metagenomic sequencing

Sediment samples were collected with a pushcore from a series of seamounts along the Yap Island Arc (YIA: SY222 and SY223), the Yap-Mariana Junction area (YMJ: SY203, SY206, SY207 and SY220), the Mariana Island Arc (MIA: SY190, SY191, SY192, SY194, SY196 and SY212) and the Challenger Deep (CD: B02) during cruise TS14 on the R/V “Tan Suo Yi Hao” in September 2019 (Fig. 1). In situ hydrographic parameters (i.e., location, depth, temperature and salinity) were measured with the manned submersible SHENHAI YONGSHI. At three stations (SY220, SY212 and SY223), samples were taken from the summit, flank and base of the seamount. Surface (0–4 cm) and subsurface (4–8 cm; SY206 and SY220-base) deposits were immediately stored at −80 °C for further analysis. Before the sediment characteristics were analysed, samples were dried in an oven. The concentrations of total organic carbon (TOC), total nitrogen (TN), ammonium (NH₄⁺) and nitrate (NO₃⁻) were determined according to Wang et al.¹⁹. In short, TOC and TN contents were measured with an elemental analyzer (Elementar vario Macro cube, Germany) on 5 g of dried sediment. NO₃⁻ and NH₄⁺ were extracted with 2 M HCl and determined using a colorimetric auto-analyzer (AutoAnalyzer 3, SEAL Analytical, Germany).

Total genomic DNA was extracted from sediment samples using the MoBio PowerSoil DNA extraction kit following the manufacturer’s instructions. The quantity of extracted DNA was measured using the Qubit dsDNA assay kit in combination with a Qubit® 2.0 fluorometer (Life Technologies, USA) and verified by 1% agarose gel electrophoresis. Paired-end sequencing was performed on the Illumina NovaSeq 6000 platform (Illumina Inc., San Diego, CA, USA) at Novogene Co., Ltd. (www.novogene.com).

Quality control and assembly

Short reads were quality-filtered by removing adapters and barcodes, as well as reads that contained poly-N or were of low quality, from the raw data using the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit) and FastQC (https://github.com/s-andrews/FastQC). All quality-controlled reads were then co-assembled with MEGAHIT v1.2.9 with the parameters ‘--k-min 21 --k-max 141 --k-step 10’²⁰. The quality of the assembly was assessed using QUAST v5.0.2²¹.
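N50 is one of the contiguity statistics reported by assembly assessment tools such as QUAST and is the metric quoted for the MAGs in this descriptor. The following is a minimal sketch of the standard N50 definition, not QUAST's own implementation; the function name and the toy contig lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy contig lengths (bp), for illustration only.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

For the toy input the total is 54 bp, and the three longest contigs (10 + 9 + 8 = 27 bp) reach half of that, so the N50 is 8.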

Genome binning, refinement, and dereplication

Genome bins were recovered on the basis of tetranucleotide frequencies, coverage, and GC content using the MetaWRAP v1.3.2 pipeline (parameters: default)²², which includes the metagenomic binning tools MaxBin 2.0²³, MetaBAT 2²⁴ and CONCOCT v1.0.0²⁵. The binning results were refined using the MetaWRAP Bin_refinement module (parameters: -c 50 -x 10). The lineage-specific workflow of CheckM was used to estimate the completeness and contamination of these genome bins. The refined bins were dereplicated using dRep v2.6.2²⁶ (parameters: -comp 50 -con 10) at 95% average nucleotide identity (ANI).
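To illustrate two of the compositional signals that binning relies on, the following minimal sketch computes the GC content and the tetranucleotide frequency profile of a single contig. It is an illustration under simplifying assumptions, not the code used by MaxBin, MetaBAT or CONCOCT: binning tools commonly also merge reverse-complementary tetranucleotides, which is omitted here. The function names and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the unambiguous bases.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq: str) -> dict:
    """Relative frequency of each tetranucleotide in overlapping windows."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return {k: 0.0 for k in KMERS}
    return {k: counts[k] / total for k in KMERS}


# Toy contig, for illustration only.
contig = "ACGTACGTGGCC"
profile = tetranucleotide_frequencies(contig)
print(round(gc_content(contig), 3), round(profile["ACGT"], 3))
```

Contigs from the same genome tend to have similar profiles, which is why such vectors are combined with per-sample coverage when contigs are clustered into bins.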

Taxonomic classification and Phylogenomic analysis of MAGs

The 107 MAGs were classified with the classify_wf workflow of GTDB-Tk v2.0.0²⁷ using GTDB release 207 (parameters: default). A phylogenetic tree of the 100 species-level bacterial MAGs was constructed from 120 bacterial marker genes using the gtdbtk infer module in GTDB-Tk (parameters: default). The tree was annotated and visualized with iTOL v5²⁸.
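GTDB-style classifications are semicolon-separated lineage strings with rank prefixes (d__, p__, c__, o__, f__, g__, s__), and a rank with no assigned name is left empty after its prefix. The following minimal sketch, which assumes that string format, reports the highest rank at which a MAG lacks a named GTDB entry. The function name and the example lineages are hypothetical placeholders and are not taken from Table S2.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def first_unnamed_rank(classification: str):
    """Return the highest rank with an empty name, or None if all ranks are named."""
    fields = classification.split(";")
    for rank, field in zip(RANKS, fields):
        # Each field looks like "p__Proteobacteria"; an unassigned rank is just "g__".
        _prefix, _sep, name = field.partition("__")
        if not name.strip():
            return rank
    return None


# Hypothetical lineage strings with placeholder names, for illustration only.
novel_family = "d__Bacteria;p__Proteobacteria;c__ExampleClass;o__ExampleOrder;f__;g__;s__"
fully_named = (
    "d__Bacteria;p__Proteobacteria;c__ExampleClass;o__ExampleOrder;"
    "f__ExampleFamily;g__ExampleGenus;s__ExampleGenus sp001"
)

print(first_unnamed_rank(novel_family))
print(first_unnamed_rank(fully_named))
```

Counting the returned ranks over all MAGs gives a per-rank tally of candidate novel taxa of the kind summarized in the Background & Summary.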

Data Records

The raw reads and MAGs of these metagenomic datasets have been deposited in NCBI under BioProject ID PRJNA1131620²⁹, with Sequence Read Archive (SRA) accession number SRP517910³⁰. The MAGs are additionally available from figshare³¹.

Technical Validation

To avoid contamination of the samples, all sampling tools and containers were sterilized before sampling. Once obtained, the samples were immediately stored at −80 °C and kept away from light. DNA extraction was carried out in a dedicated laboratory area, and the entire sample processing was completed within 48 hours. We consistently used the PowerSoil DNA Isolation Kit for sediment samples from the same batch to ensure uniformity. To guarantee the integrity of the assembled contigs, a range of k-mer sizes was used during the MEGAHIT assembly process (from 21 to 141, in steps of 10). Following assembly, rigorous binning standards were applied, and the sequences obtained after binning were re-assembled to ensure the highest possible quality of the resulting data. The completeness and contamination of the draft genomes were validated using CheckM.
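The k-mer schedule described above (21 to 141 in steps of 10) can be enumerated explicitly. The following minimal sketch assumes, as MEGAHIT's help text describes, that k-mer sizes must be odd, so that an odd minimum combined with an even step keeps every size valid. The helper name is hypothetical, and the comma-separated string is only an illustration of how such a schedule could be written out.

```python
def kmer_series(k_min: int, k_max: int, k_step: int) -> list:
    """Enumerate the k-mer sizes from k_min up to k_max in increments of k_step."""
    # An odd start plus an even step guarantees that every k in the series is odd.
    if k_min % 2 == 0:
        raise ValueError("k_min must be odd")
    if k_step % 2 != 0:
        raise ValueError("k_step must be even")
    return list(range(k_min, k_max + 1, k_step))


ks = kmer_series(21, 141, 10)
k_list = ",".join(str(k) for k in ks)
print(len(ks), k_list)
```

The schedule contains 13 k-mer sizes, all odd, ending exactly at 141.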

Usage Notes

Investigating the microorganisms in seamount sediments is crucial for understanding microbial ecology and evolution. This study provides comprehensive metagenomic and microbial genomic datasets from seamount sediments along the Yap and Mariana trenches. These datasets were acquired using a next-generation sequencing platform and a commonly used metagenomic analysis pipeline. Detailed information about the samples is provided in Table 1. Statistics for the MAGs are listed in Tables S1 and S2.

Ethics approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Code availability

Custom scripts used to generate or process this dataset have been deposited in figshare (https://doi.org/10.6084/m9.figshare.26139184.v1). Software versions and non-default parameters are specified where required.

References

1. Staudigel, H., Koppers, A. A., Lavelle, J. W., Pitcher, T. J. & Shank, T. M. Defining the word “seamount”. Oceanography 23(1), 20–21 (2010).
2. Sonnekus, M. J., Bornman, T. G. & Campbell, E. E. Phytoplankton and nutrient dynamics of six south West Indian Ocean seamounts. Deep Sea Res., Part II 136, 59–72 (2017).
3. Morato, T., Hoyle, S. D., Allain, V. & Nicol, S. J. Seamounts are hotspots of pelagic biodiversity in the open ocean. Proc. Natl. Acad. Sci. USA 107, 9707–9711 (2010).
4. Liu, J. et al. Bacterial community structure and novel species of magnetotactic bacteria in sediments from a seamount in the Mariana volcanic arc. Sci. Rep. 7, 17964 (2017).
5. Ma, J. et al. Environmental characteristics in three seamount areas of the tropical western Pacific Ocean: focusing on nutrients. Mar. Pollut. Bull. 143, 163–174 (2019).
6. Crawford, A. J., Beccaluva, L., Serri, G. & Dostal, J. Petrology, geochemistry and tectonic implications of volcanics dredged from the intersection of the Yap and Mariana trenches. Earth Planet. Sci. Lett. 80, 265–280 (1986).
7. Xu, K. Exploring seamount ecosystems and biodiversity in the tropical Western Pacific Ocean. J. Oceanol. Limnol. 39, 1585–1590 (2021).
8. Sunamura, M., Higashi, Y., Miyako, C., Ishibashi, J. & Maruyama, A. Two bacteria phylotypes are predominant in the Suiyo Seamount hydrothermal plume. Appl. Environ. Microbiol. 70(2), 1190–1198 (2004).
9. Sudek, L. A., Templeton, A. S., Tebo, B. M. & Staudigel, H. Microbial ecology of Fe (hydr)oxide mats and basaltic rock from Vailulu’u Seamount, American Samoa. Geomicrobiol. J. 26, 581–596 (2009).
10. Mottl, M. J., Komor, S. C., Fryer, P. & Moyer, C. L. Deep-slab fluids fuel extremophilic Archaea on a Mariana forearc serpentinite mud volcano: Ocean Drilling Program Leg 195. Geochem. Geophys. Geosyst. 4, 9009 (2003).
11. Davis, R. E. & Moyer, C. L. Extreme spatial and temporal variability of hydrothermal microbial mat communities along the Mariana Island Arc and southern Mariana back-arc system. J. Geophys. Res. 113(B8), 325–334 (2008).
12. Zhang, Y. & Jing, H. Deterministic process controlling the prokaryotic community assembly across seamounts along in the Yap and Mariana trenches. Ecol. Indic. 158, 111538 (2024).
13. Nishimura, Y. & Yoshizawa, S. The OceanDNA MAG catalog contains over 50,000 prokaryotic genomes originated from various marine environments. Sci. Data 9, 305 (2022).
14. Zhou, L., Huang, S. H., Gong, J. Y., Xu, P. & Huang, X. D. 500 metagenome-assembled microbial genomes from 30 subtropical estuaries in South China. Sci. Data 9, 301 (2022).
15. Haroon, M. F., Thompson, L. R., Parks, D. H., Hugenholtz, P. & Stingl, U. A catalogue of 136 microbial draft genomes from Red Sea metagenomes. Sci. Data 3, 160050 (2016).
16. Bowers, R. M. et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat. Biotechnol. 35, 725–731 (2017).
17. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
18. Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 50, D785–D794 (2022).
19. Wang, J. P., Wu, Y. H., Zhou, J., Bing, H. J. & Sun, H. Y. Carbon demand drives microbial mineralization of organic phosphorus during the early stage of soil development. Biol. Fertil. Soils 52, 825–839 (2016).
20. Li, D. H., Liu, C. M., Luo, R. B., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
21. Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
22. Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).
23. Wu, Y. W., Simmons, B. A. & Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607 (2016).
24. Kang, D. W. D., Froula, J., Egan, R. & Wang, Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3, e1165 (2015).
25. Alneberg, J. et al. Binning metagenomic contigs by coverage and composition. Nature Methods 11, 1144–1146 (2014).
26. Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868 (2017).
27. Chaumeil, P. A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927 (2020).
28. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, 293–296 (2021).
29. NCBI Bioproject. https://identifiers.org/ncbi/bioproject:PRJNA1131620 (2024).
30. Nucleotide Sequence Archive. https://identifiers.org/ncbi/insdc.sra:SRP517910 (2024).
31. Figshare. https://doi.org/10.6084/m9.figshare.26139184.v1 (2024).

Acknowledgements

This work was supported by the National Key R&D Program of China (2022YFC2805505; 2022YFC2805400; 2022YFC2805304), the Innovational Fund for Scientific and Technological Personnel of Hainan Province (KJRC2023C37), and the International Partnership Program of Chinese Academy of Sciences for Big Science (183446KYSB20210002). We thank the pilots of the deep-sea HOV “Shenhaiyongshi” and the crew of the R/V “Tan Suo Yi Hao” for their professional service during cruise TS14. We also thank the Institutional Center for Shared Technologies and Facilities of IDSSE, CAS for measurements of the water chemistry.

Author information

Authors and Affiliations

Institute of Deep-sea Science and Engineering, Chinese Academy of Sciences, Sanya, China
Yue Zhang & Hongmei Jing
HKUST-CAS Sanya Joint Laboratory of Marine Science Research, Chinese Academy of Sciences, Sanya, China
Yue Zhang & Hongmei Jing

Authors

Yue Zhang
Hongmei Jing

Contributions

Y.Z.: Data analysis, manuscript writing. H.J.: Experimental Design, manuscript editing.

Corresponding author

Correspondence to Hongmei Jing.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhang, Y., Jing, H. Metagenome sequencing and 107 microbial genomes from seamount sediments along the Yap and Mariana trenches. Sci Data 11, 887 (2024). https://doi.org/10.1038/s41597-024-03762-7

Received: 09 July 2024
Accepted: 07 August 2024
Published: 15 August 2024
DOI: https://doi.org/10.1038/s41597-024-03762-7
Springer Nature Limited
