Background & Summary

The Changjiang Estuary and adjacent East China Sea, the largest marginal sea in the western Pacific1, are characterized by a complex hydrological environment in which Changjiang River freshwater, the Taiwan Warm Current, and Yellow Sea coastal water mix2,3. Hypoxia, defined as dissolved oxygen (DO) levels < 2 mg/L, poses a severe problem for marine ecosystems4. Estuarine and coastal areas often experience hypoxia due to enrichment in nutrients and organic matter5. Hypoxia has profound ecological and economic consequences, including loss of biodiversity, decline of seagrass meadows, coral bleaching and mortality, fisheries collapse, and marine mammal mortality6,7,8. Previous studies have linked the formation and development of hypoxia in estuaries and coastal waters to intensifying eutrophication caused by anthropogenic nutrient input9,10,11. In the East China Sea, hypoxia frequently occurs off the Changjiang Estuary, covering up to approximately 15,000 km² in area12. The hypoxic area in the Changjiang Estuary and adjacent East China Sea is considered one of the largest low-oxygen zones globally13,14. The recurring hypoxic events in the Changjiang Estuary offer a natural laboratory for studying how hypoxia influences microbial diversity and function.

Marine microorganisms are highly abundant and have a massive collective biomass, playing significant roles in biogeochemical cycles15. Hypoxia forces energy flow to shift increasingly from predators at higher trophic levels towards microbial communities at lower trophic levels, influencing carbon and nutrient cycling16,17. Studies have shown that microbial life can thrive in low-oxygen environments18,19. Marine microorganisms are often the first to respond to and recover from ecological disturbances, making them crucial bioindicators for assessing environmental shifts due to anthropogenic and natural factors20. Much of the current knowledge about microbial communities in low-oxygen regions focuses on oxygen minimum zones7,21,22,23. However, research on microbial community structure and metabolic functions in coastal hypoxic zones remains limited.

This study conducted a comprehensive spatiotemporal sampling expedition in the Changjiang Estuary and adjacent East China Sea from July to August 2022 (Fig. 1). Sample metadata are available in Table S1. Time-series sampling covering a 22-day period was conducted at S07 and S08. Bottom-water DO concentrations ranged from 2.73 to 4.54 mg/L, indicating the presence of low-oxygen environments. At S07 and S08, we collected size-fractionated bacteria (free-living: 0.2–3 μm; particle-associated: 3–20 μm) to distinguish different bacterial lifestyles. We then sampled from the Changjiang Estuary to offshore waters, where bottom-water DO concentrations ranged from 1.12 to 3.99 mg/L. Hypoxia was observed in the bottom layers of S03 and S09, with DO concentrations of 1.12 and 1.98 mg/L, respectively. Free-living samples (0.2–3 μm) were collected at these sites. The metagenomes were sequenced, each yielding approximately 10.2–19.6 Gbp of raw data. In total, Illumina shotgun metagenome sequencing yielded 1.31 Tbp across 103 samples corresponding to 8 vertical profiles. We further reconstructed 1,559 metagenome-assembled genomes (MAGs). Among them, 32.5% (n = 508) were high-quality (completeness > 90% and contamination < 10%), and 46.7% (n = 728) were medium-quality (completeness 70–90% and contamination < 10%). Taxonomic classification based on the Genome Taxonomy Database (GTDB) assigned these MAGs to 5 archaeal phyla (181 MAGs) and 18 bacterial phyla (1,378 MAGs). Pseudomonadota (n = 658), Bacteroidota (n = 330), Actinomycetota (n = 251), Verrucomicrobiota (n = 37), and Planctomycetota (n = 30) were the dominant bacterial phyla (Fig. 2 and Table S2). The phylum Pseudomonadota was the most abundant, with MAGs assigned to Alphaproteobacteria (n = 319) and Gammaproteobacteria (n = 339). Archaeal phyla included Thermoplasmatota (n = 118), Thermoproteota (n = 52), Nanoarchaeota (n = 5), Asgardarchaeota (n = 4), and Huberarchaeota (n = 2) (Fig. 3 and Table S2). Detailed information on the MAGs is provided in Table S2. To our knowledge, this is the first report of a dataset of multiple MAGs recovered from coastal hypoxic areas. This collection of microbial genomes from deoxygenated seawater offers valuable insights into the species diversity, structure, and function of these communities, potentially revealing the ecological roles of specific taxa in estuarine-offshore environments.
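
The quality tiers used above (high-quality: completeness > 90% and contamination < 10%; medium-quality: completeness 70–90% and contamination < 10%) can be illustrated with a minimal Python sketch. The class name, the function name, the "other" label for MAGs outside both tiers, and the example values are hypothetical and are not taken from Table S2; treating the 70–90% range as inclusive at both ends is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MagQuality:
    """Quality estimates for one MAG, in percent (e.g. from a CheckM2 report)."""
    name: str
    completeness: float
    contamination: float


def quality_tier(mag: MagQuality) -> str:
    """Return 'high', 'medium', or 'other' using the thresholds in the text."""
    if mag.contamination >= 10.0:
        return "other"
    if mag.completeness > 90.0:
        return "high"
    # Assumption: the 70-90% range is treated as inclusive at both ends.
    if mag.completeness >= 70.0:
        return "medium"
    return "other"


# Hypothetical MAGs, for illustration only (not values from Table S2).
example_mags = [
    MagQuality("MAG_A", 96.4, 1.2),
    MagQuality("MAG_B", 82.0, 3.5),
    MagQuality("MAG_C", 90.0, 0.5),
    MagQuality("MAG_D", 97.0, 12.0),
    MagQuality("MAG_E", 55.3, 0.8),
]

# Count how many example MAGs fall into each tier.
tier_counts = Counter(quality_tier(m) for m in example_mags)
print(tier_counts)
```

Applied to a table of completeness and contamination estimates, the same function would yield per-tier counts of the kind reported in this section.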

Fig. 1

Sampling sites in the Changjiang Estuary and adjacent East China Sea. The black dots represent stations where samples were collected from various depths. The inset map highlights time-series sampling conducted at stations S07 and S08 near Dongfushan Island. Detailed sample metadata can be found in Table S1.

Fig. 2

Phylogenomic tree of the 1,378 bacterial MAGs reconstructed from the Changjiang Estuary and adjacent East China Sea. A set of 120 universally conserved bacterial marker genes was used to build this maximum-likelihood phylogenomic tree. Detailed taxonomic assignments of the MAGs, together with completeness and contamination information, can be found in Table S2.

Fig. 3

Phylogenomic tree of the 181 archaeal MAGs reconstructed from the Changjiang Estuary and adjacent East China Sea. A set of 122 universally conserved archaeal marker genes was used to build this maximum-likelihood phylogenomic tree. Detailed taxonomic assignments of the MAGs, together with completeness and contamination information, can be found in Table S2.

Methods

Sample sites and sample collection

A total of 103 water samples were collected from July to August 2022 in the Changjiang Estuary and adjacent East China Sea. Environmental factors, including depth, salinity, temperature, pH, chlorophyll a concentration, and DO concentration, were obtained with conductivity-temperature-depth (CTD) sensors. For each sample, about 5 L of seawater was pre-filtered through a 20 μm pore-size filter and then filtered through 3 μm and 0.2 μm pore-size polycarbonate membranes (Millipore, USA) for metagenomic analysis. All filters were collected, flash-frozen in liquid nitrogen, and stored at −80 °C.

DNA extraction, metagenomic sequencing

Total microbial DNA was extracted from the filter membranes using the DNeasy PowerSoil Kit (MoBio, USA), and its quality and concentration were assessed with a Quantus™ Fluorometer (Promega, USA) after 1% agarose gel electrophoresis. The genomic DNA was fragmented into segments of about 500 bp using a Covaris M220 (Covaris, USA). Libraries were then constructed following the manufacturer’s protocol with the NovaSeq 6000 Reagent Kit (Illumina, USA). Metagenomic sequencing was conducted using Illumina PE150 chemistry on the NovaSeq 6000 at Hanyu Bio-Tech (Shanghai, China).

Metagenomic assembly, gene prediction and annotation, and genome binning

For each metagenomic dataset, Trimmomatic v0.36 was used for quality control to remove adaptors and low-quality reads24. Clean reads were de novo assembled into contigs using MEGAHIT v1.1.3 with default parameters25. Genes were predicted from the contigs using Prodigal v2.6.3 (-p meta)26. Predicted genes from all samples were combined, and CD-HIT v4.8.1 was used to remove redundant genes (-c 0.95, -aS 0.9) to construct a non-redundant gene catalog27. The abundance of each gene across all samples was calculated using Salmon v1.10.128. For genome binning, MAGs were reconstructed from the metagenomic data as described in a previous study29. Bowtie2 v2.3.230 was used to align clean reads to contigs longer than 1,500 bp, and the resulting BAM files were sorted using SAMtools v.1.731. We used MetaBAT2 v.2.10.2 to calculate contig coverage and carry out binning, using eight combinations of the parameters maxP, minS, and maxEdges (60, 60, 200; 95, 60, 200; 60, 95, 200; 60, 60, 500; 95, 60, 500; 60, 95, 500; 95, 95, 200; and 95, 95, 500)32. We then used DAS_Tool v.1.0 to dereplicate and aggregate these binning results into a set of accurate bins33, and assessed the quality of these MAGs (including completeness, contamination, and strain heterogeneity) using CheckM2 v.1.0.134.
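
The eight MetaBAT2 parameter sets listed above are the full combination of two values each for maxP (60, 95), minS (60, 95), and maxEdges (200, 500). The following Python sketch enumerates that grid and builds one command line per combination. It is an illustration under stated assumptions rather than the exact commands used in this study: the input and output file names are hypothetical, and the option names (`-i`, `-a`, `-o`, `--maxP`, `--minS`, `--maxEdges`) should be checked against `metabat2 --help` for the installed version.

```python
from itertools import product

# Parameter values taken from the eight combinations listed in the text.
MAX_P = (60, 95)
MIN_S = (60, 95)
MAX_EDGES = (200, 500)


def metabat2_commands(contigs: str, depth: str, out_dir: str) -> list[list[str]]:
    """Build one MetaBAT2 argument list per (maxP, minS, maxEdges) combination.

    `contigs`, `depth`, and `out_dir` are hypothetical paths supplied by the caller.
    """
    commands = []
    for max_p, min_s, max_edges in product(MAX_P, MIN_S, MAX_EDGES):
        # Separate output prefix per run so the bin sets do not overwrite each other.
        prefix = f"{out_dir}/maxP{max_p}_minS{min_s}_maxEdges{max_edges}/bin"
        commands.append([
            "metabat2",
            "-i", contigs,   # assembled contigs (FASTA)
            "-a", depth,     # per-contig coverage table
            "-o", prefix,
            "--maxP", str(max_p),
            "--minS", str(min_s),
            "--maxEdges", str(max_edges),
        ])
    return commands


# Hypothetical file names, for illustration only.
commands = metabat2_commands("contigs.fa", "depth.txt", "bins")
for cmd in commands:
    print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run`, and the resulting bin sets would then be the inputs that a tool such as DAS_Tool aggregates.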

Taxonomic classification and genome tree construction

The taxonomic affiliation of the 1,559 MAGs was determined using GTDB-Tk v2.1.1 with the reference database GTDB r2.1.435,36. GTDB-Tk identified marker genes (120 bacterial and 122 archaeal conserved genes), which were used to infer phylogenetic relationships37. The reference trees for the 181 archaeal and 1,378 bacterial MAGs were then generated using FastTree v2.1.1038. The trees were visualized and annotated using ChiPlot (https://www.chiplot.online/)39.

Data Records

The raw metagenome reads generated in this study have been deposited and are publicly available in the NCBI Sequence Read Archive (SRA) database under BioProject number PRJNA108158340 and accession number SRP49221941. The reconstructed MAGs have been uploaded to the NCBI GenBank database under the same project PRJNA108158340 with accession numbers JBERCU000000000-JBERRB000000000, JBERRC000000000-JBESFQ000000000, JBESFR000000000-JBESUD000000000, JBESUE000000000-JBETJF000000000, and JBETJG000000000-JBETKS000000000. The MAG sequences, MAG trees, and non-redundant gene catalog generated in this study have been deposited at Figshare42.

Technical Validation

The quality of the metagenomic datasets and contig assemblies was assessed using QUAST v5.2.043, and the quality of the MAGs (including completeness, contamination, and strain heterogeneity) was assessed using CheckM2 v.1.0.134. The taxonomic assignments of the MAGs, such as Roseobacteraceae, Flavobacteriaceae, SAR11, and SAR86, were consistent with the typical composition of marine microorganisms in coastal seawater44,45. The novelty of the MAGs was assessed through comparison with OceanDNA46 and Tara Oceans47 using dRep v3.4.248 at 95% average nucleotide identity.