Introduction

The Qinghai-Tibetan Plateau contains the largest permafrost region in China [1] and undergoes seasonal freezing and thawing [2]. Light and precipitation are abundant in summer (from May to September), whereas winter temperatures are low, with slow decomposition and ecosystem respiration; together these conditions have created a unique and complex ecosystem [3]. Microorganisms, as producers and decomposers, are pivotal components of microbial food chains and play a significant role in the ecology and biogeochemistry of permafrost ecosystems. Many key processes in the soil, including nitrogen, sulfur, and phosphorus metabolism, may be regulated by the local bacteria. Understanding the molecular profile of these bacteria will help to explain their physiological functions in a given ecosystem.

The genus Qipengyuania, belonging to the family Erythrobacteraceae in the class Alphaproteobacteria, contains only one species, Qipengyuania sediminis, which is also the type species of the genus [1]. The type strain was isolated from a borehole sediment sample collected from the Qiangtang Basin on the Qinghai-Tibetan Plateau, China [1]. The ecological roles of bacteria belonging to the genus Qipengyuania have not been reported previously.

Currently, strain CGMCC 1.12928T is the only strain affiliated with the species Q. sediminis (16S rRNA gene similarity > 97%) found by searching the NCBI database. However, candidate Q. sediminis strains from different habitats, such as soil, dolomite rock, coal bed, air, sediment, drinking water [4], and gastrointestinal specimens [5], have been detected by culture-independent approaches. Interestingly, several of these strains came from harsh environments, such as the biological soil crust of copper mine tailings wastelands, soil contaminated with anthracene [6], and heavy metal-contaminated estuarine sediment. In this study, the genome of Q. sediminis strain CGMCC 1.12928T, the type strain of the type species of the genus Qipengyuania, is reported and its ecological roles are analyzed.

Organism Information

Strain CGMCC 1.12928T was purchased from the China General Microbiological Culture Collection Center (CGMCC). Cells are Gram-reaction-negative, facultatively aerobic, and chemoheterotrophic [1]. The strain is routinely cultivated in Marine Broth 2216 (MB, Difco) at 37 °C. For long-term storage, it can be preserved at −80 °C in MB supplemented with 30% (v/v) glycerol. General features of strain CGMCC 1.12928T are summarized in Table 1.

Table 1 Classification and general features of Q. sediminis CGMCC 1.12928T according to the MIGS recommendations

Genome Sequencing, Assembly, and Annotation

Strain CGMCC 1.12928T was cultured in R2A at 30 °C. When the culture reached the late exponential phase (about 48 h), the cells were collected by centrifugation at 10,000×g for 10 min. High-quality genomic DNA was extracted using the AxyPrep™ Bacterial Genomic DNA Miniprep Kit (Axygen®, Corning). The quality and concentration of the total DNA were measured with a NanoDrop spectrophotometer (ND-2000, Thermo Scientific™). The draft genome was sequenced using paired-end sequencing on the Illumina HiSeq PE150 platform (Novogene Bioinformatics Technology Co. Ltd, Beijing). Sequencing generated approximately 1.0 Gbyte of clean data, with a genome coverage of 280×. De novo assembly of the reads was performed with ABySS 1.5.2 [7], and a single contig was obtained. To join the two ends of this contig, PCR amplification was carried out with a designed primer pair (Qse-F: ACTATGAGACCAGGACGC, Qse-R: TTCGGCAGGTGAGTTGT) and PrimeSTAR GXL DNA polymerase (TaKaRa, Dalian, PR China). The PCR product was sequenced by Sanger sequencing, and the resulting sequence overlapped both the front and the rear of the initial contig.
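The overlap check described above can be illustrated with a minimal Python sketch. It assumes that the Sanger read lies on the same strand as the contig and matches it exactly, which real data rarely do; the function name `spans_junction`, the `min_overlap` parameter, and the toy sequences are hypothetical and are not taken from the genome or from the pipeline used in this study.

```python
def spans_junction(contig, amplicon, min_overlap=5):
    """Return True if the amplicon reads off the rear of the contig and
    continues into its front, with at least min_overlap bases on each side."""
    for split in range(min_overlap, len(amplicon) - min_overlap + 1):
        rear_part, front_part = amplicon[:split], amplicon[split:]
        # The left part must match the contig end, the right part its start.
        if contig.endswith(rear_part) and contig.startswith(front_part):
            return True
    return False


# Toy sequences for illustration only (not from the Q. sediminis genome).
toy_contig = "ATGCCGTTAGGCTAACCGGTTCAGTACGGA"
toy_amplicon = toy_contig[-8:] + toy_contig[:8]  # rear 8 bases + front 8 bases

print(spans_junction(toy_contig, toy_amplicon))  # junction is bridged
print(spans_junction(toy_contig, "G" * 16))      # unrelated read
```

A read that passes such a check is consistent with a circular molecule, because it links the last bases of the contig directly to the first ones.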

The ribosomal RNA (rRNA) genes and transfer RNA (tRNA) genes were identified with the RNAmmer version 1.2 server (http://www.cbs.dtu.dk/services/RNAmmer/) [8] and tRNAscan-SE version 1.3 (http://lowelab.ucsc.edu/tRNAscan-SE/) [9], respectively. Open reading frames (ORFs) were predicted and the translated ORFs were functionally annotated using the Rapid Annotation using Subsystem Technology (RAST) online server [10]. Functional genes of interest were categorized via SEED subsystems. Gene functional categories were predicted using the Clusters of Orthologous Groups of proteins (COG) database [11] to elaborate their putative functions. Genes with signal peptides were identified with the SignalP version 4.1 server (http://www.cbs.dtu.dk/services/SignalP/) [12], and genes with transmembrane helices were identified with the TMHMM server v.2.0 (http://www.cbs.dtu.dk/services/TMHMM/) [13]. Metabolic pathways were predicted with the Kyoto Encyclopedia of Genes and Genomes (KEGG) automatic annotation server using the bi-directional best hit (BBH) method [14]. Clustered regularly interspaced short palindromic repeat (CRISPR) structures were searched for with CRISPRfinder (http://crispr.i2bc.paris-saclay.fr/Server/) [15]. The circular map of strain CGMCC 1.12928T was visualized using CGView [16].

Results and Discussion

The complete genome of strain CGMCC 1.12928T consists of a single circular chromosome (Fig. 1) of 2,416,000 bp with an average G + C content of 66.7 mol%, which is lower than the previously reported value (73.7 mol%) [1]. The genome contains 2414 genes, including 2367 CDSs, 44 tRNA genes, and one operon of 16S-23S-5S rRNA genes. According to the RAST online annotation, 1927 CDSs (79.4% of the total CDSs) could be assigned to the COG database. The main COG categories of strain CGMCC 1.12928T were category E (11.2%), category C (10.7%), category J (10.2%), category L (9.9%), category K (9.6%), category G (9.5%), and category P (8.8%). In addition, 1198 CDSs (50.6%) were assigned to the KEGG database, and metabolic pathways were the predominant pathways. The genome features and COG classification are summarized in Tables 2 and 3, respectively.
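As a small worked example, the gene tally and the KEGG share reported above can be reproduced in a few lines of Python, together with a generic G + C calculation. This is a sketch only: the helper `gc_percent` and the variable names are hypothetical, the counts are those stated in the text, and the sequence passed to `gc_percent` is a toy string rather than the genome itself.

```python
def gc_percent(seq):
    """G + C content of a nucleotide sequence in percent (A/C/G/T only)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Counts as reported in the text.
CDS = 2367
TRNA = 44
RRNA = 3          # one 16S-23S-5S operon contributes three rRNA genes
KEGG_CDS = 1198

total_genes = CDS + TRNA + RRNA
kegg_share = round(100.0 * KEGG_CDS / CDS, 1)

print(total_genes)         # 2414
print(kegg_share)          # 50.6
print(gc_percent("GGCA"))  # 75.0 for this toy sequence
```

Applied to the full chromosome sequence, `gc_percent` would give a sequence-based G + C value comparable to the 66.7 mol% reported here.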

Fig. 1 Circular map of the chromosome in the genome of Q. sediminis CGMCC 1.12928T

Table 2 Genome features of Q. sediminis CGMCC 1.12928T
Table 3 COG classification of Q. sediminis CGMCC 1.12928T

Currently, 27 genomes of Erythrobacteraceae species are available. Here, we compare the genome of strain CGMCC 1.12928T with those of other Erythrobacteraceae strains (Table S1). The genome of strain CGMCC 1.12928T (2.4 Mb) is smaller than those of the other Erythrobacteraceae strains (2.59–4.11 Mb). In addition, the G + C content of strain CGMCC 1.12928T (66.7 mol%) is higher than that of the other Erythrobacteraceae strains except for Altererythrobacter sp. NS1 (67.0 mol%), Erythrobacter sp. HL-111 (68.1 mol%), and Porphyrobacter sp. CACIAM 03H1 (67.6 mol%).

Biogeochemically relevant genes were found in the genome of strain CGMCC 1.12928T, shedding light on its role in element cycling, especially the utilization of nitrogen, sulfur, and phosphorus.

(i) Nitrogen Metabolism. No ammonium, nitrate, or nitrite transporters were detected in the genome of strain CGMCC 1.12928T, suggesting that it cannot import ammonium, nitrate, or nitrite from the extracellular environment. However, the presence of amino acid permeases and ABC transporters (leucine/isoleucine/valine) indicates that strain CGMCC 1.12928T may utilize amino acids as a source of organic nitrogen. The nitrate reductase and nitrite reductase genes of nitrate/nitrite assimilation were not detected, which is consistent with the previous report that strain CGMCC 1.12928T is negative for nitrate reduction [1]. Urease, encoded by the ureA, ureB, and ureC genes, is absent from the genome, suggesting that the strain is unable to utilize urea as a nitrogen source [17].

(ii) Sulfur Metabolism. Genes for assimilatory sulfate reduction and cysteine biosynthesis are annotated in the genome of strain CGMCC 1.12928T, including sulfate adenylyltransferase (cysND), adenylylsulfate kinase (cysC), phosphoadenosine phosphosulfate reductase (cysH), NADPH-dependent sulfite reductase (cysJI), and cysteine synthase (cysK), indicating that inorganic sulfur can be incorporated into an organic compound (cysteine). In contrast, the genome lacks the transporter genes responsible for acquiring extracellular alkanesulfonates and the genes involved in alkanesulfonate assimilation, suggesting that the strain is unable to utilize this form of organic sulfur.

(iii) Phosphorus Metabolism. Alkaline phosphatase activity was confirmed in strain CGMCC 1.12928T using API ZYM (bioMérieux) [1], and one alkaline phosphatase gene was annotated in the genome. The genome possesses a high-affinity phosphate acquisition system (pstSCAB) and a regulatory system (phoUBR) related to the phosphorus cycle, which mediate inorganic phosphate acquisition to ensure a sufficient supply [18] under phosphate-limited conditions [19]. The genome also harbors genes encoding polyphosphate kinase (ppk), which enables the synthesis and utilization of inorganic polyphosphate [20]. The enzyme is of the PPK2 type, which uses polyphosphate as a substrate to generate GTP from GDP and has higher polyphosphate utilization activity. PPK could therefore regulate the availability of phosphorus by synthesizing or degrading polyphosphate. However, the genome lacks genes for the transport (phnCDE) and cleavage (phnGHIJKLN) of organic phosphate [21]. These features indicate that strain CGMCC 1.12928T may prefer inorganic phosphate to organic phosphate [22, 23].
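The presence/absence reasoning used in items (i) to (iii) can be summarized in a short Python sketch. It is illustrative only and is not the annotation workflow used in this study: the helper names `expand` and `module_status`, the module labels, and the three status words are assumptions made for the example, while the gene sets are those named in the text. Note that a gene missing from an annotation suggests, but does not prove, that the function is absent.

```python
def expand(operon):
    """Expand shorthand such as 'pstSCAB' into ['pstS', 'pstC', 'pstA', 'pstB']."""
    prefix, suffixes = operon[:3], operon[3:]
    return [prefix + letter for letter in suffixes]


# Genes that the text reports as annotated in the genome.
DETECTED = set(
    expand("cysND") + expand("cysC") + expand("cysH") + expand("cysJI")
    + expand("cysK") + expand("pstSCAB") + expand("phoUBR")
)

# Gene sets per function; the labels are hypothetical names for this example.
MODULES = {
    "sulfate assimilation to cysteine": expand("cysNDCHJIK"),
    "urease": expand("ureABC"),
    "high-affinity phosphate uptake": expand("pstSCAB"),
    "phosphate regulation": expand("phoUBR"),
    "phn transport": expand("phnCDE"),
    "phn cleavage": expand("phnGHIJKLN"),
}


def module_status(detected, modules):
    """Label each module as complete, partial, or absent in the gene set."""
    status = {}
    for name, genes in modules.items():
        found = [gene for gene in genes if gene in detected]
        if len(found) == len(genes):
            status[name] = "complete"
        elif found:
            status[name] = "partial"
        else:
            status[name] = "absent"
    return status


for name, state in module_status(DETECTED, MODULES).items():
    print(f"{name}: {state}")
```

With the gene sets listed in the text, the cys, pst, and pho modules are labeled complete and the ure and phn modules absent, which mirrors the inferences drawn above.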

Besides element metabolism, the genome of strain CGMCC 1.12928T was mined for genes potentially involved in cold adaptation, which fall into three functional categories. Two-component regulatory systems, comprising sensor histidine kinases and response regulator proteins, play a vital role in cold adaptation [24] by detecting and responding to changes in the extra- or intracellular environment [25]. Temperature stimuli induce differential expression of two-component regulatory systems in bacteria. The genome of strain CGMCC 1.12928T contains four copies of two-component system sensor histidine kinase genes and one copy of a response regulator gene. In addition, seven copies of histidine kinase genes were predicted in the genome. Previous studies have reported a role for pigments in improving bacterial survival in cold environments [26]. Strain CGMCC 1.12928T does not have the violacein-encoding gene cluster (vioABCDE) and therefore lacks the violet pigment. In contrast, the strain produces a yellow, carotenoid-like pigment [1], and genes of the carotenoid biosynthesis pathway were detected, including one copy of the phytoene synthase gene as well as phytoene desaturase and lycopene beta-cyclase genes. The presence of these genes may help to maintain homeostasis and promote adaptability when the temperature changes [27]. The genome of strain CGMCC 1.12928T also harbors DNA repair genes (recN, recO, radA, mutS, and deoxyribodipyrimidine photolyase) and chaperone genes (cbpA, dnaK, dnaJ, clpB, ecmE, htrA), which can be induced in bacteria by a rapid decrease in growth temperature [25]. Low temperatures have been reported to promote expression of the groEL gene in a cyanobacterial strain [28]. Similarly, strain CGMCC 1.12928T harbors one copy of the molecular chaperone gene groEL and the co-chaperonin gene groES, which may participate in cold adaptation.

Conclusions

The complete genome of strain CGMCC 1.12928T consists of a single circular chromosome. Genomic properties indicate that strain CGMCC 1.12928T has a small genome and a high G + C content within the family Erythrobacteraceae. In addition, genomic analysis reveals that the strain contains multiple functional genes involved in the nitrogen, sulfur, and phosphorus cycles. These findings will improve our understanding of the ecological adaptation and response behavior of the genus Qipengyuania. In the future, the complete genome sequence of this strain could enable further research into the molecular mechanisms of its survival strategy and may improve understanding of the biogeochemical role of this bacterium in high-altitude environments.

Nucleotide Sequence Accession Number

The complete genome sequence of Q. sediminis CGMCC 1.12928T was deposited in DDBJ/EMBL/GenBank under the accession number CP037948.