Introduction

Antarctic terrestrial ecosystems are cold, dry and low-nutrient environments, with drastic temperature fluctuations and paradoxically low levels of water availability (Pearce 2008). Due to this harsh climate, plant growth is largely limited to the ice-free areas around the coastal fringe of continental Antarctica, which are generally small and isolated. In these regions, cryptogams (mosses and lichens) and two vascular plant species constitute the predominant (though sparse) terrestrial vegetation (Skotnicki et al. 2005; Bokhorst et al. 2007). Recent studies have shown that these terrestrial plants are of ancient origin and have persisted in evolutionary isolation for tens of millions of years (Convey and Stevens 2007). To successfully colonize these low-temperature environments, they have evolved a number of strategies that range from the molecular and whole-cell levels to the ecosystem level. The process of genetic change that accumulates over a time scale of many generations in response to an organism’s specific environmental niche is termed adaptation (Morgan-Kiss et al. 2006). In contrast to acclimation, adaptation may involve intrinsic changes in the functional genes of Antarctic organisms. Therefore, these Antarctic plants offer exceptional opportunities for discovering new functional genes and gaining novel insights into the mechanisms of stress resistance, as well as genetic evolution, under extreme conditions.

Mosses have a dominant haploid vegetative stage with stems and leaves, can reproduce both sexually and asexually, and can regenerate from almost any part. The moss flora comprises relatively few species of wide ecological amplitude and is widespread across the Antarctic continent (Skotnicki et al. 2000). In the Antarctic environment, moss can be exposed to freezing temperatures (below −7 °C) while in full sunlight (up to 2000 μmol quanta m−2 s−1), particularly in the late summer months when the snow cover has melted (Lovelock et al. 1995). The Chinese Great Wall Station is located on the Fildes Peninsula of King George Island, where mosses grow abundantly. During China’s 24th Antarctic expedition, we collected 12 moss samples in the vicinity of the Great Wall Station. These mosses were routinely cultured in an illumination incubator at temperatures below 16 °C and could still flourish at temperatures down to 4–6 °C, suggesting that they have a strong ability to acclimate to low temperatures.

Previously, we constructed and analyzed the library of expressed sequence tags (ESTs) from Arctic moss to gain insight into the transcriptome involved in various cellular processes (Liu et al. 2010). The emergence of next-generation sequencing technology as a cutting-edge approach for high-throughput sequence determination has dramatically improved the efficiency and speed of gene discovery (Schuster 2008). One example is the Illumina sequencing technology, which is able to generate over one billion bases of high-quality DNA sequence per run at less than 1 % of the cost of capillary-based methods (Huang and Marth 2008). Furthermore, this next-generation sequencing technique has also significantly accelerated and improved the sensitivity of gene expression profiling and is expected to boost collaborative and comparative genomics studies (Blow 2009). It has been demonstrated that by performing Illumina sequencing of the transcriptomes of organisms with completed genomes, the relatively short reads produced can be effectively assembled and used for gene discovery and the comparison of gene expression profiles (Rosenkranz et al. 2008). Despite their obvious potential, these next-generation sequencing methods have yet to be applied to Antarctic organisms.

Mosses are the dominant organisms in Antarctic terrestrial ecosystems. The ability of Antarctic mosses to adapt to the extreme polar environment is attributed to distinct functional genes and to gene expression profiles that change under stress conditions. However, the genomic sequence resources available for Antarctic mosses are scarce (Peck et al. 2005). Therefore, in this study, we analyzed the transcriptome of the Antarctic moss Pohlia nutans in response to cold stress using Illumina ultra-high-throughput sequencing. The discovery of functional genes and gene families should offer a better understanding of the mechanism of stress acclimation and provide a basis for engineering strategies to improve stress tolerance in agricultural crops.

Materials and methods

Plant materials and cold stress treatments

The Antarctic moss Pohlia nutans was collected from the terrain near the Great Wall Station of China in Antarctica (69.8S, 77.8E) during the 24th Antarctic expedition of China in 2008. The samples were placed in plastic containers and transported to China, where they were cultivated on soil media in an illumination incubator at 16.5 °C and 90 % relative humidity, under continuous light at an intensity of 80 μmol photons m−2 s−1. The mosses cultivated under these conditions were used as the experimental control. For the cold treatments, the mosses were transferred to 4 °C and incubated for 6 h. The gametophyte tissues (moss leaves and stems) were frozen immediately in liquid nitrogen. The moss samples were ground in liquid nitrogen with a mortar and pestle, and the ground powder was transferred into 1.5-ml Eppendorf tubes and stored at −80 °C.

RNA extraction and library preparation for transcriptome analysis

The total RNA was extracted from the moss powder ground in liquid nitrogen using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), following the manufacturer’s standard protocol. The final total RNA was dissolved in 200 μL of RNase-free water. The concentration of the total RNA was determined with a NanoDrop spectrophotometer (Thermo Scientific, USA), and the RNA integrity number (RIN) was checked using the RNA 6000 Pico LabChip kit on an Agilent 2100 Bioanalyzer (Agilent, USA).

The total RNA was incubated with 10 U of DNase I (Ambion, Austin, TX, USA) at 37 °C for 1 h, and then nuclease-free water was added to bring the sample volume to 250 μL. The messenger RNA was further purified with the MicroPoly(A) Purist™ Kit (Ambion), following the manufacturer’s manual. The mRNA was dissolved in 100 μL of RNA Storage Solution and the final concentration was determined using a NanoDrop spectrophotometer.

The RNA was fragmented at 95 °C for 5 min and annealed with Biotinylated Random Primers which contained the Illumina adapter sequencing primer. The RNA fragments were then captured on Streptavidin Magnetic Beads through interactions with the Biotinylated Random Primers. An additional Illumina adapter sequencing primer was added to the 5′ end of the cleaved mRNA fragments with T4 RNA ligase at 37 °C for 3 h. The cleaved RNA fragments were used for the first-strand cDNA synthesis using reverse transcriptase. The second-strand cDNA was synthesized using DNA polymerase I and RNase H. The products were enriched and purified via PCR to create the final Illumina sequencing cDNA library.

Transcriptome analysis

The Illumina sequencing was performed with the Illumina HiSeq 2000 platform at the Chinese National Human Genome Center (Shanghai), according to the manufacturer’s instructions (Illumina, San Diego, CA, USA). Four fluorescently labeled nucleotides and a specialized polymerase were used to determine the clusters base by base in parallel. The size of the cDNA fragment was approximately 300 bp, and both ends of the cDNA fragment were sequenced. The image deconvolution and quality value calculations were performed using Genome Analyzer SCS V2.7 software. The clean data were obtained from the raw data using the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). Any short sequences (<100 bp) were removed. The resulting high-quality sequences were assembled into non-redundant consensus sequences using Velvet and Oases software (Zerbino and Birney 2008; Schulz et al. 2012). The original data sets are available at the NCBI Short Read Archive (SRA) with the accession number SRA051595. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GACA00000000. The version described in this paper is the first version, GACA01000000.
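As an illustration only, the removal of short sequences described above can be sketched in Python. The authors used the FASTX-Toolkit for read cleaning; the functions `parse_fasta` and `filter_by_length` and the example records below are hypothetical and are not part of the original pipeline.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=100):
    """Keep only records whose sequence has at least min_len bases."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Hypothetical input: one 120-bp sequence and one 40-bp sequence.
example = ">seq_a\n" + "ACGT" * 30 + "\n>seq_b\n" + "ACGT" * 10 + "\n"
kept = filter_by_length(parse_fasta(example), min_len=100)
print([header for header, _ in kept])
```

With the 100-bp cutoff stated in the text, only the longer record is retained.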

The open reading frame was identified using an in-house developed program based on ‘GetORF’ from EMBOSS (Rice et al. 2000). Gene annotation was performed through a BLASTP search against the Swiss-Prot and GenBank databases (E value <10−3), and the best sequence was chosen based on the gene annotation. Gene ontology analysis was performed using GoPipe (Chen et al. 2005), using BLASTP against the Swiss-Prot and TrEMBL databases (E value <10−3). The metabolic pathway was constructed based on the KEGG database using the BBH (bi-directional best hit) method (Kanehisa et al. 2010). First, the analysis assigned a KO number to each protein, and then, the metabolic pathways were constructed based on the KO number.
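The bi-directional best hit (BBH) criterion mentioned above can be sketched as follows. This is a minimal illustration and not the KEGG implementation: it assumes that similarity-search results are already available as (query, subject, bit score) tuples in both directions, and all identifiers and scores are invented for the example.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward_hits, reverse_hits):
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward_hits)
    rev = best_hits(reverse_hits)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical search results: moss proteins against reference proteins,
# and the same reference proteins searched back against the moss set.
forward = [
    ("moss_1", "ref_A", 250.0),
    ("moss_1", "ref_B", 90.0),
    ("moss_2", "ref_B", 120.0),
]
reverse = [
    ("ref_A", "moss_1", 245.0),
    ("ref_B", "moss_1", 95.0),
    ("ref_B", "moss_2", 80.0),
]
pairs = bidirectional_best_hits(forward, reverse)
print(pairs)
```

Here moss_2 is not paired with ref_B, because the best hit of ref_B in the reverse search is moss_1; only reciprocal pairs are kept, and a KO number would then be transferred from the reference protein of each pair.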

Identification of the differentially expressed genes

Following the cDNA sequence assembly, the number of reads assigned to each distinct sequence in the cold-treatment group and the control group was used as an approximate estimate of the gene expression level in the gametophyte tissues of the Antarctic moss, following normalization to RPKM (Reads Per Kilobase per Million mapped reads) (Mortazavi et al. 2008). The differentially expressed genes between the two samples were identified with the DEGseq package, using the MARS (MA plot-based method with Random Sampling model) method (Wang et al. 2010a, b). A false discovery rate (FDR) ≤0.001 and an absolute value of log2(Treat/Control) ≥1 were used as the thresholds to assess the significance of differential gene expression. The differentially expressed genes were sorted according to KEGG category term and BLAST annotation.
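The normalization and thresholds described above can be illustrated with a short Python sketch. RPKM is computed as read count × 10^9 / (transcript length in bp × total mapped reads). The sketch does not reproduce the MARS model of DEGseq: the FDR value is taken as a given input, the function names are hypothetical, the read counts are invented, and a non-zero control expression is assumed.

```python
import math


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def classify(rpkm_treat, rpkm_control, fdr, fdr_cutoff=0.001, min_abs_log2=1.0):
    """Apply the FDR and log2(Treat/Control) thresholds stated in the text.

    Assumes rpkm_control > 0; the FDR must come from a statistical test
    (DEGseq in the study) and is not computed here.
    """
    log2_ratio = math.log2(rpkm_treat / rpkm_control)
    if fdr <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2:
        return "up" if log2_ratio > 0 else "down"
    return "not DE"


# Invented example: a 2,000-bp transcript in two libraries of 20 million reads.
treat = rpkm(read_count=400, length_bp=2000, total_mapped_reads=20_000_000)
control = rpkm(read_count=100, length_bp=2000, total_mapped_reads=20_000_000)
print(treat, control, classify(treat, control, fdr=1e-5))
```

In this invented case the transcript has an RPKM of 10 in the treated library and 2.5 in the control, a fourfold change (log2 ratio of 2), so it passes both thresholds and is labeled as upregulated.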

Identification of receptor-like kinases and transcription factors

The protein sequences of the predicted receptor-like kinases for Arabidopsis thaliana were downloaded from the NCBI. The resulting sequences were saved in FASTA format, and the duplicate sequences were removed. A total of 994 receptor-like kinase sequences for A. thaliana were obtained and then used to BLAST against the normalized transcriptome database of P. nutans (E value <10−3).
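The removal of duplicate sequences from the downloaded set can be sketched as follows. The text does not state whether duplicates were defined by identifier or by sequence; this sketch assumes that two records are duplicates when their sequences are identical, and the function name and example records are hypothetical.

```python
def remove_duplicate_sequences(records):
    """Keep the first record for each distinct sequence.

    records: iterable of (header, sequence) pairs.
    Assumption: duplicates are records with identical sequences
    (compared case-insensitively), regardless of their headers.
    """
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Invented example: the second record repeats the first sequence.
records = [
    ("rlk_1", "MKTLLV"),
    ("rlk_1_copy", "mktllv"),
    ("rlk_2", "MSSPAA"),
]
unique_records = remove_duplicate_sequences(records)
print([header for header, _ in unique_records])
```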

The protein sequences of the predicted transcription factors for A. thaliana were downloaded from the Plant Transcription Factor Database (http://planttfdb.cbi.pku.edu.cn/) (Zhang et al. 2011). For identification of the transcription factor-related unigenes from Antarctic moss, the protein sequence set of each predicted A. thaliana transcription factor family was BLASTed against the normalized transcriptome database of P. nutans (E value <10−3).

RT-PCR analysis of the expression of several cold stress-related genes

To validate the accumulation of mRNAs under cold stress conditions, total RNA extracted from control plants and from plants cold-treated for 1, 3, 6, 12 and 24 h was used as the template, and its concentration was measured with a microvolume UV–Vis spectrophotometer. RT-PCRs were performed using the PrimeScript™ RT-PCR Kit (TaKaRa, Dalian, China). The ortholog(s) of CRLK1, MEKK1, MKK2, MPK4, CBF and ICE1 of A. thaliana in P. nutans were selected for RT-PCR analysis (Table S1). The moss β-tubulin gene was used to normalize the template. Gene-specific primers (Table S2) were used to amplify cDNA fragments of the selected genes for 28–29 cycles (95 °C for 30 s; 56 °C for 30 s; 72 °C for 45 s). Amplification products were visualized on 1.2 % agarose (w/v) gels containing ethidium bromide, using a UV light transilluminator (BioSpectrum® Imaging System, Ultra-Violet Products Ltd., California, USA).

Results

Illumina sequencing and assembly of the reads

To better understand the molecular mechanisms by which the Antarctic moss Pohlia nutans adapts to an extreme polar environment, we constructed two Illumina cDNA libraries from the cold-treatment samples and control samples. Illumina RNA sequencing generated 19,574,660 and 17,855,650 94-bp paired-end (PE) raw reads from the cold-treatment tissue and control tissue, respectively. Using the software Velvet and Oases for de novo assembly, all of the reliable reads from the cold-treatment samples and control samples were first assembled into contigs. The resulting contigs were pooled together and then further assembled, resulting in 145,538 scaffolds, including 49,054 scaffolds larger than 500 bp. The scaffold sequences were assembled into clusters that were then analyzed for consensus. Finally, a total of 93,487 non-redundant sequences (unigenes), ranging from 100 to 13,000 bp, were generated (Fig. S1). Following assembly, we analyzed the length distribution of all of the non-redundant sequences. As shown in Fig. S1, although 80 % of the transcripts were between 100 and 500 bp, we still identified 18,688 transcripts longer than 500 bp. These transcripts provide abundant data that will aid in producing a better reference of the stress-relevant genes in plants.

Annotation of the non-redundant sequences

“GetORF” from the EMBOSS software was used to identify the reliable coding sequences (CDSs) of the 93,488 non-redundant sequences and to translate them into protein sequences. Comparison with the GenBank nr and Swiss-Prot databases revealed that 16,781 non-redundant sequences showed significant similarity to known gene sequences in existing species (E value <10−3). Among these known genes, the average encoded protein size was 275 amino acids. In addition, there were 7,783 non-redundant sequences (length >500 bp) that had no significant similarity to any other protein sequence or were annotated as an unknown protein or putative protein in the public databases.

Annotation of the 93,488 non-redundant sequences using the Gene Ontology (GO) database yielded better results, assigning approximately 19,439 sequences to cellular components, 20,294 sequences to molecular functions and 23,659 sequences to biological processes, and showed the genes to be distributed among more than 50 categories, including metabolism, growth, development, apoptosis, and immune defense (Fig. 1). Similarly, the putative proteins annotated with the clusters of eukaryotic orthologous groups of proteins (KOG) were classified functionally into at least 25 molecular families, including cellular structure, biochemistry, metabolism, molecular processing, signal transduction, gene expression, and immune defense, which correspond to the categories observed in the GO analysis (Fig. S2). The results showed that out of the 93,488 non-redundant sequences, 29,828 transcripts had a KOG classification within the 25 categories. Within these categories, the transcripts with unknown function or putative function together (L + M) comprised the largest group of non-redundant sequences, while the transcripts involved in signal transduction mechanisms were the second largest group.

Fig. 1

Histogram presentation of the Gene Ontology classification. The results were summarized into three main categories: biological processes, cellular components and molecular functions. The left y axis indicates the percentage of genes in each specific category

The metabolic pathways of the Antarctic moss non-redundant sequences were constructed based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database using the BBH (bi-directional best hit) method. Among the 93,488 non-redundant sequences, approximately 5,037 transcripts could be grouped into six categories comprising 219 known metabolic or signaling pathways, including cell growth and death, cell communication, development, environmental adaptation, signal transduction, folding, replication and repair (Fig. 2).

Fig. 2

The KEGG categories of the non-redundant consensus sequences. All of the non-redundant consensus sequences were annotated using the KEGG Automatic Annotation Server for pathway information, and approximately 5,037 consensus sequences were annotated. The categories GIP and EIP stand for genetic information processing and environmental information processing, respectively

Differential gene expression analysis of the Antarctic moss after cold treatment

The MARS (MA plot-based method with Random Sampling model) method was used to identify the differentially expressed genes, with a false discovery rate (FDR) ≤0.001 and an absolute value of log2Ratio ≥1 as the thresholds for judging the significance of differential gene expression. In total, 3,796 non-redundant sequences were significantly upregulated, while 1,405 non-redundant sequences were significantly downregulated (Fig. 3). The distribution of the significant changes is illustrated in a volcano plot, where the statistical significance of each transcript is plotted against the fold change (Fig. S3). For each transcript, the ratio of the expression level in the cold-treatment samples to that in the control samples was plotted against the −log error rate. The horizontal green line indicates the significance threshold (FDR ≤0.001), and the two vertical green lines indicate the twofold change threshold (log2(Treat/Control) ≥1 or log2(Treat/Control) ≤−1). The results showed that several of the transcripts with the highest fold changes between the cold-treatment samples and control samples (far right and left of the plot, respectively) also had the smallest false discovery rate values (−log error rate).

Fig. 3

Differential expression analysis of the identified genes. The expression level for each gene was included in the volcano plot. “Not DEGs” indicates genes that were not differentially expressed. The x axis represents the log10 of the transcripts per million of the treated sample, and the y axis indicates the log10 of the transcripts per million of the control sample. The thresholds are P < 0.001 and an absolute value of log2(Treat/Control) ≥1

Functional annotation showed that the differentially regulated genes were involved in many biochemical and physiological processes, such as signal transduction (receptor-like kinases, the mitogen-activated protein kinase family and phosphatases), transcription (AP2, NAC, bHLH, MYB and zinc finger transcription factors), transport (ABC transporter, vacuolar malate transmembrane transporter, tonoplast dicarboxylate transporter and sugar transporter), protein degradation (ubiquitin-protein ligase and F-box family protein) and chemical modification (cytochrome P450 monooxygenase, cytochrome P450 reductase and glycosyltransferase). Furthermore, other functional genes were also up- or down-regulated after cold stress, including the genes encoding late embryogenesis abundant (LEA) proteins, S-adenosylmethionine decarboxylase and trehalose 6-phosphate synthase. Cold treatment usually causes oxidative stress and the accumulation of reactive oxygen species (ROS) in plants. Correspondingly, some antioxidant enzymes were up-regulated after cold stress, such as thylakoid-bound ascorbate peroxidase, peroxidase and oxidoreductase. Chloroplast omega-6 fatty acid desaturase, microsomal delta-5 desaturase and delta-9 fatty acid desaturase were up-regulated, which might be responsible for increasing the unsaturated fatty acid content of membranes under cold stress. These representative cold-responsive genes of the Antarctic moss P. nutans were classified according to BLAST annotation, as shown in Table 1.

Table 1 Representative cold-responsive genes of Antarctic moss P. nutans

Identification of receptor-like kinases and transcription factors

BLAST was used to search the normalized Antarctic moss transcriptome database, and a total of 816 transcripts encoding putative receptor-like kinases were identified in the transcriptome of P. nutans, using the A. thaliana receptor-like kinase sequences as queries (E value <10−3). Further annotation showed that these proteins include subfamilies such as leucine-rich repeat receptor-like kinases, proline-rich receptor kinases, lectin-rich receptor kinases, mitogen-activated protein kinases, and Ca2+ and calmodulin-dependent protein kinases. Differential gene expression analysis showed that there were 42 upregulated and 33 downregulated putative receptor-like kinases in P. nutans after cold treatment (Table 1). These differentially expressed genes accounted for 9.2 % of the total identified receptor-like kinases.

In the P. nutans transcriptome, 1,309 transcripts encoding putative transcription factors were identified using A. thaliana transcription factors as queries, comprising 1.4 % (1,309/93,487) of the total non-redundant sequences (Fig. S3). The target transcripts were further classified into 52 corresponding transcription factor families. Differential gene expression analysis showed that 106 transcription factors were significantly upregulated, including those in the AP2/ERF (20 transcripts), WRKY (21 transcripts), C2H2 (7 transcripts), NAC (6 transcripts) and bHLH (6 transcripts) families, while 63 transcription factors were significantly downregulated, including those in the WRKY (13 transcripts), NAC (6 transcripts), C2H2 (6 transcripts) and bHLH (6 transcripts) families (Table 1). These differentially expressed genes accounted for 12.9 % of the total identified transcription factors.
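The proportions reported in this section follow directly from the counts given in the text, as the short check below illustrates. The helper `percent` is a hypothetical name introduced only for this example.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Receptor-like kinases: 42 upregulated + 33 downregulated out of 816.
rlk_de = percent(42 + 33, 816)

# Transcription factors: 1,309 out of 93,487 non-redundant sequences.
tf_share = percent(1309, 93487)

# Differentially expressed transcription factors: 106 up + 63 down out of 1,309.
tf_de = percent(106 + 63, 1309)

print(rlk_de, tf_share, tf_de)
```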

RT-PCR analysis of the expression of several cold stress-related genes in P. nutans

Based on previously published reports, we selected orthologs of the CRLK1, MEKK1, MKK2, MPK4, CBF and ICE1 genes in P. nutans to analyze their expression levels under cold stress and to validate the reliability of the Solexa/Illumina sequencing data. The expression levels of these genes in the P. nutans transcriptome sequencing and their percent identity with the orthologs in A. thaliana are listed in Table S1. RT-PCR results showed that the expression levels of one MEKK1 ortholog (MFC03145), one MKK2 ortholog (MFC88317), three CBF orthologs (MFC40570, MFC14588, MFC00537) and one ICE1 ortholog (MFC41806) were significantly increased after cold stress (Fig. 4). However, the expression of the selected CRLK1 and MPK4 orthologs in P. nutans was not increased after cold stress (Fig. S5). These results showed that the expression of these fifteen selected genes was largely consistent between the RT-PCR and the transcriptome differential gene expression analysis.

Fig. 4

RT-PCR analysis of MEKK1, MKK2, CBF and ICE1 orthologs in the Antarctic moss P. nutans after cold stress. MFC03145 is a MEKK1 ortholog; MFC88317 is a MKK2 ortholog; MFC40570, MFC14588 and MFC00537 are CBF orthologs; MFC41806 is an ICE1 ortholog. β-Tubulin was used to normalize the template

Discussion

Antarctic mosses are capable of growing under extreme climate conditions, which include extreme cold, aridity and a photoperiod that ranges from continuous sunlight in the summer to continuous darkness in the winter (Skotnicki et al. 2000). These plants offer exceptional opportunities for gaining novel insights into the mechanisms of plant survival under extreme conditions (Peck et al. 2005). The transcriptome is the complete repertoire of expressed RNA transcripts in a cell, and its characterization is essential in deciphering the functional complexity of the genome and in obtaining a better understanding of the cellular activities in organisms, including growth, development, immune defense and stress response (Xiang et al. 2010). The Solexa/Illumina RNA-seq deep sequencing approaches have overcome many of the inherent barriers of the traditional EST approach, making the detection of alternative splicing events and low-abundance transcripts possible (Xiang et al. 2010; Wang et al. 2010a, b). Recently, these platforms have been applied to several species, such as the Monterey pine (Pinus radiata), the cucumber (Cucumis sativus L.), sesame (Sesamum indicum L.) and the moss P. patens, for different purposes (Li et al. 2011; Wei et al. 2011; Qi et al. 2012; Xiao et al. 2011).

In this study, transcriptome profile sequencing analysis of cold shock-treated Antarctic mosses was conducted using the Illumina platform. A large data set of 93,488 non-redundant consensus sequences, including 18,688 sequences longer than 500 bp, with complete or partial coding regions was generated (Fig. S1). Although a high number of transcripts were short, which can result in several assembled contigs and singletons for each gene (Hsiao et al. 2011), this consensus sequence dataset should reflect the sophisticated organization of Antarctic moss that favors growth, reproduction and adaptation to the polar environment. Homology searches revealed that 16,781 consensus sequences had significant similarities with known gene sequences in existing species. In addition, there were 7,783 consensus sequences (length >500 bp) that had no significant similarities to any other protein sequences or were annotated as “unknown protein” or “putative protein” in the public databases. Together, these non-redundant consensus sequences nearly equaled the number of clusters found in the model plant P. patens subsp. patens (Rensing et al. 2008). Subsequently, when we set a higher threshold (sequence length >1000 bp and raw reads >100) to screen the genes annotated as unknown protein or putative protein, 1,793 genes matched these conditions in the Antarctic moss transcriptome. These novel moss-specific genes may perform specific roles, such as ecological strategies for environmental adaptation, and may be quite divergent from those of other plant species.

Comparing the sequencing abundance (raw reads) of the functional genes between the treatment and control samples provided an estimate of the expression levels of the genes responding to stress. The results showed that the cold treatment resulted in large alterations of the transcriptome profile, including significant upregulation of 3,796 transcripts and downregulation of 1,405 transcripts (Fig. 3). These numbers are of the same order as those reported for the model moss P. patens, in which 4,129 transcripts were identified as differentially expressed or as developmental-stage-specific genes by the Illumina digital gene expression tag profiling technique (Xiao et al. 2011). Previously, proteomic analysis (2-DE and mass spectrometry) was used to study the cold stress response in the moss P. patens, and 45 altered proteins were identified, involved in energy metabolism, signaling, the cytoskeleton and defense. Our results also showed that the cold response in Antarctic moss is multifaceted, including cold sensing and signal transduction, alterations of metabolism and accumulation of defense proteins (Table S1) (Wang et al. 2009).

Receptor-like kinases play many critical roles in plant development and physiology, including developmental patterning, hormone signaling, disease resistance and stress responses (Lehti-Shiu et al. 2009; Gish and Clark 2011). In the transcriptome of the Antarctic moss P. nutans, differential gene expression analysis showed that 42 receptor-like kinases were upregulated and 33 receptor-like kinases were downregulated after the cold treatment, including leucine-rich repeat receptor-like kinases, mitogen-activated protein kinases, and Ca2+ and calmodulin-dependent protein kinases (Table 1). Furthermore, a recent study showed that a novel calcium/calmodulin-regulated receptor-like kinase (CRLK1), mainly localized in the plasma membrane, might mediate cold sensing and signal transduction in A. thaliana (Yang et al. 2010). Using CRLK1 (NP_568809.2) as the query in a BLASTP search against the P. nutans transcriptome protein data, we identified a full-length CRLK1 ortholog in P. nutans (MFC60113). The amino acid sequence deduced from MFC60113 shared 57.9 % identity with CRLK1 of A. thaliana, and 81.1 % and 83.1 % identity with the CRLK1 orthologs of P. patens subsp. patens (XP_001753295.1 and XP_001755574.), respectively. However, the expression level of the CRLK1 ortholog in P. nutans was not increased after cold stress treatment (Fig. S5, Table S1). In plants, mitogen-activated protein kinase signaling under abiotic stress has been comprehensively studied. Plants overexpressing MEKK1, MKK2 or MPK4 exhibited enhanced tolerance to cold and salt stress (Rodriguez et al. 2010). In our RT-PCR analysis, one MEKK1 gene (MFC03145) and one MKK2 gene (MFC88317) were significantly up-regulated after cold stress in P. nutans. However, the expression of the other selected MEKK1, MKK2 and MPK4 genes did not change (Fig. 4; Fig. S5, Table S1).

Transcription factors exhibit sequence-specific DNA binding and are capable of activating or repressing the transcription of downstream target genes. In this study, we identified 1,309 unigenes encoding putative transcription factors, corresponding to 52 subfamilies and comprising 1.4 % (1,309/93,487) of the unigenes of the P. nutans transcriptome (Fig. S3). Compared to the 5.7 % of plant genes that have been shown to be transcription factor genes (Libault et al. 2009), the 1.4 % of genes associated with transcription factors in P. nutans is low. However, in the genome of the model plant P. patens subsp. patens, 1,188 genes were found to encode transcription factors and were classified into 53 subfamilies (Zhang et al. 2011). This finding suggests that the low proportion of transcription factor genes in P. nutans could be attributed to the relatively short reads per sequencing reaction, which left the assembled unigenes unable to cover the full length of the transcripts and caused some of them to represent untranslated regions. Nevertheless, the number of transcription factors in P. nutans (1,309) appears to be similar to that in P. patens (1,188) (Fig. S3). The transcription factors in the CBF family, which exist in multiple plant species, are the key regulators of the cold-responsive genes. Constitutive overexpression of CBF1, CBF2 or CBF3 in A. thaliana results in expression of the CBF regulon and brings about an increase in freezing tolerance without a cold stimulus, indicating that the CBF regulon has a fundamental role in cold acclimation (Zhou et al. 2011). The upstream regulator of the CBFs, named inducer of CBF expression (ICE), acts as a positive regulator of the CBFs. In the Antarctic moss P. nutans, the expression of three CBF orthologs (MFC40570, MFC14588, MFC00537) and one ICE1 ortholog (MFC41806) was significantly increased after cold stress in the RT-PCR analysis (Fig. 4). Therefore, cold acclimation in the Antarctic moss P. nutans may also be mediated mainly in a CBF transcription factor-dependent manner.

It has been demonstrated that factors such as the accumulation of soluble sugars, unsaturated fatty acids and antioxidants also play important roles in alleviating freezing-induced cellular damage (Sun et al. 2007; Bhyan et al. 2012). A recent study showed that soluble sugars are involved in responses to stress and act as signaling molecules that activate specific or hormone cross-talk transduction pathways (Ramel et al. 2009). They are distributed through the plant via sugar transporters, which are involved not only in long-distance sugar transport via the loading and unloading of the conducting complex, but also in sugar allocation into source and sink cells (Afoufa-Bastien et al. 2010). In the P. nutans transcriptome, we identified a total of 63 sugar transporter candidates, including monosaccharide transporters, glucose transporters and hexose transporters. Differential gene expression analysis showed that 16 sugar transporters were upregulated after cold stress (Table 1). Meanwhile, it has been demonstrated that cells generally tend to increase the amount of unsaturated fatty acid lipids to enhance membrane fluidity. Acclimation to low-temperature stress via an increase in the expression of desaturases has been documented in bacteria, algae, plants, and animals (Morgan-Kiss et al. 2006). In the present study, a total of 16 unigenes in the Antarctic moss transcriptome were identified as possible members of the fatty acid desaturase family, including omega-3, omega-6, delta-9 and delta-12 desaturases. Differential gene expression analysis showed that 6 fatty acid desaturases were significantly upregulated after cold stress (Table 1). Together, these differentially expressed transcripts of sugar transporters, fatty acid desaturases and oxidoreductases may indicate that sugars, unsaturated fatty acids and antioxidants also function in P. nutans cold acclimation.

LEA proteins accumulate in vegetative plant tissues after exposure to environmental stresses, and these proteins are thought to play a role in freezing tolerance (Hundertmark and Hincha 2008). Cold acclimation in bryophytes is also accompanied by the accumulation of several transcripts for LEA proteins (Minami et al. 2005). We identified a total of 21 LEA proteins in the P. nutans transcriptome, and 9 of them were significantly upregulated after cold stress (Table 1). Polyamines are present in all living organisms and are implicated in a wide range of cellular physiological processes. Overexpression of the carnation S-adenosylmethionine decarboxylase gene generates a broad-spectrum tolerance to abiotic stresses in transgenic tobacco plants (Wi et al. 2006). We also identified two S-adenosylmethionine decarboxylase genes in the P. nutans transcriptome, and both of them were significantly upregulated after cold stress (Table 1). Moreover, the disaccharide trehalose is considered a universal stress molecule, protecting cells and biomolecules from injuries imposed by many abiotic stresses, including freezing, salinity, oxidation and desiccation (Tibbett et al. 2002; Reina-Bueno et al. 2012). Similarly, in the P. nutans transcriptome, we identified a total of 20 trehalose-phosphate synthase genes, and more than 11 of them were upregulated after cold stress, suggesting an important role for trehalose in the acclimatization of Antarctic moss to the extreme environment (Table 1). Furthermore, cytochrome P450 monooxygenase (hydroxylase) together with cytochrome P450 reductase participates in a variety of biochemical pathways to produce primary and secondary metabolites such as phenylpropanoids, alkaloids, terpenoids, lipids, and glucosinolates, as well as plant hormones (Mizutani and Sato 2011). In the model plant P. patens, proteomic analysis found that cytochrome P450 protein and NADH-cytochrome P450 reductase were up-regulated after cold stress. In the Antarctic moss P. nutans, we also found that cytochrome P450 protein and NADH-cytochrome P450 reductase are regulated by cold stress (Table 1).

Taken together, these data obtained from annotated or known genes suggest that Antarctic moss adopts biochemical and genetic processes similar to those of higher plants in response to cold stress. However, the Antarctic moss transcriptome comprised a relatively large proportion of genes annotated as unknown protein or putative protein, and several of these genes were also significantly regulated by cold stress. Therefore, cold acclimation in Antarctic moss may also differ in distinct ways from that in higher plants, which suggests a significant alteration during the evolution of land plants. Moreover, five up-regulated moss-specific genes in the Antarctic moss transcriptome have been further confirmed by real-time PCR and are being verified in transgenic Arabidopsis. We believe that these results will help to clarify the genetic mechanisms by which Antarctic moss acclimatizes and adapts to the rigorous climate of the Antarctic continent and will accelerate the practical exploitation of Antarctic moss gene resources for improving stress tolerance in agricultural crops.