Background

Colorectal cancer (CRC), which is the second most common cause of cancer-related death worldwide, places a major economic burden on global health care systems [1, 2]. The liver is the most frequent site of metastases from colorectal cancer, and approximately 50 % of patients with advanced CRC already suffer from or will develop colorectal liver metastasis (CLM) during the disease course [3].

In the clinic, although CRC patients initially lacking liver metastases undergo R0 resection of the primary tumor, some of these patients ultimately develop CLM [4], whereas others never experience CLM, even when they have the same pathologic stage and receive the same therapy [5]. Therefore, additional prognostic markers are urgently needed to indicate the potential for CLM at the time of diagnosis and to guide therapeutic decisions in adjuvant settings, even during the early stages of CRC. Additionally, the diagnosis of and the therapeutic strategies for CRC relapse and metastasis are primarily based on the tumor-node-metastasis (TNM) staging criteria [6]. However, this method consistently results in misdiagnoses because it ignores the molecular mechanisms underlying cancer progression [7].

In recent years, advances in genome-wide analyses of the eukaryotic transcriptome have revealed that most of the human genome is pervasively transcribed, producing a large quantity of long noncoding RNAs (lncRNAs) mapped to intronic and intergenic regions [8]. The length of lncRNAs ranges from approximately 200 nt to greater than 100 kb [8]. As a newly discovered type of regulatory RNA molecule, lncRNAs may play roles in regulating gene expression in various processes of epigenetics, transcription, posttranscription, and translation during the development of cancer [9]. Increasing evidence has indicated that lncRNAs, such as HOX transcript antisense RNA (HOTAIR) [11], highly upregulated in liver cancer (HULC) [12], and H19 (imprinted maternally expressed transcript) [13], play important roles in CRC [10]. LncRNAs were reported to be abnormally expressed in CRC tissues and to be associated with cell proliferation, apoptosis, cell cycle distribution, and metastasis [14, 15]. For instance, CRC-associated transcript 1 (CCAT1) was reported to be highly expressed in CRC, especially in metastatic tissue [16], and low lncRNA-LOC285194 expression was shown to significantly correlate with distant metastases in patients with CRC [17]. However, no map of the extensive and choreographed transcription of lncRNAs in CRC or CLM has been generated. Moreover, in previous studies, patients with metastatic CRC were only roughly stratified into CLM and non-CLM subgroups, without further subdivision of the CLM patient population.

In this study, we profiled the lncRNA expression levels of CRC samples with various liver metastases using a human lncRNA array. The differentially expressed lncRNAs were identified based on comparisons between different tissues. In addition, a quantitative reverse-transcription polymerase chain reaction (qRT-PCR) assay was performed to identify target lncRNAs in expanded CRC samples with various liver metastases.

Methods

Patient samples and study design

The patients were eligible for inclusion if their primary tumors had been resected and if they suffered from histologically confirmed colorectal adenocarcinoma based on the pTNM classification as stage III or IV according to the National Comprehensive Cancer Network guidelines. Other eligibility criteria were age >18 and <75 years and RNA extracted from each sample displaying no evidence of degradation or DNA contamination. The patients were excluded if they had previously been exposed to any targeted therapy, chemotherapy, radiotherapy, or intervention therapy for CRC.

The 264 tissue samples from the patients who met the eligibility criteria were collected at the Zhongshan Hospital of Fudan University between January 2009 and July 2012. This study was approved by the local ethics committee. All samples were frozen in liquid nitrogen, stored at −80 °C, and allocated to three phases of further analysis in chronological order (Fig. 1).

Fig. 1

The study design. CLM colorectal liver metastasis, CRC colorectal cancer, NCR normal colorectum, NLM CRC with no liver metastasis, MLM CRC with metachronous liver metastasis, SLM CRC with synchronous liver metastasis, qRT-PCR quantitative reverse-transcription polymerase chain reaction, and lncRNA long noncoding RNA

  1. Discovery phase. A total of 18 CRC tissues and 6 normal colorectal tissues were used for lncRNA microarray analysis. Moreover, the tumor tissues were equally categorized into three groups [18]: non-liver metastasis (NLM, defined as no liver metastases occurring for at least 2 years after the primary diagnosis of CRC), synchronous liver metastasis (SLM, defined as liver metastases occurring within 6 months of the primary diagnosis of CRC), and metachronous liver metastasis (MLM, defined as liver metastases occurring beyond 6 months after the primary diagnosis of CRC). DiffGene analysis (t test and significance analysis of microarrays) was performed to identify the differentially expressed lncRNAs based on four pairwise comparisons (the former defined as the experimental arm and the latter defined as the control arm): SLM versus NLM, MLM versus NLM, SLM versus MLM, and CRC tissue versus paired normal tissue from SLM. The differentially expressed lncRNAs that displayed a >1.5-fold change in expression and a P value <0.01 based on at least two pairwise comparisons with the same trend were selected for further examination in the training phase.

  2. Training phase. A total of 40 lncRNAs identified via microarray were initially analyzed via qRT-PCR in an independent cohort of paired tumor and normal tissue samples from 30 patients (10 NLM, 10 SLM, and 10 MLM). Only the lncRNAs displaying fold changes in expression that corresponded to the microarray results were selected for further analysis in the validation phase.

  3. Validation phase. A total of three novel lncRNAs (termed lncRNA-CLMAT1-3) were identified from the training phase, and lncRNA-CLMAT3 was selected for further analysis in an independent cohort of paired tumor and normal tissue samples from 90 CRC patients with or without liver metastases. The overall survival duration (OS) was calculated from the date of the confirmation of the CRC diagnosis to the date of death resulting from any cause or the date of the most recent follow-up, at which point the data were censored.
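The grouping rule used in the discovery phase (NLM, SLM, and MLM) can be written as a small decision function. The following Python sketch is illustrative only: the function name, its arguments, the handling of the exact 6-month boundary, and the "unclassified" label for patients followed for less than 2 years are assumptions and are not taken from the study protocol.

```python
from typing import Optional


def classify_liver_metastasis(months_to_lm: Optional[float],
                              followup_months: float) -> str:
    """Assign a patient to the SLM, MLM, or NLM group.

    months_to_lm: months from the primary CRC diagnosis to the first liver
        metastasis, or None if no liver metastasis has been observed.
    followup_months: total follow-up since the primary diagnosis.
    """
    if months_to_lm is not None:
        # "Within 6 months" is treated as <= 6 here (boundary is an assumption).
        return "SLM" if months_to_lm <= 6 else "MLM"
    # NLM requires at least 2 years without liver metastasis.
    if followup_months >= 24:
        return "NLM"
    # Hypothetical label: follow-up is too short to call the patient NLM.
    return "unclassified"
```

For example, a patient whose liver metastasis appeared 12 months after diagnosis would be assigned to the MLM group, whereas a patient with no liver metastasis after 30 months of follow-up would be assigned to the NLM group.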

lncRNA microarray and qRT-PCR

Total RNA was extracted from frozen tissues using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. An Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) was used to quantify the RNA and to evaluate its integrity. RNA displaying no evidence of degradation or DNA contamination (as indicated by an RNA integrity value ≥7.0 and a 28S:18S ratio ≥0.7) was processed for further analysis.

The lncRNA expression profiles were obtained using the Glue Grant Human Transcriptome Array, which was manufactured by Affymetrix and Stanford University. This microarray contained probes for 563,097 noncoding RNAs covering 50,783 coding genes, including 4408 lncRNAs, collected from the RefSeq, Ensembl and UCSC Known Genes databases based on human genome assembly hg18. After washing the slides, the arrays were scanned using the GeneChip® Scanner 3000. The raw data were obtained using Command Console Software 3.1 according to the default settings and were processed using Affymetrix Power Tools with robust multi-array analysis for background correction, normalization, and summarization.

The extracted RNA samples were used to synthesize first-strand cDNA using the SuperScript III First-Strand cDNA Synthesis Kit (Invitrogen). qRT-PCR analysis was performed using the Applied Biosystems (ABI) Prism 7900 Real-Time PCR system (ABI, USA), FastStart Universal SYBR Green Master Mix (Roche), and the primers presented in Supplemental Table S1. The normalized relative gene expression levels were calculated according to the standard ΔΔCt method using ABI RQ Manager Software (v1.2). Each qRT-PCR assay was performed three separate times, with technical triplicates in each run.
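For readers unfamiliar with the ΔΔCt calculation, a minimal Python sketch follows. It is not the ABI RQ Manager implementation. It assumes an amplification efficiency of 100 % (a doubling per cycle), a reference gene such as GAPDH, and the paired normal tissue as the calibrator; the function and argument names are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calib, ct_ref_calib):
    """Relative expression by the delta-delta-Ct method (2 ** -ddCt).

    Each argument is a list of Ct values from technical replicates:
    target lncRNA and reference gene in the sample (e.g., tumor) and in
    the calibrator (e.g., paired normal tissue).
    """
    # Delta Ct: target normalized to the reference gene, per tissue.
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calib = mean(ct_target_calib) - mean(ct_ref_calib)
    # Delta-delta Ct: sample relative to calibrator.
    ddct = delta_sample - delta_calib
    # Assumes perfect doubling per PCR cycle.
    return 2.0 ** (-ddct)
```

With these assumptions, a target that crosses the threshold one cycle earlier in the tumor than in the normal tissue (at equal reference-gene Ct values) corresponds to a twofold higher relative expression.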

Statistical analyses

The data were expressed as the means ± standard error of the mean. Categorical data were compared using the chi-square test, and continuous data were compared between two groups using an independent t test. For comparisons of more than two groups, ANOVA was used, and further pairwise comparisons were performed using the Bonferroni correction. Survival functions were estimated using the Kaplan–Meier method and compared using the log-rank test. The variables that displayed significance at P < 0.05 in univariate analysis were entered into a multivariate Cox regression analysis (Wald forward). SPSS software (version 16.0; SPSS, Chicago, IL, USA) was used for the statistical analyses. A P value <0.05 was considered to be significant.
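The Kaplan–Meier estimate mentioned above multiplies, at each distinct death time, the fraction of at-risk patients who survive that time. The analyses in this study were run in SPSS; the plain-Python sketch below only illustrates the estimator, and its function name and input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up durations (e.g., months).
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the curve themselves, which matches the censoring rule described for OS.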

Results

Patient characteristics

No significant differences in the baseline characteristics were observed among the three subgroups (NLM, SLM, and MLM) during the discovery or training phase (Supplemental Tables S2 and S3). Additionally, no significant differences in the distribution of age, sex, or tumor location or size were observed between the two subgroups during the validation phase (Table 1).

Table 1 Association of lncRNA-CLMAT3 with clinic-pathological characteristics

Discovery phase (lncRNA screening and testing)

To confirm the accuracy of the microarray screen, the expression of five randomly selected lncRNAs was analyzed via qRT-PCR. The results confirmed that the relative expression levels were highly consistent with those demonstrated by microarray analysis (Supplemental Figure S1).

Hierarchical clustering showed systematic variations in the expression of lncRNAs in the CLM samples (Supplemental Figure S2). Compared with the normal colorectal tissues, a total of 270 lncRNAs displayed differential expression (P < 0.05) in tumor tissues from the SLM patients, including 180 upregulated lncRNAs and 90 downregulated lncRNAs. Compared with the NLM group, 948 (790 upregulated and 158 downregulated) lncRNAs in the SLM group and 320 (273 upregulated and 47 downregulated) lncRNAs in the MLM group were significantly differentially expressed in the tumor tissues. Compared with the tumor tissue from the MLM patients, 466 lncRNAs, including 359 upregulated lncRNAs and 107 downregulated lncRNAs, were significantly differentially expressed (P < 0.05) in the tumor tissue from the SLM patients.

Furthermore, 293 lncRNAs displayed a >1.5-fold change in expression and a P value <0.01 among the four pairwise comparisons; 40 lncRNAs met these criteria in more than one pairwise comparison and were selected for further analysis in the training phase.
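The selection rule (a >1.5-fold change and P < 0.01 in at least two pairwise comparisons with the same trend) can be expressed as a simple filter. The Python sketch below is an illustration rather than the DiffGene pipeline. It assumes that fold change is reported as an experimental/control ratio, so that downregulation corresponds to a ratio below 1/1.5; the function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def select_candidates(results, min_fold=1.5, max_p=0.01, min_comparisons=2):
    """Select lncRNAs that pass the thresholds in enough comparisons.

    results: iterable of (lncrna_id, comparison, fold_change, p_value),
        where fold_change is experimental/control (>1 up, <1 down).
    Returns the sorted IDs that pass in at least `min_comparisons`
    comparisons with the same direction of change.
    """
    up = defaultdict(set)
    down = defaultdict(set)
    for lncrna_id, comparison, fold_change, p_value in results:
        if p_value >= max_p:
            continue
        if fold_change > min_fold:
            up[lncrna_id].add(comparison)
        elif fold_change < 1.0 / min_fold:
            down[lncrna_id].add(comparison)
    selected = {i for i, c in up.items() if len(c) >= min_comparisons}
    selected |= {i for i, c in down.items() if len(c) >= min_comparisons}
    return sorted(selected)
```

Tracking upregulated and downregulated hits separately enforces the "same trend" requirement: an lncRNA that is up in one comparison and down in another is not counted twice.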

Training phase (verification of the microarray results)

Based on qRT-PCR analysis of these 40 lncRNAs in the expanded clinical samples, we found that the fold changes in the expression of three lncRNAs (TR140014124, TR01015341, and TR05005298) were consistent with the microarray results (Fig. 2). The expression levels of all three lncRNAs were significantly upregulated in the NLM tissues compared with the normal colorectal tissues and were significantly higher in the SLM tissues than in the NLM tissues. Furthermore, both TR01015341 and TR140014124 showed higher expression in the SLM samples than in the MLM samples (P = 0.048 and P = 0.03, respectively) and differed significantly among the three groups (NLM, MLM, and SLM) based on one-way ANOVA (P = 0.02 and P = 0.002, respectively). The above results demonstrated that TR140014124, TR01015341, and TR05005298 were upregulated in the CRC tumor tissue, implying that these three lncRNAs may play a role in CLM progression.

Fig. 2

The relative expression levels of TR140014124, TR01015341, and TR05005298 were compared between the CRC and normal tissue samples or between CRC tissues of varying liver metastatic status via qRT-PCR. a, c, e The relative expression levels of TR140014124, TR01015341, and TR05005298, respectively, were significantly higher in the tumor tissues than in the normal tissues. b, d, f The expression levels of TR140014124, TR01015341, and TR05005298, respectively, in the SLM and MLM tissues were significantly higher than those in the NLM tissues. *P < 0.05; **P < 0.01. CRC colorectal cancer, NLM CRC with no liver metastasis, MLM CRC with metachronous liver metastasis, and SLM CRC with synchronous liver metastasis

The TR140014124, TR01015341, and TR05005298 sequences were analyzed using the NCBI (http://www.ncbi.nlm.nih.gov/) and UCSC (http://www.genome.ucsc.edu/) blast genome databases. In addition, a relevant lncRNA database was searched for further analysis. All of these searches indicated that these three lncRNA sequences have not been officially reported previously (data not shown). Considering that these three lncRNAs were identified for the first time in CLM tissue samples via microarray screening, we termed these genes lncRNA-CLM-associated transcript 1–3 (CLMAT1-3) according to the method used to name previous genes, such as metastasis-associated lung adenocarcinoma transcript 1 (MALAT-1) [19] and CCAT-1 [16].

Validation phase (clinical value of the lncRNA-CLMATs)

We focused on lncRNA-CLMAT3, located on human chromosome 14 (chr14:101379770-101381326, hg19), because it was the most strongly upregulated lncRNA in the CRC tissues compared with the normal tissues (Fig. 2), displaying a tumor tissue/normal tissue expression ratio of 2.21 and 2.10 in the SLM and MLM groups, respectively.

To assess the correlation of lncRNA-CLMAT3 expression with the clinic-pathological characteristics, lncRNA-CLMAT3 expression in 90 CRC and normal colorectal tissues was further analyzed via qRT-PCR. The levels of lncRNA-CLMAT3 in the cancerous tissues were 2.26-fold higher than those in the normal colorectal tissues (P < 0.05, Fig. 3a). In addition, lncRNA-CLMAT3 expression was significantly higher in the patients with liver metastasis than in those without liver metastasis and in the patients with lymph node metastasis than in those without lymph node metastasis (Fig. 3b, c). According to the lncRNA-CLMAT3/glyceraldehyde 3-phosphate dehydrogenase (GAPDH) expression ratio in the cancerous tissues, these cases were stratified into a high lncRNA-CLMAT3 expression group (n = 45) and a low lncRNA-CLMAT3 expression group (n = 45). We found that high lncRNA-CLMAT3 expression was significantly associated with liver metastasis and lymph node metastasis. These results indicated that high lncRNA-CLMAT3 expression is related to CRC progression (Table 1).
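The text does not state the cutoff used to form the high and low expression groups; the equal group sizes (n = 45 each) are consistent with a median split, which is assumed in the following Python sketch. The function name and the dictionary input are hypothetical.

```python
from statistics import median


def split_by_median(ratios):
    """Split patients into high and low expression groups.

    ratios: dict mapping patient ID to the lncRNA/GAPDH expression ratio.
    Returns (high_ids, low_ids); values equal to the median go to "low".
    """
    cutoff = median(ratios.values())
    high = sorted(pid for pid, r in ratios.items() if r > cutoff)
    low = sorted(pid for pid, r in ratios.items() if r <= cutoff)
    return high, low
```

With an even number of patients and no tied values at the cutoff, this yields two groups of equal size, as in the 45/45 split reported here.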

Fig. 3

The lncRNA-CLMAT3 expression levels in 90 paired CRC and normal tissue samples were analyzed via a qRT-PCR assay. a The expression of lncRNA-CLMAT3 was significantly higher in the tumor tissues than in the normal tissues. b The lncRNA-CLMAT3 expression levels were significantly higher in the tumors with liver metastasis than in those without liver metastasis. c The expression of lncRNA-CLMAT3 was significantly higher in the samples with lymph node metastasis than in those without lymph node metastasis. d The patients displaying low lncRNA-CLMAT3 expression exhibited longer OS than those displaying high lncRNA-CLMAT3 expression. *P < 0.05; **P < 0.01. LM liver metastasis and lncRNA-CLMAT3 colorectal liver metastasis-associated transcript 3

We further investigated the association of lncRNA-CLMAT3 expression with OS in patients to evaluate its prognostic value. The median OS (MST) for all patients was 35.2 months (95 % confidence interval 31.0–39.4 months). The patients with tumors displaying high lncRNA-CLMAT3 expression exhibited shorter OS than the patients with tumors displaying low lncRNA-CLMAT3 expression (Fig. 3d, MST 30.7 vs. 35.2 months, P = 0.007). Furthermore, univariate analysis revealed that liver metastasis (P = 0.012) and lncRNA-CLMAT3 expression (P = 0.008) were prognostic indicators of disease-specific survival. After including the variables (i.e., lncRNA-CLMAT3 high/low expression, gender, age, lesion location, tumor size, histological grade, tumor invasion, lymph node metastasis, CEA, liver metastasis, and postoperative therapy) in the Cox regression model, multivariate analysis showed that lncRNA-CLMAT3 expression (P = 0.02) and liver metastasis (P = 0.04) were independent prognostic indicators of survival for patients with CRC (Table 2).
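A comparison of survival between the high and low expression groups of the kind reported above is typically made with a two-group log-rank test. The Python sketch below shows the standard form of that test; it is not the SPSS routine used in the study, and the function name and list-based inputs are assumptions. The P value is taken from the chi-square distribution with 1 degree of freedom, computed with `math.erfc`.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: lists of follow-up durations; events_*: lists with 1 for an
    observed death and 0 for a censored observation.
    Returns (chi-square statistic, P value) with 1 degree of freedom.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in death_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of deaths in group A at time t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

The test compares the observed number of deaths in one group with the number expected if both groups shared the same hazard at every death time.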

Table 2 Multivariate analysis of the clinic-pathological characteristics associated with overall survival duration in the validation phase

Discussion

Although there have been multiple studies of the mRNA or microRNA transcriptome in CRC or CLM [20], this is the first study that describes the expression profiles of human lncRNAs in CLM based on microarray analysis. Considering that SLM confers a significantly poorer prognosis than MLM [18], we subdivided the CRC patients into NLM, MLM, and SLM groups, which helped to identify more meaningful lncRNAs as biomarkers for CLM. Additionally, as demonstrated in the discovery phase of this study, the comparison between the SLM and MLM groups revealed significantly fewer differentially expressed lncRNAs than the comparison between the SLM and NLM subgroups (466 vs. 948, P < 0.001). This finding indicated a relationship between the MLM and SLM subgroups regarding the primary tumor at a molecular level.

In the examination of potential lncRNAs as molecular markers for cancer, traditional studies have been limited to the cellular level or have depended on isolated tissue verification. This study took full advantage of the preliminary sample bank and allocated the tissue samples to three phases in chronological order, from lncRNA screening to target lncRNA verification, which paved the way for the analysis of CLM-associated lncRNAs. Finally, we are particularly interested in lncRNA-CLMAT3 because its over-expression displays clinical value. To our knowledge, this is the first report on the dysregulated expression pattern of lncRNA-CLMAT3 in CRC. More importantly, we found that the lncRNA-CLMAT3 expression level was significantly associated with liver metastasis and survival among patients with CRC. Specifically, patients with high lncRNA-CLMAT3 expression levels exhibited poor prognosis.

lncRNAs may function as tumor suppressors and oncogenes in cancer in a manner that is identical to protein-coding genes [21, 22]. In the present study, we found that CRC patients displaying high expression levels of lncRNA-CLMAT3 more frequently developed liver metastases than those with low lncRNA-CLMAT3 expression levels, indicating that lncRNA-CLMAT3 may act as an oncogene.

Emerging data strongly demonstrate that lncRNAs promote cancer progression primarily by regulating protein-coding genes [23]. For example, the downregulation of lncRNA-MALAT-1 attenuated Wnt/β-catenin signaling, thereby inhibiting CRC invasion and metastasis [24]; lncRNA-CCAT2, reported to promote CRC cell growth and metastasis, may regulate Myc in CRC pathogenesis [25]. Moreover, some lncRNAs could function via the regulation of neighboring protein-coding genes (e.g., lncRNA-Evf-2 transcribed from the ultra-conserved Dlx-5/6 region functions as a transcriptional coactivator of Dlx-2 [26]). Based on analysis using the UCSC genome database (hg19), we further found that there was one protein-coding gene, secreted protein acidic and rich in cysteine (SPARC) [27], located in the antisense strand adjacent to lncRNA-CLMAT3 (Supplemental Figure S3). It was reported that SPARC is over-expressed in highly metastatic tumors (e.g., endometrial cancer, melanoma, glioblastomas, prostate cancer, and breast cancer) and promotes tumor metastasis [27–29]. Thus, in combination with the results of our study, these findings led to our hypothesis that lncRNA-CLMAT3 may regulate SPARC expression at the levels of transcription, chromatin modification, and posttranscriptional processing, to participate in CRC progression or metastasis. Future experiments are required to elucidate the exact mechanisms by which lncRNA-CLMAT3 affects CLM by regulating SPARC expression.

Our study has several limitations. First, the number of tissues analyzed was limited (especially for the MLM subgroup), and the follow-up duration was short. Therefore, the results of certain subgroup analyses are unconfirmed, and the 5-year overall survival rate has yet to be determined. Second, all tissue samples were collected from one treatment center and the corresponding patients were limited to the Han nationality, which might have resulted in a minimal study bias because of the potentially distinct expression of lncRNAs in tissue from individuals of different nationalities. Further in vitro and in vivo experiments are currently being performed by our group to investigate the effect of lncRNA-CLMAT3 on the liver metastasis of CRC.

Conclusions

We identified one novel dysregulated lncRNA, termed CLMAT3, in tissue samples from various CLM subtypes based on a microarray screen and step-by-step qRT-PCR verification. Our study demonstrated that high expression of lncRNA-CLMAT3 was significantly associated with liver metastasis of CRC and was an independent prognostic indicator of survival for patients with CRC.

CCAT-1, colon cancer-associated transcript-1; CLM, colorectal liver metastasis; CLMAT3, colorectal liver metastasis-associated transcript 3; CRC, colorectal cancer; HULC, highly upregulated in liver cancer; LncRNA, long noncoding RNA; MALAT-1, metastasis-associated lung adenocarcinoma transcript 1; MLM, metachronous liver metastasis; NLM, non-liver metastasis; qRT-PCR, quantitative reverse-transcription polymerase chain reaction; SLM, synchronous liver metastasis; SPARC, secreted protein acidic and rich in cysteine; TNM, tumor-node-metastasis