Introduction

The central dogma of gene expression holds that DNA is transcribed into RNA, which is then translated into proteins. This accounts for only 2% of the RNA that is synthesized by the cell [8, 12]. The discovery of small (snRNAs) and long non-coding RNAs (lncRNAs), in addition to the already known structural RNAs that are not translated into proteins, challenged this central dogma of gene expression. The lncRNAs constitute a broad, heterogeneous group of regulatory RNAs that participate in diverse cellular functions such as epigenetic modulation and transcriptional and post-transcriptional regulation. They have an inherent capability to fold into complex secondary structures and associate with various RNA binding proteins on the one hand, and to pair with complementary DNA/RNA strands on the other. They are crucial for proper embryonic development and also function in adult life [43]. A few early lncRNAs were found by genetic studies of imprinting and dosage compensation (XIST, H19 lncRNAs) [4]. The advent of large-scale genomic studies laid the foundation and rules for unearthing novel noncoding transcripts, leading to their classification into subtypes [10]. Observations that several lncRNAs were highly dysregulated in human diseases spurred research into their biochemical and molecular functions in detail. In recent years their importance as potential diagnostic markers and targets for therapy has gained prominence [58]. Though the functions of most lncRNAs remain elusive, some representative lncRNAs like HOTAIR (HOX Transcript Antisense Intergenic RNA) are beacons for understanding lncRNA action in both development and disease. This review focuses on the known functions of HOTAIR, its own regulation, and its expression and impact in cancer.

The multiple functions of HOTAIR

Epigenetic reprogramming of the chromatin: HOTAIR as a scaffolding and guide RNA

HOTAIR was discovered during a study of the transcriptional regulation of HOX genes in pure populations of primary human fibroblasts using ultra-high density HOX loci tiling arrays [45]. It is a 2158 nucleotide long, spliced and polyadenylated transcript containing 6 exons. The HOTAIR RNA sequence is highly conserved among vertebrates, whereas the predicted amino acid sequence shows scarce conservation, implying that HOTAIR may function as a lncRNA and not as a protein coding template. It is transcribed from the strand antisense to the HOXC gene cluster, from an intergenic stretch of DNA flanked by HOXC11 and HOXC12 on chromosome 12q13.13. In humans, there are four HOX loci (HOXA, HOXB, HOXC and HOXD), which are organized in clusters on four different chromosomes and ordered on the basis of their expression [27]. HOX genes encode transcription factors that regulate body segmental patterning. Their precise temporal and spatial expression is under tight epigenetic control and is regulated by neighboring HOX ncRNAs. These ncRNAs tend to suppress transcription and silence gene expression. While other HOX ncRNAs epigenetically remodel neighboring HOX genes in cis on the same chromosome, HOTAIR acts in trans to transcriptionally silence the HOXD cluster on chromosome 2, instead of silencing its neighboring genes on chromosome 12. The genes silenced include HOXD8, HOXD9, HOXD10, and HOXD11 (Fig. 1).

Fig. 1 The HOX cluster and its regulation by HOTAIR. Horizontal arrows indicate genes expressed in the cluster. HOTAIR is expressed as an antisense transcript and is shown by a reverse arrow. The ‘X’ denotes gene silencing

The HOTAIR silencing mechanism involves multiple epigenetic modulators that change histone methylation patterns on the HOXD cluster. The 5′ region of HOTAIR (nucleotides 1–300) binds Enhancer of Zeste homolog 2 (EZH2), a histone methyltransferase and member of the PRC2 complex, which also comprises EED and SUZ12 [23, 38, 47]. Through its 3′ end (nucleotides 1500–2146), HOTAIR binds the LSD1-CoREST-REST complex [57]. Binding to the PRC2 complex increases H3K27me3 marks around the target gene loci and leads to epigenetic silencing of these genes [45]. The LSD1-CoREST-REST complex is responsible for demethylating H3K4me2, which is an activation mark [2, 48, 49]. Overall, HOTAIR acts as a structural scaffold that bridges the PRC2 and LSD1 complexes into a higher order structure, and serves as a guide that directs these protein complexes to target genes for silencing. More recently, genome-wide mapping of HOTAIR binding sites using Chromatin Isolation by RNA Purification (ChIRP) revealed that HOTAIR functions beyond suppression of HOX clusters and binds in the vicinity of other target gene loci. Thus, HOTAIR acts as a guide RNA to assemble transcription regulatory complexes. This gene silencing extends to the expression of miRNAs too, as seen in its action on miR-130a [37].

Combining chemical probing with high-throughput sequencing technologies showed that HOTAIR is a highly structured RNA encompassing four independent modular domains [50]. Such a modular structure suggests that HOTAIR may bind multiple proteins independently or cooperatively via different domains to form higher order complexes. Recently, bisulfite sequencing identified a methylated ‘C’ at position 1683 (C1683), in the vicinity of the LSD1-binding site of HOTAIR RNA [1]. It is predicted that such post-transcriptional RNA modification(s) may also contribute to the higher order folding and protein binding potential of this lncRNA. Such evidence suggests that HOTAIR may function beyond epigenetic reprogramming.

HOTAIR and protein degradation

Though the primary and ubiquitous function of HOTAIR appears to be gene silencing, HOTAIR was recently shown to interact with E3 ubiquitin ligases, suggesting an extranuclear role for this lncRNA in protein degradation [69]. Two E3 ubiquitin ligases with RNA-binding domains, Dzip3 and Mex3b, along with their respective targets Ataxin-1 and Snurportin-1, interacted with HOTAIR. Dzip3 and its target Ataxin-1 bound to nucleotides ~1028–1272, whereas Mex3b interacted with nucleotides ~125–250 and its target Snurportin-1 bound two regions (nucleotides ~342–471 and ~1142–1272) of HOTAIR RNA. Here, HOTAIR binding led to increased degradation of the target proteins. In contrast, HOTAIR (nucleotides 1–360) increased androgen receptor (AR) stability by binding its N-terminal domain, competing out the E3 ubiquitin ligase MDM2 and hindering AR degradation [70]. These examples show the plasticity of HOTAIR function in both the nucleus and the cytoplasm. The functions depend on the nature of the interaction complexes formed and differ with their constituent proteins.
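The interaction regions quoted so far can be collected into a simple interval map of the transcript. The following is a minimal sketch in Python, assuming only the coordinates stated in this review (several of which are approximate); the names `Region`, `overlap` and `overlapping_pairs` are hypothetical helpers introduced for illustration and do not come from any cited study. It lists which partners are reported to bind overlapping stretches of HOTAIR, which may help when reasoning about possible competition or cooperation between binding proteins.

```python
from itertools import combinations
from typing import NamedTuple

HOTAIR_LENGTH = 2158  # transcript length in nucleotides, as stated in the text


class Region(NamedTuple):
    partner: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Coordinates as quoted in this review; several are approximate ("~") in the text.
REGIONS = [
    Region("EZH2 (PRC2)", 1, 300),
    Region("LSD1-CoREST-REST", 1500, 2146),
    Region("Dzip3", 1028, 1272),
    Region("Ataxin-1", 1028, 1272),
    Region("Mex3b", 125, 250),
    Region("Snurportin-1", 342, 471),
    Region("Snurportin-1", 1142, 1272),
    Region("AR", 1, 360),
]


def overlap(a: Region, b: Region) -> int:
    """Number of positions shared by two closed intervals (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def overlapping_pairs(regions):
    """Return (partner_a, partner_b, shared_length) for distinct partners whose regions overlap."""
    pairs = []
    for a, b in combinations(regions, 2):
        shared = overlap(a, b)
        if shared and a.partner != b.partner:
            pairs.append((a.partner, b.partner, shared))
    return pairs


if __name__ == "__main__":
    for partner_a, partner_b, shared in overlapping_pairs(REGIONS):
        print(f"{partner_a} / {partner_b}: {shared} positions shared")
```

Overlap in this map only indicates that two partners were mapped to the same stretch of sequence; it does not by itself show that they compete or bind simultaneously.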

HOTAIR as a competitive endogenous RNA

LncRNAs can function as competitive endogenous RNAs (ceRNAs): they impair miRNA function by mimicking miRNA targets and serving as miRNA sponges [55]. HOTAIR interferes with several miRNAs, and these processes have mostly been studied in cancer cells and tumors (Table 1). Some miRNAs, like miR-141, also regulate HOTAIR expression or target HOTAIR to the RISC complex for subsequent degradation by Ago2-induced cleavage [15]. HOTAIR-bound HuR (human antigen R), a ubiquitous RNA binding protein that interacts with AU-rich elements in the 3′UTRs of mRNAs, recruited the let-7 miRNA-Ago2 complex, leading to miRNA-mediated silencing of HOTAIR function [9, 69]. Thus, miRNAs with sequence complementarity to HOTAIR, or proteins that bind HOTAIR, and HOTAIR itself can suppress each other’s function, adding another layer of complexity to the regulation of HOTAIR activity and function.

Table 1 Inter-regulation of miRNA and HOTAIR

HOTAIR and cancer

HOTAIR plays crucial roles in the initiation and progression of multiple human cancers including breast, gastric, pancreatic, hepatocellular, lung, colorectal and ovarian cancer [22]. Interestingly, its expression is highly correlated with clinical outcome and patient survival, promoting its value as a potential prognostic and predictive biomarker in cancer. However, HOTAIR appears to promote different processes such as tumor growth, lymph node metastasis, invasion and migration, epithelial to mesenchymal transition, and acquisition of stemness via different pathways depending upon the cancer type. Such pathways are listed in Table 2. Though the phenotypes promoted in each cancer appear distinct, they occur largely due to HOTAIR’s ability to reprogram genome-wide PRC2 occupancy, mediating gene silencing of tumor and metastasis suppressor genes and concomitant induction of oncogenic pathways, often via the action of EZH2 [21, 72]. Before HOTAIR can be targeted for cancer therapy, it is necessary to understand its effects in cancer. Below we describe HOTAIR functions with special emphasis on breast cancer, since it has been widely studied in this area.

Table 2 HOTAIR pathways in cancer

HOTAIR and breast cancer

In breast cancer, high HOTAIR expression is tied to initiation/establishment of cancer due to its oncogenic properties, whereas its effects on progression rely on promoting cellular invasion and metastasis. Gupta et al. [21] analyzed breast cancer samples for HOTAIR expression and showed that high expression significantly predicted metastasis and death. In breast cancer cell lines, overexpression of HOTAIR led to increased soft agar colony formation, and its downregulation resulted in decreased invasiveness of MDA-MB-231 cells in vitro. HOTAIR induced mesenchymal markers such as vimentin and fibronectin, and suppressed epithelial markers such as E-cadherin and members of the protocadherin (PCDH) gene family. PCDH genes are well characterized metastasis suppressor genes that limit cellular movement and invasive behavior. Their suppression by HOTAIR allows epithelial-mesenchymal transition (EMT), leading to invasion and metastasis. Engraftment of cells recombinantly expressing HOTAIR into mammary fat pads resulted in their dissemination to the lung within 2 weeks of implantation, demonstrating the direct capability of HOTAIR to promote breast cancer metastasis [21].

However, in another study that correlated methylation patterns and HOTAIR expression in 348 patients, high HOTAIR expression appeared to correlate with a lower risk of relapse and death than low expression [36]. DNA methylation levels, on the other hand, correlated well with HOTAIR expression. Recently, Sorensen et al. studied HOTAIR expression in 164 primary tumors with respect to patient survival. Similar to the study by Gupta et al., high expression was strongly associated with worse prognosis, particularly in ER+ but not ER− tumors (p = 0.0086, HR 1.985) [51]. The presence of multiple ER binding sites participating in the hormonal regulation of HOTAIR (described below) could explain this association of high HOTAIR expression with prognosis in ER+ tumors. An in situ hybridization study of HOTAIR expression, combined with immunohistochemical localization of EZH2, showed that HOTAIR and EZH2 were positively correlated. While high HOTAIR expression correlated with ER+ tumors, high EZH2 correlated with increased proliferation rate, low HER2+ and ER negativity [14]. The same study showed that in matched primary and metastatic tissues, EZH2 and HOTAIR strongly correlated with metastasis. A recent meta-analysis by Cai et al. [11], using data from 748 patients in 8 studies spanning various cancers, demonstrated that high expression of HOTAIR significantly correlated with incidence of lymph node metastasis (HR 2.81, 95% confidence interval) as compared to low expression. These data suggest that this ability of HOTAIR is not restricted to breast cancer alone. Together, most studies validate HOTAIR as a biomarker of significant prognostic value, and its high expression appears to associate with increased metastasis and death in most cancers. In other cancers, HOTAIR prevented PRC2 occupancy at metastasis-promoting genes such as ABL2, SNAIL, laminins, TWIST, β-catenin, VEGF and MMP-9, leading to their expression and to epithelial to mesenchymal transition of cancer cells [19, 21, 25].

Interestingly, the breast cancer susceptibility gene (BRCA1), a classical tumor suppressor, also interacts with HOTAIR to maintain cellular differentiation in both embryonic stem cells and breast cancer cells. It decreases the interaction of HOTAIR and EZH2 in breast cancer cells, since BRCA1 and HOTAIR share a common binding site on the EZH2 protein in the ncRNA binding domain 1 (ncRBD1). Intact BRCA1 preferentially forms a complex with HOTAIR [60]. Hence, the loss of BRCA1 in breast cancer leads to increased amounts of free HOTAIR, which can now couple with EZH2 and alter genome-wide PRC2 occupancy, favoring tumorigenesis. Therefore, maintenance of the level of HOTAIR RNA within a cell appears to be critical for maintaining ‘normal’ cells.

HOTAIR and treatment of breast cancer

Standard treatment of breast cancer uses multiple modes of therapy including chemotherapy, radiation, small molecule and gene based therapies. However, resistance to such therapies is a common problem causing morbidity and mortality. Estrogen receptor positive (ER+) breast cancers are treated with the endocrine therapy tamoxifen, a receptor antagonist that mimics estrogen but induces an antiestrogenic effect. Xue et al. [62] showed that ER negatively regulated HOTAIR by binding to its promoter. They also showed that HOTAIR expression was high in tamoxifen resistant MCF-7 cells and that depletion of HOTAIR decreased tamoxifen resistance. Recent studies exploring long range chromatin interactions identified an alternate promoter and TSS for HOTAIR that was negatively regulated by estrogen but positively regulated by the forkhead transcription factors FOXA1 and FOXM1 [39]. HOTAIR expression was directly correlated with FOXM1. Combined analysis of FOXM1 and HOTAIR provided better stratification between patients who were responders and non-responders to endocrine therapy. FOXA1 is known to play a key role in ER activity, promote early to late stage progression of breast cancer and associate with poor prognosis, drug resistance and poor outcome [46]. Since FOXA1 regulates HOTAIR, one of its modes of action in promoting breast cancer may be through induction of HOTAIR. These authors next showed that in HER2+ tumors, HOTAIR, FOXM1 and FOXA1 appear to be co-expressed, enhancing each other’s capability of predicting response to therapy. Overall, expression levels of HOTAIR in cancer appear to correlate with patient outcomes.

Some drugs routinely used in cancer therapy seem to alter HOTAIR expression either directly or indirectly. Triple negative breast cancers (TNBCs), which do not express ER, PR and HER2, are very aggressive in nature and express high levels of HOTAIR RNA [52]. In addition, they express high levels of the receptor tyrosine kinase (RTK) epidermal growth factor receptor (EGFR) and of c-Abl, a proto-oncogene that encodes a non-receptor tyrosine kinase [16, 32]. Lapatinib is an RTKI (receptor tyrosine kinase inhibitor) that targets EGFR, but it has been less effective in treating TNBCs [61]. In such tumors, dual targeting of c-Abl by imatinib and EGFR by lapatinib was much more effective, as it suppressed tumor growth by inhibiting the activity of β-catenin [61]. The β-catenin pathway alleviates repression of the HOTAIR gene and upregulates it. In dual treated cells, therefore, HOTAIR was downregulated, and its loss may be responsible for the lack of tumor growth [61]. Naturally occurring dietary phytoestrogens such as calycosin and genistein also possess antitumor effects, and treatment of breast cancer cells with these compounds suppressed HOTAIR expression [13]. These data support the idea that decreasing HOTAIR may significantly impair cancer growth, and more such agents need to be identified.

In other cancers, high expression of HOTAIR was correlated with chemoresistance and insensitivity to platinum based chemotherapy [35, 40, 54, 66]. The mechanisms involved in these processes need to be studied further. HOTAIR is therefore not only associated with cancer progression but also participates in titrating the response to various types of cancer therapy, which makes it an attractive prognostic biomarker and probably a good predictive biomarker in cancer.

Polymorphisms in HOTAIR and risk of disease

Since HOTAIR overexpression has been associated with poor clinical outcomes and response to therapy, investigating genotype alterations that may affect its expression, and hence its functionality, is of paramount significance. Several studies have investigated genetic variants in HOTAIR and their association with increased risk of developing cancer, but they provide contradictory results. SNPs (single nucleotide polymorphisms) have mainly been examined by haplotype tagging in case–control studies (Table 3). Polymorphisms at the HOTAIR locus were found mostly in the regulatory and intronic regions, and only rs874945 is in proximity to the 3′ regulatory region of the HOTAIR gene [75]. SNPs were associated with both increased and decreased risk of cancer. Both regulatory and intronic alleles may function as transcriptional regulators, altering HOTAIR transcription in cancer cells. Only one exonic SNP, rs7958904, has been reported; it is predicted to alter HOTAIR secondary structure and hence modulate its interaction with partner proteins [63]. Recently, Tian et al. [56] performed a meta-analysis of all published studies and concluded that rs920778 and rs874945 increased, and rs7958904 decreased, cancer risk. While most studies refer to germline susceptibility and cancer risk, cancer is a disease with sporadic, multiple single base somatic substitutions that may alter expression of HOTAIR. As databases like The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) assemble whole genome sequences, more and more such alterations will become evident. For example, a scan of potential single base changes in the HOTAIR gene locus from various cancers in these databases showed many somatic mutations in the upstream region that may alter transcription factor binding. Specifically, melanoma samples showed at least 38 substitutions deemed harmful by prediction algorithms (personal observation), which may hence impact cancer progression, metastasis and death. However, more functional work is necessary to demonstrate the importance of such predictions.

Table 3 HOTAIR polymorphisms and risk of cancer

Regulation of HOTAIR RNA

As discussed above, the expression of HOTAIR RNA is strongly correlated with cancer progression, cancer metastasis and patient death [67]. Loss of HOTAIR expression in experimental systems decreased cancer growth and cell invasiveness [61]. Thus, identifying the regulatory mechanisms and transcription factors that interact directly with the HOTAIR promoter, intronic sites and distal regions to perturb HOTAIR levels is necessary to find more ways to reduce its expression in cancer cells. Factors known to regulate HOTAIR are summarized in Table 4 and their mechanisms are described below. HOTAIR is regulated by hormones, extracellular matrix proteins and other transcription factors.

Table 4 Regulation of HOTAIR expression

Hormonal regulation

Estrogen (17β-estradiol, E2) induced HOTAIR expression in a dose-dependent manner, and this induction was suppressed upon tamoxifen treatment in ER positive MCF-7 breast cancer cells, raising the possibility that HOTAIR has an estrogen responsive promoter [6]. Classically, upon activation by E2, ERα and/or ERβ display sequence specific binding to estrogen-response-elements (EREs), and 4 such EREs were found in the HOTAIR proximal promoter. Two of them, imperfect but complete, located −1486 bp and −1721 bp upstream of the HOTAIR transcriptional start site (TSS), drove ER mediated HOTAIR expression. Other estrogenic compounds such as bisphenol A (BPA) and diethylstilbestrol (DES) transcriptionally induced the HOTAIR promoter in a dose-dependent manner in MCF-7 breast cancer cells via ERα and ERβ [5]. Paradoxically, Xue et al. [62] showed that treatment of MCF-7/T47D cells with estradiol decreased HOTAIR expression, whereas the anti-estrogen tamoxifen induced its expression, via a distal ERE located 14.5 kb upstream of the HOTAIR TSS. Therefore, multiple EREs potentially regulate HOTAIR, and the choice of ERE governs the level of HOTAIR expression in breast cancer cells. In prostate cancer cells, on the other hand, Zhang et al. [70] showed that androgens suppress HOTAIR transcription by driving AR binding to an androgen responsive element (ARE) almost 46 kb upstream of the HOTAIR TSS. These studies suggest that hormones both suppress and drive HOTAIR expression in a context-specific manner.

Regulation by extracellular proteins

Apart from nuclear hormone receptors, HOTAIR expression is also regulated by extracellular proteins such as osteopontin (OPN). Interferon regulatory factor 1 (IRF1) binds to the HOTAIR promoter at IRF1 binding motifs at positions −65 to −53 bp and −148 to −136 bp to suppress its expression. Yang et al. [65] showed that OPN induced HOTAIR expression in a dose-dependent manner by relieving this IRF1 mediated suppression. Type-I collagen (Col-1), a component of the extracellular matrix (ECM), transcriptionally induced HOTAIR expression in lung cancer cells. The region within 1 kb upstream of the HOTAIR TSS was necessary and sufficient for this induction [78]. These data suggest that cell–cell interactions and ECM components also regulate HOTAIR expression.

Regulation by other transcription factors

HOTAIR is regulated by direct binding of c-Myc, an oncoprotein deregulated in various cancers [44]. c-Myc upregulated HOTAIR expression in gall bladder cancer cells through an E-box element located ~1053 bp upstream of the HOTAIR TSS [37]. In ovarian cancer cells, various DNA damaging agents such as mitomycin C, hydrogen peroxide and platinum induced NF-κB, which in turn increased HOTAIR expression via a p65-NF-κB binding site located −915 to −906 bp upstream of the HOTAIR TSS [40]. TCF4/LEF bound the region −256 to −249 bp upstream of the HOTAIR TSS and increased its expression via the β-catenin/Wnt signaling pathway [61]. HOTAIR is also a hypoxia inducible factor 1 alpha (HIF-1α)-inducible lncRNA: it is induced by HIF-1α binding to HRE sites in the HOTAIR promoter [76]. Not only transcription factors but also epigenetic modulators regulate HOTAIR: the BET (bromodomain and extra terminal domain) family member bromodomain-containing protein 4 (BRD4) upregulated HOTAIR expression in glioblastoma cells by localizing to its binding site ~1 kb from the HOTAIR TSS [42].
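The regulatory elements described in this and the preceding sections can be summarized as positions relative to the HOTAIR TSS. The sketch below, in Python, is only an illustrative catalogue built from the coordinates quoted in this review; the `Site` class, the `sites_within` and `count_by_effect` helpers, and the simplified "induce"/"suppress" labels are assumptions made for illustration, and approximate positions (such as the ~14.5 kb and ~46 kb distal elements) are entered as single rounded values. Elements without a stated coordinate or direction (HIF-1α, BRD4) are left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    factor: str
    start: int   # bp relative to the TSS; negative values are upstream
    end: int
    effect: str  # simplified label: "induce" or "suppress"


# Positions as quoted in this review; single-point entries are approximate.
SITES = [
    Site("ER (ERE)", -1721, -1721, "induce"),
    Site("ER (ERE)", -1486, -1486, "induce"),
    Site("c-Myc (E-box)", -1053, -1053, "induce"),
    Site("NF-kB p65", -915, -906, "induce"),
    Site("TCF4/LEF", -256, -249, "induce"),
    Site("IRF1", -148, -136, "suppress"),
    Site("IRF1", -65, -53, "suppress"),
    Site("ER (distal ERE)", -14500, -14500, "suppress"),
    Site("AR (ARE)", -46000, -46000, "suppress"),
]


def sites_within(sites, window_bp):
    """Sites lying entirely within `window_bp` upstream of the TSS, closest first."""
    hits = [s for s in sites if s.start >= -window_bp and s.end <= 0]
    return sorted(hits, key=lambda s: abs(s.end))


def count_by_effect(sites):
    """Count catalogued sites per simplified effect label."""
    return Counter(s.effect for s in sites)


if __name__ == "__main__":
    for site in sites_within(SITES, 1000):
        print(f"{site.factor}: {site.start} to {site.end} bp ({site.effect})")
    print(dict(count_by_effect(SITES)))
```

Querying a 1 kb window lists the catalogued elements that lie in the proximal region; whether any of them mediates a particular induction, such as that by Col-1, is not established by this listing.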

While the molecular effects of HOTAIR have been researched for a long time, more recent studies are slowly unraveling the complexity of its regulation. Since HOTAIR is associated with several malignancies, its known regulators form a group of potential ‘druggable’ targets for inhibition in cancer.

Targeting HOTAIR for cancer therapy

Given the vast impact of high HOTAIR expression on cancer progression, treatment and patient prognosis, HOTAIR is an attractive therapeutic target. HOTAIR binds specific RNAs, interacts with various proteins and is regulated by multiple transcription factors. Therefore, HOTAIR expression or function could potentially be inhibited in various ways: (1) using miRNAs that act as HOTAIR ceRNAs or sponges; (2) using small molecule inhibitors against interacting proteins to prevent complex formation with HOTAIR, for example inhibitors of EZH2, since most oncogenic effects of HOTAIR are mediated by binding to this protein [26]; (3) using small molecule inhibitors of transcription factors and hormone receptors that induce HOTAIR; (4) activating repressors that downregulate HOTAIR transcription; and (5) using RTKIs to decrease HOTAIR expression, as discussed earlier [13]. However, since these proteins and miRNAs may have multiple functions in normal cells too, their inhibition may be rife with side effects and toxicities.

Since HOTAIR is a natural antisense transcript (NAT), designing HOTAIR specific single stranded oligonucleotides, called HOTAIR antagoNATs, that bind HOTAIR and degrade it by RNase H activity may be a more suitable way of neutralizing its deleterious effects in cancer cells [59]. However, many oligos need to be evaluated before use, and rational design of such antagoNATs has not yet been achieved, especially for in vivo use. The use of RNA inhibitors faces challenges due to RNA secondary structure, and in systemic delivery, uptake by target cells and specificity of action. Nevertheless, the design of lncRNA inhibitors offers several advantages, since lncRNAs regulate multiple genes by chromatin modulation. Most importantly, lncRNAs including HOTAIR are low abundance transcripts in normal cells that increase several fold in disease. Thus, targeting lncRNAs is expected to yield fewer side effects and toxicities, and such efforts are ongoing.