Introduction

Protein-coding genes constitute a small part of most eukaryotic genomes. In humans, only 1.2% of DNA encodes proteins. Most of the remaining genome has been referred to as “junk DNA”, with the exception of some housekeeping RNA molecules and regulatory and structural elements. Initially, the term “junk DNA” was applied by Ohno to pseudogenes, and subsequently, this term was extended to noncoding DNA without known function [1, 2]. The development of massively parallel sequencing technologies allows us to analyse transcriptomes at a deeper level. These studies revealed that most of the metazoan genome is transcribed [3]. Two major classes of noncoding RNA (ncRNA) are defined based on their length: long ncRNAs (lncRNAs), which are more than 200 nt in length, and small ncRNAs, which are less than 200 nt in length. lncRNAs are usually classified according to their location in the genome [4]. However, the location does not provide information about lncRNA function. Small RNAs, in turn, are classified based on their function in most cases. Deep sequencing studies of the small RNA fraction and subsequent analyses have discovered a high number of small RNA types. Among these diverse RNAs, small ncRNAs that are involved in the regulation of gene expression are of particular interest. Distinct classes, including microRNAs (miRNAs), piwi-interacting RNAs (piRNAs), sno-derived RNAs (sdRNAs), endogenous small interfering RNAs (siRNAs), and endogenous small hairpin RNAs (shRNAs), are referred to as small regulatory ncRNAs (Fig. 1) [5, 6]. Some of these molecules, such as piRNAs, are expressed in specific cell types and involved in the regulation of transposon expression in germ cells [7]. Other molecules, such as miRNAs, regulate gene expression in various cell types and define cell identity.

Fig. 1

Biogenesis pathways of miRNA and miRNA-like molecules. a Canonical miRNA are grouped into clusters, transcribed by RNA-pol II and then processed by Drosha and Dicer ribonucleases. b Endogenous shRNAs are transcribed as small hairpins. Their biogenesis is Drosha independent. c The mirtron pathway. The miRNAs are encoded into short introns, spliced and debranched. Ldbr – lariat debranching enzyme. d snoRNA-derived pathway. e Endogenous siRNAs are formed by the transcription of inverted repeats. In mouse ESCs, siRNAs are predominantly formed from B1/Alu SINE elements [6]. f tRNA can form alternative hairpin structures that are cleaved by Dicer

Increasing evidence suggests that ncRNAs play important roles in different biological processes during development. Embryonic development starts from the totipotent zygote, which divides to form a blastocyst. The inner cell mass of the blastocyst consists of pluripotent stem cells (PSCs) that can self-renew and differentiate into any cell type in an organism. These embryonic stem cells (ESCs) can be isolated and indefinitely maintained under proper culture conditions [8, 9]. Another type of PSC is obtained by reprogramming somatic cells through the overexpression of certain transcription factors (Oct4, Sox2, Klf4, c-Myc); these cells are referred to as induced pluripotent stem cells (iPSCs) [10]. PSCs have great potential for regenerative medicine and genetic therapy, especially iPSCs, which can be derived from the patient. Research in the field of pluripotency and self-renewal regulation is required for the efficient derivation and cultivation of PSCs. Significant progress has been made in the investigation of regulatory mechanisms using transcriptomic, epigenomic, proteomic, and metabolomic approaches. An important regulatory role has also been established for many ncRNAs. In this review, we highlight recent advances in the field of pluripotency and reprogramming regulation by ncRNAs.

MicroRNAs

miRNAs in Pluripotent Stem Cells

The most studied class of small ncRNAs is miRNA. Canonical miRNAs are 20–23 nt in length and transcribed by RNA polymerase II into long pri-miRNAs. RNase III Drosha with Dgcr8 forms a Microprocessor complex that cleaves pri-miRNAs to short hairpins termed pre-miRNAs [11]. These pre-miRNAs are exported to the cytoplasm and processed by RNase III Dicer to an RNA duplex. Then, the duplex is bound by Ago protein to form an RNA-induced silencing complex (RISC). One strand of this miRNA duplex is degraded, while the other strand functions as a guide for binding to the target mRNA. The biogenesis of miRNAs has been described in detail in several previous reviews [12, 13]. miRNAs play important roles in the regulation of cell state and different processes, including self-renewal, differentiation, and reprogramming. Knockout or knockdown of components of the miRNA processing machinery in human and mouse ESCs results in proliferation and differentiation defects [14,15,16,17]. Expression analysis revealed that the majority of miRNAs in mouse ESCs originate from six genomic loci [18]. The most abundant of these molecules are miRNAs from the miR-290-295 (miR-371-373 in humans) and miR-17-92 clusters. Members of the miR-290-295 cluster account for approximately 70% of all miRNAs expressed in ESCs [19, 20]. These miRNAs predominantly share the same seed sequence AAGUGC and are named ESC-specific cell cycle-regulating (ESCC) miRNAs [17]. ESCC miRNAs are involved in the pluripotency regulation network, and their promoters are occupied by core pluripotency factors [20]. Core pluripotency factors also occupy the promoters of differentiation-related miRNAs to repress their expression and prevent differentiation. For example, miR-145 represses the pluripotency of human ESCs by directly targeting OCT4, SOX2 and KLF4 mRNAs, whereas the expression of miR-145 is repressed by OCT4 [21]. These links represent a double-negative feedback loop of pluripotency regulation.
miRNAs from the miR-302-367 cluster share the same seed sequence as ESCC miRNAs, and this cluster is highly expressed in the mouse epiblast stem cells (EpiSCs) and human ESCs [22]. The molecular features of human ESCs are similar to those of mouse EpiSCs [23]. Another miRNA cluster with the AAGUGC seed sequence is located on human chromosome 19 and includes miRNAs of the miR-519/520 series. This cluster is primate-specific and expressed in human ESCs, placenta, and cancer cells [24,25,26,27].

Nucleotides 2–8 of the mature miRNA represent the seed sequence that determines binding to the mRNA target [28]. Interestingly, different miRNA clusters with the same seed sequence characterize different cell types [29]. These miRNA clusters might therefore be expected to perform the same functions. However, types of miRNA-mRNA interactions other than classic seed base pairing have been detected [30], which may explain how clusters with the same seed can act differently. Additional base pairing of the 3′ miRNA end with the target mRNA leads to the specification of targets of different miRNAs with the same seed [31]. These modes of miRNA action may be responsible for the different targets of miRNAs from the miR-290-295 and miR-302-367 clusters. Currently, ESCC miRNAs have been implicated in the regulation of numerous processes, including the cell cycle, apoptosis, cellular metabolism, DNA methylation, and polycomb-mediated silencing of differentiation genes [17, 32,33,34,35].
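The seed rule described above can be illustrated with a minimal sketch. It assumes perfect Watson-Crick pairing between nucleotides 2–8 of the miRNA and the target, and it ignores G:U wobble pairs and supplementary 3′ pairing. The two miRNA sequences and the 3′UTR fragment are made-up toy strings, not real annotated sequences, and the function names are hypothetical.

```python
# Toy illustration of seed-based target matching.
# All sequences below are invented for this example.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a mature miRNA."""
    return mirna[start - 1:end]


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str) -> list[int]:
    """0-based positions in utr that pair perfectly with the miRNA seed."""
    site = reverse_complement(seed(mirna))
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


# Two hypothetical miRNAs: same seed, different 3' ends.
mirna_a = "UAAGUGCUACGAUCGAUCGAUC"
mirna_b = "AAAGUGCUGGGAUUUCCCAAAG"
toy_utr = "GGCAGCACUUAACC"

print(seed(mirna_a), seed(mirna_b))
print(seed_sites(mirna_a, toy_utr), seed_sites(mirna_b, toy_utr))
```

Because both toy miRNAs carry the same seed, a seed-only search returns the same site for each of them. Separating their targets would require scoring the additional 3′ pairing mentioned in the text, which this sketch does not attempt.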

In addition to mouse and human ESCs, miRNA expression was analysed in rat ESCs and iPSCs [36], pig iPSCs [37], and rabbit ES-like cells [38]. Rat ESCs and iPSCs are characterized by miR-290-295 cluster expression, and the overall expression pattern was similar to that of mouse ESCs. Pig miRNA expression was analysed in two types of iPSCs that differ by culture conditions. One iPSC type was cultured under conditions suitable for mouse ESCs, and the other iPSC type was cultured under human ESC conditions. The miRNA expression patterns of the two types of pig iPSCs differ. However, both iPSC types express the miR-302 cluster. This miRNA cluster is also expressed in rabbit ES-like cells, which are more similar to human than to mouse ESCs.

miRNA expression in mouse ESCs depends on the culture conditions. ESCs cultured with inhibitors of MEK1/2 (PD0325901) and GSK3 (CHIR99021) kinases (2i ES cells) represent a ground pluripotency state in contrast to serum-cultured ESCs [39,40,41,42]. ESCC miRNAs are expressed at similar levels in 2i and serum-cultured ESCs [43]. Notably, some differentiation-inducing miRNAs, such as miR-26a-5p, miR-99-5p, miR-218-5p and let-7 family miRNAs, are upregulated in 2i ESCs. However, these miRNAs cannot silence self-renewal without the addition of serum. A comprehensive bioinformatics analysis of the miRNA-mRNA interaction network in serum-cultured ESCs revealed that one of the primary miRNA functions is the repression of differentiation signals [19, 44]. The inhibitors PD0325901 and CHIR99021 influence the maturation of miRNAs through the inhibition of the Microprocessor complex [45, 46]. In particular, the inhibition of GSK3 kinase induces the cytoplasmic localization of Drosha, and the inhibition of MEK1/2 decreases the stability of Dgcr8. In addition to miRNA processing, Dgcr8 is also involved in the splicing of Tcf7l1 mRNA [47]. Tcf7l1 is required for the activation of lineage-specific differentiation programmes. Despite impaired miRNA processing, the lack of differentiation signals from serum and the inhibition of MEK1/2 presumably lead to stable ground state ESC culture.

miRNAs and Reprogramming

The classic reprogramming factors are Oct4, Sox2, Klf4, and c-Myc [10]. Different combinations of these and other pluripotency factors are used to obtain iPSCs [48]. Somatic cells can also be reprogrammed to a pluripotent state using small molecules [49]. The search for additional methods to obtain iPSCs continues. The first evidence of miRNA-based reprogramming was reported in 2008. Two human cancer cell lines were reprogrammed to an ES-like state using the expression of the miR-302-367 cluster [50]. Subsequently, this miRNA cluster was used to reprogramme human and mouse fibroblasts to a pluripotent state with higher efficiency compared to Yamanaka factors [51]. Another group used miR-200c, members of the miR-302-367 cluster, and miR-369 mature miRNA mimics to reprogramme human and mouse adipose stromal cells and human dermal fibroblasts [52]. The advantage of the latter approach is the absence of lenti- or retroviral vectors that integrate into the genome. miRNAs can also be used in combination with Yamanaka factors to increase the reprogramming efficiency and obtain a homogeneous population of iPSCs. The miRNAs miR-291, miR-294, and miR-295 substitute for c-Myc in the reprogramming of mouse cells [53]. Moreover, the efficiency of this approach is higher than that of the classic approach, and uniform populations of iPSCs are formed. Reprogramming to a pluripotent state is a dynamic process accompanied by changes in gene expression, including that of ncRNAs, and in the proteome, epigenome, and metabolome [54,55,56,57,58]. ESCC miRNAs and other miRNAs with the AAGUGC seed sequence increase reprogramming efficiency through different mechanisms, such as the regulation of the cell cycle, epigenetics, transcription factors, metabolic pathways, vesicular transport and the mesenchymal-to-epithelial transition (MET) [59,60,61,62]. MET is a key process of the initiation stage of reprogramming [63].
The inhibition of its reverse process, the epithelial-to-mesenchymal transition (EMT), enhances the reprogramming of somatic cells [64]. Various miRNAs promote MET by targeting different genes involved in this process. Some miRNAs, such as miR-106b, miR-93, members of the miR-302-367 cluster, and miR-372, inhibit EMT and promote MET by targeting TGFBR2 [60,61,62]. Members of the miR-200 family have been shown to target ZEB transcription factors, which are significant modulators of EMT [63, 65,66,67]. miRNAs belonging to the miR-181 family stimulate the reprogramming of mouse embryonic fibroblasts at the early stages by promoting the initiation phase [68]. This effect is achieved by the activation of Wnt and the repression of TGF-β signalling pathways.

The regulation of core pluripotency factors is a crucial process that influences the reprogramming efficiency. For example, transcription factor NR2F2 negatively regulates OCT4 expression in human cells, but during reprogramming, this factor is downregulated by miR-302 [69]. The miR-34 family of miRNAs provides a barrier for reprogramming by the repression of Nanog, Sox2, and Mycn expression [70]. The modulation of cellular physiology also affects reprogramming [71]. Cells change metabolism from oxidative phosphorylation to glycolysis during reprogramming [57, 72, 73]. Pyruvate kinase Pkm2 is involved in glycolysis and is highly expressed in pluripotent cells [33]. The expression of Pkm2 is regulated by the miR-290-295 cluster, which activates the Mbd2-Myc-Pkm2 axis through the downregulation of Mbd2. miR-369 stabilizes the translation of the splicing factor HNRNPA2B1, which is required for Pkm2 expression [74]. miR-31 plays a significant role in altering mitochondrial function by targeting succinate dehydrogenase complex subunit A [75].

A number of known miRNAs have been shown to mediate reprogramming. For example, miR-29b, miR-138, miR-19a/b, and miR-6539 increase reprogramming efficiency [76,77,78,79], whereas miR-34a, miR-29a, miR-21, the let-7 family, miR-212, miR-132, miR-145, miR-27a, miR-24, and miR-134 inhibit reprogramming [80,81,82,83,84,85]. We summarize the miRNAs implicated in reprogramming in Table 1. However, iPSCs can be derived from Dgcr8-deficient mouse fibroblasts and neural stem cells, although with decreased efficiency compared to wild-type cells [86]. This finding suggests that canonical miRNAs are dispensable for reprogramming. Other small ncRNAs or non-canonical miRNA species, such as mirtrons, may compensate for the absence of canonical miRNAs. miRNA-like molecules may be generated through different pathways, some of which are Drosha or Dicer independent [6, 87]. Further studies of individual miRNAs and small ncRNAs with miRNA-like functions may shed light on this problem.

Table 1 miRNAs implicated in the regulation of reprogramming to pluripotent state

IsomiRs

The pre-miRNA hairpins can be processed with some alterations, resulting in the addition or deletion of several nucleotides on the 5′ or 3′ ends of mature miRNA [88]. miRNAs may also be subjected to posttranscriptional modification by A-to-I editing [89]. A-to-I editing is performed by adenosine deaminases acting on RNA (ADAR) [90]. Inosines are recognized as guanosines, and the change appears as A-to-G. These mature molecules with 5′ or 3′ shifts or post-transcriptional edits constitute a minor fraction of miRNAs, called isomiRs. Evidence suggests that isomiRs are functional and important for the evolution of miRNAs [91, 92]. miRNA and corresponding isomiRs regulate genes that are enriched in the same functional pathway [93]. However, changes in the 5′ end of the miRNA may lead to a different pool of mRNA targets. For example, miR-302a-5p, which is expressed in human ESCs, has isomiR miR-302a-5p (+ 3) with three distinct 5′ end nucleotides shifted to the 3′ end [94, 95]. OTX2 is a miR-302a-5p target, whereas miR-302a-5p (+ 3) targets TSC1 expression and does not regulate OTX2. A total of 19 of the 24 pluripotency-associated miRNAs from miR-290-295 and miR-302-367 clusters are processed in different isomiRs in mouse ESCs [96]. These findings raise questions regarding which of the produced isomiRs are actually functional and how the pool of targets differs compared to classic sequences. Further studies may shed light on the answers to these questions and elucidate isomiR functions in PSCs.
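Why a 5′ shift or an edit changes the target pool can be seen in a small sketch. It assumes that a "+3" isomiR starts three nucleotides downstream of the canonical 5′ end on the same hairpin arm, that the seed is nucleotides 2–8, and that an edited adenosine is simply read as G. The hairpin-arm sequence and all function names are hypothetical and chosen only for illustration.

```python
# Toy illustration: 5' shifts and A-to-I edits alter the seed.
# The hairpin arm below is an invented sequence, not a real pre-miRNA.


def seed(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a mature miRNA."""
    return mirna[start - 1:end]


def isomir_5p_shift(arm: str, shift: int, length: int = 22) -> str:
    """Mature sequence whose 5' end lies `shift` nt downstream on the arm."""
    return arm[shift:shift + length]


def a_to_i_as_g(mirna: str, position: int) -> str:
    """Model A-to-I editing at a 1-based position; inosine is read as G."""
    if mirna[position - 1] != "A":
        raise ValueError("only adenosines can be edited")
    return mirna[:position - 1] + "G" + mirna[position:]


toy_arm = "GAUCCGUAAGCUUAGGCAUCGAUUCGA"
canonical = isomir_5p_shift(toy_arm, 0)
shifted = isomir_5p_shift(toy_arm, 3)   # hypothetical "+3" isomiR
edited = a_to_i_as_g(canonical, 2)      # edit inside the seed

for name, seq in [("canonical", canonical), ("+3 isomiR", shifted), ("edited", edited)]:
    print(name, seq, seed(seq))
```

In this toy case the three molecules have three different seeds, so a seed-based target search would return a different set of sites for each of them.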

MicroRNA-offset-RNAs

Another type of small RNA molecule produced from pre-miRNA hairpins is microRNA-offset-RNA (moRNA). moRNAs are located adjacent to mature miRNAs in regions that are conserved across species. moRNAs were first identified in human small RNA sequencing data in 2009 [97]. Subsequently, 326 moRNAs were found in human ESCs, whereas a significantly lower number of moRNAs were found in fibroblasts [98]. Transfection of the moRNA-103a-2-3p mimic results in the downregulation of 538 genes, and a substantial part of these genes have seed matches for moRNA-103a-2-3p in their 3′UTRs. Another study showed that moRNA-21 is functional, and its function depends on the seed sequence and requires Ago2 for gene repression [99]. These studies suggest that this type of small ncRNA regulates gene expression through a miRNA-like pathway.

snoRNAs and sno-derived RNAs

snoRNAs are small ncRNAs of 60–140 nt in length. Two classes of snoRNAs exist: H/ACA box and C/D box. snoRNAs function as guides for ribosomal RNA modifications. C/D box snoRNAs promote 2′-O-ribose methylation in complex with fibrillarin, and H/ACA box snoRNAs promote pseudouridylation in complex with dyskerin [100]. Fibrillarin is essential for early development, and homozygous mutations lead to death prior to implantation [101]. A decreased level of fibrillarin affects the expression of snoRNAs encoded in introns. Indeed, the disruption of one snoRNA may lead to significant changes in cellular homeostasis. For example, the inhibition of H/ACA box snoRNA 7A by an antisense oligonucleotide suppresses the proliferation and self-renewal of umbilical cord blood-derived mesenchymal stem cells [102]. Dyskerin ribonucleoprotein complex (DKC1) is involved in the transcriptional regulation of core pluripotency genes as an OCT4/SOX2 coactivator [103]. Several snoRNAs are differentially expressed between mouse ESCs and their differentiated derivatives [104]. ESC snoRNAs may guide DKC1 to gene enhancers, and the disruption of the complex presumably leads to changes in the transcriptional gene network. Additional studies will help to understand the functions of snoRNAs in self-renewal and pluripotency regulation.

In addition to isomiRs and moRNAs, which are produced from the pre-miRNA hairpin, functional miRNA-like molecules can be derived from snoRNAs [105,106,107]. These small ncRNAs are called sno-derived RNAs (sdRNAs). snoRNAs can be processed into sdRNAs that perform posttranscriptional gene silencing. However, the function of sdRNAs in pluripotency has not yet been investigated, and future studies are required to analyse the sdRNA expression in PSCs and identify pathways that are regulated by this type of small ncRNAs.

lncRNAs

lncRNAs Expression and Function in PSCs

Another enormous class of noncoding RNAs is lncRNAs. lncRNAs are more than 200 nt in length, transcribed by RNA pol II, polyadenylated, capped, and often spliced from pre-lncRNA. lncRNAs are involved in gene expression regulation in different biological processes in embryonic development [108]. Progress in massively parallel sequencing technologies enabled the identification of thousands of lncRNAs. Guttman et al. identified approximately 1600 large intergenic ncRNAs (lincRNAs) by analysis of ChIP-seq data of trimethylated lysines 4 and 36 of histone H3 in four cell types [109]. Subsequently, this group developed a transcriptome reconstruction algorithm, Scripture, and identified 1140 novel lincRNAs using RNA-seq data [110]. Approximately 1500 very large intergenic ncRNAs (vlincRNAs) were identified in human cells [111]. To date, more than 14,000 human lncRNA transcripts have been manually annotated by the GENCODE consortium [112].

A loss-of-function study revealed that 137 lincRNAs are involved in the transcriptional network of mouse ESCs, and 26 of these molecules regulate the maintenance of the pluripotent state [113]. In turn, the expression of these lincRNAs is regulated by the core pluripotency factors Oct4, Sox2, Nanog, c-Myc, and Klf4. In another study, 20 lincRNAs were functionally verified as factors that maintain pluripotency [114]. Some lncRNA molecules expressed in ESCs are required for the repression of differentiation-related and lineage-specific genes [113, 115,116,117,118,119]. Knockdown of such lncRNAs is accompanied by the loss of the pluripotent state and activation of lineage-specific markers. For example, the knockdown of AK028326 (Gomafu/Miat) results in the upregulation of trophoblast-specific transcripts [117]. Loss of Panct1 activates the expression of endoderm and ectoderm markers [115]. In contrast to miRNAs, lncRNAs can regulate gene expression at the transcriptional level. The inactivation of the entire X-chromosome is initiated by the well-studied lincRNA Xist, which interacts with polycomb repressive complex 2 (PRC2) [120, 121]. The hypothesis that other lincRNAs can serve as scaffolds for chromatin-modifying enzymes was confirmed by a number of studies. A significant fraction of human lincRNAs in various cell types is physically associated with the PRC2 complex [122]. Further, the immunoprecipitation of RNA–protein complexes showed that ESC lincRNAs interact with chromatin complexes, which are involved in the “reading”, “writing”, and “erasing” of epigenetic marks [113]. Some examples of lincRNA function in ESCs have been investigated in detail. tsRMST lncRNA interacts with the pluripotency factor NANOG and the component of PRC2 repressive complex SUZ12. This interaction leads to the repression of differentiation-related transcription factors and the non-canonical Wnt ligand WNT5A [118, 119].
lncRNA-ES1 and lncRNA-ES3 contribute to the suppression of differentiation through interactions with SOX2 and SUZ12 [116]. A number of lncRNAs found in ESCs bind to WDR5 [123]. WDR5 is a subunit of the MLL complex, which carries out histone H3 lysine 4 trimethylation and promotes the pluripotent state.

At the post-transcriptional level, lncRNAs can act as competitive endogenous RNAs, or “sponges”, for miRNAs to block their function. In mouse ESCs, lncRNA AK048794 binds to miR-592 and negatively modulates the expression of Oct4, Sox2 and Nanog through targeting FAM91A1 mRNA [124]. lincRNA-ROR (Regulator of Reprogramming) promotes pluripotency by binding to miR-145, the negative regulator of OCT4, SOX2, KLF4, and NANOG expression [21, 125]. Human lincRNA HPAT5 modulates the expression of the let-7 miRNA family and prevents differentiation [126].

The term ncRNA implies that RNA molecules do not encode proteins. Nevertheless, a great number of small open reading frames (ORFs) were found in lncRNAs using a bioinformatics approach [127]. Verification of the translation and function of a small ORF is a difficult process. The high-throughput analysis of ribosome-protected RNA fragments (ribosome profiling or Ribo-seq) will help to identify novel small regulatory peptides encoded by lncRNAs and to confirm the translation of some predicted ones [128]. In mouse ESCs, approximately half of lncRNAs display ribosome profiling signals [129]. Additional studies suggest that some lncRNAs are actually translated [130], whereas other studies have shown that the majority of lncRNAs do not have coding potential [131, 132]. In human cancer cells, 510 of the 1189 expressed lncRNAs have ORFs according to ribosome profiling analysis, yet their translated peptides are likely not functional [133]. Nevertheless, several studies have demonstrated the translation of small polypeptides from lncRNAs and analysed their functions [134, 135]. One of these small polypeptides is CRNDEP, which is translated from human lncRNA CRNDE in highly proliferative tissues. This small peptide is presumably involved in the regulation of cell proliferation [135]. Notably, the mouse orthologue of CRNDE, linc1399, is also associated with the maintenance of a pluripotent state [113, 135]. Evolutionary conservation supports the hypothesis of the functionality of this lncRNA-encoded polypeptide. Thus, combining transcriptome, translatome, and proteomic studies may reveal lncRNAs that encode proteins, and shed light on the question of ncRNA translation.

lncRNAs and Reprogramming

The involvement of the lncRNAs in the reprogramming process has been established in several studies. The expression of lncRNAs was analysed during the reprogramming of mouse fibroblasts to a pluripotent state [136, 137]. Approximately 1200 lncRNAs were differentially expressed at different stages during the process [136]. Another analysis showed that approximately 300 lncRNAs were activated in transitional cells and/or iPSCs, and some of these molecules are involved in the suppression of lineage-specific genes and the regulation of the metabolic gene expression [137]. However, the first functional example of lncRNA involved in the reprogramming process, called lincRNA-ROR, was previously established [138]. Knockdown of lincRNA-ROR decreased reprogramming efficiency, and its overexpression resulted in increased iPSC colony formation. One of the established functions of lincRNA-ROR is the inhibition of the p53-mediated cell cycle arrest and apoptosis [139]. iPSCs show properties similar to ESCs. Fully pluripotent iPSC lines can contribute to the development of tetraploid blastocysts and generation of full-term mice [140]. However, some iPSC clones failed to pass this test. RNA-seq analysis of genetically identical iPSC and ESC lines revealed that the aberrant silencing of the lncRNAs Gtl2 and Rian distinguishes iPSC clones from ESCs and restricts their developmental potential [141]. Notably, the expression of a few transcripts located in the imprinted Dlk1-Dio3 cluster determines cell potential.

A substantial fraction of human lincRNAs are enriched in transposable elements (TEs), particularly endogenous retroviruses (ERVs) [142]. Distinct classes of ERVs are expressed at different stages in human preimplantation embryos [143]. HERVH was established as the most highly expressed class of endogenous retroviruses in human PSCs [144]. Further, the expression of HERVH was linked to a naïve pluripotency state [145]. Currently, the functions of many individual TE-derived lincRNAs remain elusive due to their repetitive structures. Nevertheless, three human lincRNAs were recently studied functionally. HPAT2, 3, and 5 were implicated in the pluripotency regulation network and reprogramming of human fibroblasts [126]. We summarize the lncRNAs implicated in pluripotency regulation and reprogramming in Table 2.

Table 2 lncRNAs implicated in the regulation of pluripotency and reprogramming

Genome-Editing Technologies for Studying ncRNA

Thousands of ncRNAs that belong to different classes have been annotated to date, but few of these molecules have been functionally studied, particularly in the context of pluripotency. In the case of miRNA and miRNA-like molecules, the crucial step is to find target mRNAs. A large number of ESCC miRNA targets have been established to date. However, additional studies are required to elucidate the entire pool of ESCC miRNA targets, discriminate the functions of different miRNAs with similar sequences, and find targets of other miRNAs expressed in ESCs and involved in reprogramming. One miRNA can regulate hundreds of genes; thus, identifying target genes and conducting functional analyses are challenging [28]. The existing computational prediction algorithms generate many false-positive and false-negative results [146]. Large-scale studies using HITS-CLIP or PAR-CLIP approaches with the overexpression of certain miRNAs will help to identify targets of non-abundant miRNAs [147, 148]. However, miRNA overexpression may introduce biases in the results. To confirm miRNA-mRNA interactions, luciferase reporter constructs are typically used [149]. The main drawback of this method is the non-physiological conditions caused by transfection of exogenous reporters and miRNA mimics. In addition, this approach is often performed in a heterologous system, such as HEK293 cells, which are easy to transfect. These drawbacks limit the use of the luciferase approach for determining whether a given miRNA binding site is functional in a particular cell type. Recently, a genome-editing approach was utilized to analyse the functionality of miRNA target sites [150]. Candidate target sites can be replaced with molecular barcodes using homology-directed repair induced by the CRISPR/Cas9 system. Then, the effect of the miRNA target site mutation can be analysed by quantitative PCR.

Genetic knockdown or knockout approaches may be utilized to understand miRNA functions. The most widespread method of miRNA knockdown is the delivery of antisense inhibitors. miRNA antisense inhibitors may be chemically synthesized or expressed from transgenes, which contain tandem repeats of the miRNA complementary sequence [151, 152]. Chemically synthesized inhibitors are expensive and suitable only for short-term studies. The knockout of miRNAs may be achieved using genome-editing methods, such as TALENs or CRISPR/Cas9 systems [153,154,155]. The expression of a miRNA may be disrupted by introducing indels in the processing sites, which impede miRNA biogenesis, or by deleting the region encoding target miRNA genes or whole clusters (Fig. 2a). The expression of Cas9 and two sgRNAs results in the deletion of regions up to several megabases between sgRNA sites [156]. This approach allows the stable knockout of protein-coding genes and various ncRNAs, including lncRNAs, both in vitro and in vivo (Fig. 2b). For example, a locus of approximately 6.5 kb, encoding the HPAT5 lincRNA, was deleted using the CRISPR/Cas9 system to obtain a knockout cell line and investigate HPAT5 function in reprogramming [126]. Mice with a 23-kb deletion of the Rian lncRNA gene were obtained by the injection of Cas9 protein and two single guide RNAs into one-cell stage mouse embryos [157]. The CRISPR/Cas9 system was also used to study ncRNAs, such as miR-21, miR-29a, lncRNA-21A, UCA1, and AK023948, in human cancer cell lines [158]. Moreover, the method of ncRNA knockout was adapted for high-throughput screening with a lentiviral paired-guide RNA library [159]. However, the removal of large regions in the genome has potential pitfalls. These regions may contain regulatory elements or genes of small ncRNAs, such as miRNAs and snoRNAs. Additionally, the homozygous deletion of large regions remains challenging [155].
Another way to eliminate ncRNA expression is the removal of the promoter region, which is typically shorter than the entire lncRNA (Fig. 2c).
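The paired-sgRNA deletion strategy can be sketched in a few lines. This is an idealised model under stated assumptions: both protospacers lie on the forward strand and are followed by an NGG PAM, Cas9 cuts 3 bp upstream of the PAM, and the two ends are rejoined without indels. The locus, the guide sequences, and the function names are invented for illustration and do not correspond to any published design.

```python
import re

# Idealised model of a Cas9 deletion with two flanking sgRNAs.
# The locus and guide sequences are invented toy strings.


def cut_site(locus: str, protospacer: str) -> int:
    """0-based index of the cut, 3 bp upstream of an NGG PAM (forward strand only)."""
    match = re.search(re.escape(protospacer) + "[ACGT]GG", locus)
    if match is None:
        raise ValueError("protospacer with NGG PAM not found")
    return match.start() + len(protospacer) - 3


def delete_between(locus: str, guide_1: str, guide_2: str) -> tuple[str, int]:
    """Return the rejoined locus and the number of deleted base pairs."""
    left, right = sorted((cut_site(locus, guide_1), cut_site(locus, guide_2)))
    return locus[:left] + locus[right:], right - left


guide_1 = "ACGT" * 5   # hypothetical 20-nt protospacer upstream of the target
guide_2 = "TGCA" * 5   # hypothetical 20-nt protospacer downstream of the target
toy_locus = "TTACG" + guide_1 + "AGG" + "CATCATCATCAT" + guide_2 + "TGG" + "AATTC"

edited_locus, deleted_bp = delete_between(toy_locus, guide_1, guide_2)
print(deleted_bp, edited_locus)
```

In practice, repair at the junction often introduces small indels and guides may target either strand, so a real design tool would need to handle both cases.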

Fig. 2

ncRNA knockout and knockdown using the CRISPR/Cas9 system. a miRNA cluster, single miRNA or other small RNA genes can be deleted using CRISPR/Cas9 with two sgRNAs, which flank the target region. b Knockout of lncRNAs can be achieved using the same approach. However, the deletion of a larger region is more difficult than the deletion of a smaller region. c Alternatively, promoters containing the transcription start site can be deleted to decrease lncRNA expression. d Catalytically inactive dCas9 fused with the KRAB repressor domain can be used to knockdown ncRNAs

Several methods have been used to avoid genetic alterations in the targeted region to obtain knockdown. For example, transcriptional repressors based on TALE proteins fused with the KRAB domain are utilized [154, 160]. The CRISPR interference system can also be used to decrease ncRNA expression. In this method, sgRNA targeting the 5′ region or gene promoter is coexpressed with catalytically inactive dCas9 protein [161]. The binding of this complex to the nontemplate DNA strand blocks the elongation of transcription or prevents initiation. To increase the efficiency of gene knockdown, dCas9 can be fused with the KRAB domain (Fig. 2d) [162].

Conclusion

PSCs are a unique model for studying early development, modelling hereditary diseases, and drug screening. PSCs differentiate into many cell types of the human body, which makes these cells indispensable for regenerative medicine. Knowledge of pluripotency regulation mechanisms is required for proper culture conditions and the efficient derivation of patient-specific iPSCs, which are suitable for subsequent differentiation and transplantation. High-throughput studies revealed that the pluripotency regulation network is complex, and ncRNAs play important roles in its maintenance. The first step is identifying ncRNA expression patterns by massively parallel sequencing. Large-scale loss-of-function studies are the next step, to reveal which ncRNAs are essential for pluripotency regulation. Finally, functional studies are required to understand the functions of individual ncRNAs. Currently, various classes of ncRNAs have been identified, and most of these molecules remain the “dark matter” of the genome. Future studies will shed light on the world of ncRNAs and their roles in pluripotency regulation.