Introduction

Stem cell biology is an emerging field, and understanding the molecular mechanisms underlying the self-renewal and differentiation capabilities of stem cells is vital for their subsequent biomedical applications. Stem cells, specifically Embryonic Stem Cells (ESCs), are primitive cells capable of differentiating into virtually all the cell types of the three germ layers (ectoderm, mesoderm, and endoderm). Therefore, these cells have a high therapeutic value in various diseases and genetic conditions. ESCs can be utilized for disease modeling and other applications, but their therapeutic use is limited by ethical concerns and by the risks associated with allogeneic transplantation. Hence, researchers sought alternatives, which eventually led to the concept of induced Pluripotent Stem Cells (iPSCs), where two independent groups reprogrammed human fibroblasts to a primitive ESC-like state using a specific set of transcription factors [1, 2]. These studies used partially different cocktails of transcription factors, namely OCT4, SOX2, KLF4, and c-MYC (abbreviated as OSKM; Yamanaka factors) [1, 3] and OCT4, SOX2, NANOG, and LIN28 (abbreviated as OSNL; Thomson factors) [2]. Since then, various stem cell-specific genes have been analyzed for their applicability and functionality in the reprogramming paradigm [4]. Such analyses provide new insights into reprogramming and may help generate clinical-grade iPSCs more efficiently. Zinc and SCAN domain containing 4 (ZSCAN4) is one such factor that enhances reprogramming efficiency when included with the original cocktail. This gene is expressed exclusively in the two-cell stage, in in vitro cultured ESCs, and during early embryonic development, where its expression is essential for embryo implantation [5, 6]. In addition, the expression of ZSCAN4 is crucial for telomere maintenance in both ESCs and iPSCs [6, 7], which prevents these cells from entering senescence and enables them to self-renew indefinitely. Moreover, the inclusion of ZSCAN4 during reprogramming resulted in iPSCs with better genomic stability [8, 9]. ZSCAN4 is also reported to have a role in maintaining the cancer stem cell phenotype and promoting the tumorigenic potential of these cells [10]. Hence, understanding the role and functional capability of ZSCAN4 will help us gain an in-depth understanding of the self-renewal and immortality of stem cells and cancer cells. Therefore, this review discusses in detail the role of ZSCAN4 in embryonic development, ESCs, iPSCs, and cancer as reported by studies to date.

ZSCAN4 Gene

ZSCAN4 is a protein-coding gene (Table 1) and is present only in mammals [5]. The mouse genome comprises nine members of this gene (Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f, and three pseudogenes), whereas the human genome contains only one member. In mice, all nine members of the gene have multi-exonal organization and are firmly grouped in an 850 Kb region on chromosome 7. Three out of these nine genes are pseudogenes, namely Zscan4-ps1, Zscan4-ps2, and Zscan4-ps3. Members of the mouse Zscan4 gene, Zscan4c, Zscan4d, and Zscan4f encode full-length open reading frames (ORFs) with 506 amino acids (aa) having more than 94% similarity and are jointly called Zscan4, indicating these genes are likely to share the same function [5, 11]. In humans, ZSCAN4 is present on chromosome 19 [5, 11].

Table 1 Physicochemical parameters of human ZSCAN4 sequence

ZSCAN4 Protein

Human ZSCAN4 protein has a length of 433 amino acids. Analysis of this sequence using the InterPro database [12, 13] revealed that the protein consists of two characteristic domains. The first is the SCAN domain in the N-terminal region of the protein (InterPro Id: IPR003309), which spans residues 44–126 and belongs to the SCAN domain superfamily (Fig. 1). SCAN acquired its name from the initials of four proteins that have this domain in common (SRE-ZBP, CTfin-51, AW-1, Number 18 cDNA) [14]. This domain is often referred to as the leucine-rich region [5, 11, 15]. In mice, Zscan4c, Zscan4d, and Zscan4f are all 506 aa in length, and their SCAN domain spans residues 37 to 119. The other two members, Zscan4b and Zscan4e, are 505 aa in length, and their SCAN domain spans residues 53 to 86. Little information is available about Zscan4a in the databases. The SCAN domain is highly conserved in vertebrates and carries a specific motif involved in protein–protein interaction [11, 16]. Deleting this domain diminishes the self-association of ZSCAN4, indicating that ZSCAN4 can form dimers and oligomers through self-interaction [17].

Fig. 1

Pictorial representation of domains present in human ZSCAN4 protein. The SCAN domain is present towards the N-terminal region of the protein and four zinc finger domains present towards the C-terminal end of the protein

The second domain is the zinc finger domain (four in total; InterPro Id: IPR013087), present in the C-terminal region of the human ZSCAN4 protein. This domain belongs to the zinc finger C2H2 superfamily (Fig. 1). In the case of mouse Zscan4c, Zscan4d, and Zscan4e, the first zinc finger motif is located at position 395–417 aa, the second at 424–446 aa, the third at 452–474 aa, and the fourth at 480–503 aa. The C2H2 motifs of the zinc finger domain are primarily involved in binding to specific DNA sequences and interacting with several other cellular factors to regulate the transcription of target genes [11]. Each finger is a short stretch of approximately 20–30 aa comprising two conserved cysteines and two conserved histidines. The zinc ion coordinated by these residues assists in proper folding into a compact structure. The conserved histidines lie in the α-helix, whereas the conserved cysteines lie in the β-turn [18]. Intriguingly, Zscan4f was found to have another isoform of 77 aa (UniProt ID: L7N266) [19]. Three other members of the mouse Zscan4 family (Zscan4a, Zscan4b, and Zscan4e) code for truncated proteins expressed at the basal level in the late two-cell stage and ESCs [5]. However, the exact function of these truncated proteins remains elusive. The DNA binding activity of the zinc finger domain is highly specific, a property that can be exploited to design zinc finger nucleases for genetic manipulation in various organisms [20, 21].

The three-dimensional and secondary structures of ZSCAN4 protein have not been reported to date. Therefore, the secondary structure was predicted using online bioinformatics tools, namely GOR4 [22], SOPMA [23], and PSIPRED [24, 25]. The three-dimensional structure (Fig. 2A) was obtained from the AlphaFold database [26, 27]. The predictions from all these servers indicated that the protein consists mainly of random coils and other structures, with a significant contribution of α-helices and a small amount of β-sheets (Fig. 2B). The presence of random coils or intrinsically disordered regions is a characteristic feature of transcription factors and is necessary for their functionality [28]. Moreover, analysis of the amino acid composition of ZSCAN4 protein reveals the presence of glutamic acid and aspartic acid in significantly larger amounts (Table 1), which might also be the reason for the dominance of random coils and other structures [29] in the predicted secondary structure of human ZSCAN4 protein.

Fig. 2

The three-dimensional structure and secondary structure content of human ZSCAN4 protein. A. The three-dimensional structure of the human ZSCAN4 protein was obtained from the AlphaFold database and visualized using BIOVIA Discovery Studio Visualizer. B. The proportions of the different secondary structures in human ZSCAN4 protein, as predicted by different web servers (GOR4, PSIPRED, SOPMA, AlphaFold)

To study the conservation of ZSCAN4 protein among species belonging to the Vertebrata sub-phylum, the orthologs of ZSCAN4 were obtained from the NCBI ortholog database. We found about 75 ZSCAN4 orthologs, which were subsequently aligned using the NCBI Constraint-based Multiple Alignment Tool (COBALT) [30]. The multiple sequence alignment was visualized using the WebLogo online tool [31, 32] (Fig. 3). Interestingly, the whole SCAN domain is not conserved. Instead, only a part at the beginning and a small part at the end of the domain are conserved among the orthologs of ZSCAN4 protein (indicated by a solid red line). Similarly, only certain parts of the four zinc finger domains are conserved among the orthologs (indicated by a red dotted line) (Fig. 3).
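The per-position conservation that a sequence logo displays can be approximated with a short calculation. The following is a minimal sketch, not the procedure used for Fig. 3: it assumes a 20-letter amino acid alphabet, ignores gaps, and omits the small-sample correction that logo tools may apply. The function name and the toy alignment are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content (in bits) of each alignment column.

    Assumptions of this sketch: gaps ('-') are ignored, all sequences are
    already aligned to the same length, and no small-sample correction
    is applied.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)  # value for a fully conserved column
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(max_bits - entropy)
    return scores


# Hypothetical toy alignment, not real ZSCAN4 sequence data.
toy_alignment = ["MKLA", "MKVA", "MRIA", "MK-A"]
scores = column_information(toy_alignment)
for position, bits in enumerate(scores, start=1):
    print(f"column {position}: {bits:.2f} bits")
```

Columns in which every ortholog carries the same residue reach the maximum value, whereas variable columns score lower, which corresponds to the taller and shorter letter stacks in a logo.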

Fig. 3

Conservation among the ZSCAN4 orthologs. The ZSCAN4 orthologs belonging to different species of the Vertebrata sub-phylum were obtained from the NCBI ortholog database, aligned using NCBI's COBALT online tool, and visualized using the WebLogo online server. The height of each amino acid corresponds to its conservation among the orthologs. The most conserved region in the SCAN domain is indicated by a solid red line and in the zinc finger domains by a red dotted line

Further, a phylogenetic tree was constructed by the neighbor-joining method with 1000 bootstrap iterations using Molecular Evolutionary Genetics Analysis (MEGA) version X software [33]. The formation of different clades represents evolutionary changes that took place among these species. The values at the nodes are bootstrap values, which give the percentage of bootstrap replicates in which that particular clade was recovered. Figure 4 shows that most of the bootstrap values are well above 60, indicating that the tree is well supported. The tree was inferred with Homo sapiens as the reference point. There are three distinct evolutionary clusters in the tree, denoted as A, B, and C (Fig. 4). Cluster A can be divided into three distinct groups denoted as A1, A2, and A3. The species in A1 belong to the order Carnivora, those in A2 are bats, and those in A3 are ungulates. Similarly, cluster B can be divided into two groups, B1 and B2. B1 consists of rodents and some primates, and B2 consists of primates (apes and monkeys). Homo sapiens falls under group B2. The closest species to humans is Gorilla gorilla. The ZSCAN4 protein family from mice falls under cluster C, which interestingly is entirely distinct and separated from the rest of the orthologs. In agreement, pairwise global alignment between ZSCAN4 proteins from mouse and humans reveals only 47% sequence similarity.
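To illustrate what a pairwise global alignment comparison involves, the sketch below implements the Needleman-Wunsch algorithm with a simple scoring scheme and reports percent identity. Several points are assumptions rather than details from the studies cited: the match, mismatch, and gap scores are arbitrary, no substitution matrix is used, and identity (exact matches) is reported instead of similarity, which would also count conservative substitutions. The input strings are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not those of any
    particular alignment server.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical short strings, not real ZSCAN4 sequences.
aligned_1, aligned_2 = global_align("ACGT", "AGT")
print(aligned_1)
print(aligned_2)
print(f"{percent_identity(aligned_1, aligned_2):.1f}% identity")
```

Because the alignment is global, both sequences are aligned end to end, so differences in length, such as the 433 aa human protein versus the 506 aa mouse proteins, lower the resulting percentage.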

Fig. 4

Phylogenetic relationship among the orthologs of ZSCAN4. The ZSCAN4 orthologs belonging to different species of the Vertebrata sub-phylum were obtained from the NCBI ortholog database and aligned using NCBI's COBALT online tool. Subsequently, the phylogenetic tree was constructed using MEGA X software by the neighbor-joining method with 1000 bootstrap iterations. The tree is grouped into three major clades according to class and order, each with its own subgroups. The numbers at the nodes represent the bootstrap values

Although the function of the majority of SCAN-Zinc finger proteins is obscure, certain SCAN-Zinc finger proteins (ZNF274, Zfp369, and Zfp496) have been implicated in the transcriptional regulation of specific growth factors (Nerve growth factor: Neurotrophin receptor-interacting factor) and some auxiliary genes (alpha2(XI) collagen gene, tyrosine hydroxylase gene, c-myb, myeloperoxidase, CD34, and LDOC1) associated with cell differentiation and survival [11]. Hence, ZSCAN4, one of the SCAN-Zinc finger proteins, is speculated to have a role in cell survival and differentiation [11]. Moreover, the half-life of ZSCAN4 protein is reported to be about 6 h in mouse ESCs [34] and about 8.3 h in cancer cells [35]. ZSCAN4 protein is also reported to be degraded through lysine 48-linked ubiquitination and the proteasome rather than by autophagy [35]. Because ZSCAN4 expression is tightly regulated, this degradation pathway is crucial for maintaining an optimal protein level inside the cell and for clearing the protein from the cell [35].
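The practical meaning of these half-lives can be shown with a simple calculation. The sketch below assumes first-order (exponential) decay once synthesis stops, which is a modeling assumption and not a result stated in the cited studies. The function name and the 12 h time point are hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of the initial protein pool left after `hours`.

    Assumes simple first-order decay with no new synthesis.
    """
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


HALF_LIFE_MESC_H = 6.0    # about 6 h in mouse ESCs [34]
HALF_LIFE_CANCER_H = 8.3  # about 8.3 h in cancer cells [35]

for label, half_life in (("mouse ESCs", HALF_LIFE_MESC_H),
                         ("cancer cells", HALF_LIFE_CANCER_H)):
    left = fraction_remaining(12.0, half_life)
    print(f"{label}: {left:.0%} of ZSCAN4 protein left after 12 h")
```

Under this assumption, two half-lives (12 h) leave a quarter of the protein in mouse ESCs, while a larger fraction persists in cancer cells because of the longer half-life.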

In addition, the protein interacting partners of ZSCAN4 protein were analyzed using the STRING v11 database [36] and are depicted in Fig. 5. DUX4, a protein essential for two-cell stage formation [37], was found to interact with ZSCAN4. It was reported that ZSCAN4, MBD3L2, and LEUTX are direct targets of DUX4 [38], and they were predicted to interact with one another. Moreover, DUX4 also regulates PRAMEF1 and TRIM43 [39], which were also predicted to interact with ZSCAN4. Mouse ESCs express ZSCAN4 as a transient burst from time to time, which is referred to as the Z4 event [5, 40]. TMEM92, one of the genes specifically upregulated during this Z4 event [40], is predicted to interact with ZSCAN4. Furthermore, mass spectrometry data suggest that KDM1A, a chromatin-repressing factor, interacts directly with ZSCAN4 to regulate certain heterochromatin areas [40]. Apart from these proteins, experimental evidence indicates that ZSCAN4 closely interacts with shelterin complex proteins, namely Trf1 and Rap1, which are speculated to work together with ZSCAN4 in regulating telomere length [41, 42].

Fig. 5

Protein interacting partners of human ZSCAN4. The proteins that are interacting or predicted to be interacting with human ZSCAN4 protein were obtained from STRING v11 database

Role of ZSCAN4 in Embryonic Development and Germ Cells

ZSCAN4 is exclusively expressed at the late two-cell stage of preimplantation embryos [5]. The Zscan4 transcripts are absent in oocytes, fertilized eggs, and other preimplantation stages, indicating their high stage specificity and comparatively short half-lives. Moreover, ZSCAN4 expression is not observed in blastocysts, including the inner cell mass (ICM), or in early blastocyst outgrowth. Approximately 90% of the Zscan4 transcripts at the two-cell stage of mouse embryos are Zscan4d. In contrast, Zhang and colleagues reported a spike in Zscan4 expression at the three- to four-cell stage of preimplantation development [43]. In humans, ZSCAN4 is expressed at the six- to eight-cell stage of embryos, concurrent with the major wave of zygotic genome activation (ZGA) [44].

In 2007, a study revealed that Zscan4 is expressed in ESCs [5]. The authors also showed that Zscan4 expression is absent in Trophoblast stem cells (TSCs), Neural stem/progenitor cells (NSCs), and Embryonal carcinoma cells (ECCs) such as P19 and F9 cells [5]. Interestingly, apart from ESCs, a study showed that ZSCAN4 is expressed in a tiny portion of the pancreas of adult mice [45]. In the human (adult) pancreas, a few ZSCAN4+ cells are present in the islets of Langerhans, ducts, oval-shaped cells (stellate cells), and acinar cells [45]. ZSCAN4+ cells rise drastically during acute pancreatitis, but once inflammation ceases, the cells revert to normalcy [45]. This results in a drop in ZSCAN4 levels [45], clearly indicating the expression of ZSCAN4 is strictly regulated.

In mouse embryonic development, the two-cell stage is one of the most important stages as the cell undergoes a drastic change in its epigenetic profile [40]. Furthermore, it coincides with ZGA, a transition event in which the maternally acquired transcripts are depleted and the zygotic genome is activated through de novo transcription [5, 46]. ZGA is one of the major and vital events in animal development, and a significant burst occurs solely in the late two-cell stage [46]. Many studies were performed to identify novel genes expressed during ZGA and specifically during the two-cell stage [5, 46, 47, 48]. Zscan4 was identified as one of these genes, along with others (Eif1a, U2afbp-rs, Hprt, Pdha1, Prps, Odc, and Cox7c), and is expressed solely during ZGA, indicating its utility as a potential marker for ZGA studies [5, 46, 47, 48].

To decipher the function of ZSCAN4 during the preimplantation stage of embryonic development, siRNA-mediated knockdown of ZSCAN4 was carried out. Knockdown revealed a delay in progression from the two-cell to the four-cell stage (Fig. 6). After the transient arrest, embryos attain the blastocyst stage. However, these blastocysts were functionally abnormal and thus failed to implant or proliferate in blastocyst outgrowth culture used for derivation of ESCs, pinpointing that ZSCAN4 is crucial for preimplantation development [5]. Further exploration of the functional aspect of ZSCAN4 uncovered its role in protection against DNA damage. Specifically, in the two-cell stage of mouse embryos, ZSCAN4 associates with microsatellite DNA (CA-repetitive sequences) and directly binds to nucleosome assembly [49]. Loss of ZSCAN4 in mouse embryos promotes DNA damage. Thus, ZSCAN4 acts as a microsatellite binding factor in a DNA sequence-dependent fashion and prevents cellular damage, which is triggered due to the increased genomic stress and transcriptional load during embryonic development [49].

Fig. 6

Expression of ZSCAN4 during embryonic development. In mouse, Zscan4 is expressed at the late 2-cell stage (highlighted in orange). Knockdown of Zscan4 delays the transition from the 2-cell stage to the 4-cell stage. In humans, ZSCAN4 is expressed at the 4–8 cell stage and is absent in blastocysts

ZSCAN4 protein is also expressed in mice at late meiotic prophase during gametogenesis (spermatogenesis and oogenesis). Furthermore, ZSCAN4 expression was evident in spermatocytes and Sertoli cells of testes, whereas, in females, it was observed in the germinal vesicle stage of oocytes. Thus, ZSCAN4 may have a pivotal role in germline formation [50].

Role of ZSCAN4 in ESCs

In mouse ESCs, ZSCAN4 exhibits a highly heterogeneous "spotted" (spot-in-colony) pattern of expression [51]. Moreover, the ZSCAN4 gene is expressed in only about 5% of the cells in undifferentiated mESC colonies [5]. As previously described, its expression is tightly regulated such that, daily, only 3% of ESCs switch from the ZSCAN4− to the ZSCAN4+ state, while 47% of ZSCAN4+ ESCs revert to the ZSCAN4− state, resulting in about 5% of ESCs expressing ZSCAN4 at any given point of time [6]. Furthermore, Zscan4d expression is switched off right after the late two-cell stage, whereas the expression of Zscan4c and, to a small extent, Zscan4f is switched on during blastocyst outgrowth culture. Thus, ZSCAN4 might be a vital player in deriving ESCs (in vitro) from the inner cell mass [5, 51]. Notably, all cells trigger ZSCAN4 expression at least once by the end of nine passages [5, 51].
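The daily switching rates quoted above can be checked against the reported steady-state fraction with a two-state model. The sketch below is an illustration under simplifying assumptions that are not taken from the cited work: discrete daily steps, constant switching probabilities, and equal proliferation of both states. The function names are hypothetical.

```python
def steady_state_positive_fraction(p_on, p_off):
    """Equilibrium fraction of ZSCAN4+ cells in a two-state switching model.

    p_on:  daily probability that a ZSCAN4- cell becomes ZSCAN4+
    p_off: daily probability that a ZSCAN4+ cell reverts to ZSCAN4-
    """
    return p_on / (p_on + p_off)


def simulate_positive_fraction(p_on, p_off, days, start_fraction=0.0):
    """Iterate the model day by day from a chosen starting fraction."""
    fraction = start_fraction
    for _ in range(days):
        fraction = fraction * (1.0 - p_off) + (1.0 - fraction) * p_on
    return fraction


P_ON = 0.03   # 3% of negative cells switch on per day [6]
P_OFF = 0.47  # 47% of positive cells switch off per day [6]

equilibrium = steady_state_positive_fraction(P_ON, P_OFF)
after_30_days = simulate_positive_fraction(P_ON, P_OFF, days=30)
print(f"steady-state ZSCAN4+ fraction: {equilibrium:.1%}")
print(f"fraction after 30 simulated days: {after_30_days:.1%}")
```

With these rates the model settles at about 6% ZSCAN4+ cells, which is close to the roughly 5% observed at any given time.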

A vital role of ZSCAN4 in ESCs is to derepress constitutive and facultative heterochromatin globally (Fig. 7). As mentioned earlier, mouse ESCs fluctuate between two transient states, i.e., ZSCAN4− and ZSCAN4+. The switch from ZSCAN4− to ZSCAN4+ causes immediate heterochromatin derepression that eventually leads to clustering of pericentromeric heterochromatin around the nucleolus [40]. Moreover, this transient burst of ZSCAN4 transcription also leads to epigenetic changes, especially increased levels of H3K27ac and reduced levels of DNA methylation [40]. These alterations derepress heterochromatin, making it more accessible for transcription [40]. Notably, ZSCAN4 forms complexes with activating chromatin remodeling complexes such as SWI/SNF and HATs and with repressing chromatin remodeling complexes such as LSD1/KDM1A, HDAC1, HDAC2, and NuRD, which synchronously come together on heterochromatin [40, 52]. The formation of these multiprotein complexes leads to rapid derepression and repression of heterochromatin. The transient expression of ZSCAN4 induces transcription of genes owing to the active conformation of heterochromatin, which can be considered a normal part of mammalian development and tissue homeostasis, after which the heterochromatin immediately reverts to its silent conformation [40]. More importantly, some of the epigenetic changes that occur due to the derepression of heterochromatin are harmful to cells, so the cells try to negate the changes that affect their survival. This explains why ZSCAN4 is expressed transiently in ESCs rather than continuously [53]. Moreover, ZSCAN4 can also form complexes with KAP1, a central protein involved in regulating heterochromatin structure [40]. These data indicate that ZSCAN4+ cells continuously undergo chromatin remodeling and that ZSCAN4 forms complexes with chromatin remodelers/epigenetic modifiers. Thus, transient expression of ZSCAN4 is crucial for the epigenetic regulation of ESCs.

Fig. 7

Role of ZSCAN4 in ESCs. ZSCAN4 is expressed transiently in ESCs. It has multidimensional roles such as repression of protein translation, elongation of telomeres, remodeling of chromatin structures, enhancing cell survival after DNA damage, and enhancing the self-renewal and pluripotency of ESCs

Another critical role of ZSCAN4 is the inhibition of global translation (Fig. 7). In two-cell stage embryos, expression of eukaryotic translation initiation factor 1A (Eif1a) increased together with the expression of ZSCAN4. Therefore, ZSCAN4 was thought to be involved in translation [46, 54, 55]. Interestingly, another study reported that it is not the overexpression of Eif1a but that of Eif1a-like genes that is responsible for global repression of protein translation. Moreover, Eif1a-like genes are highly expressed in two-cell stage embryos and during the ZSCAN4+ state of ESCs [56]. Notably, suppression of protein synthesis was observed in the ZSCAN4+ state of ESCs. This indicates that Eif1a-like genes and ZSCAN4 may play a role in global translation repression in ESCs.

A study reported that the two-cell stage of mouse embryos expresses both MERVL (murine endogenous retroviruses with leucine tRNA primer) and ZSCAN4 [57, 58]. A similar scenario is also observed in the case of mESCs. In MERVL+ZSCAN4+ cells, demethylation of DNA occurs globally. As soon as cells exit this transient phase, the methylation is restored. However, the genomic imprints are lost, causing inhibition of translation of DNA methyltransferases (Dnmts). These studies emphasize that ZSCAN4 plays a vital role in repressing global protein synthesis. However, further exploration is required to understand the other cellular and metabolic changes that occur during this process [40, 57, 58, 59].

ESCs have two remarkable characteristics, namely self-renewal and pluripotency [60]. In ESCs, overexpression of Zscan4c enhances self-renewal. Also, ZSCAN4 activation is pivotal, as ESCs tend to lose their unlimited proliferation ability upon its depletion [6]. It is reported that ESCs regulate self-renewal via the glycogen synthase kinase (GSK) and phosphatidylinositol 3-kinase (PI3K) pathways [61, 62]. In accordance with this, ZSCAN4 expression was enhanced by inhibition of GSK. Importantly, ZSCAN4 was recognized as a downstream target of PI3K. In addition, P110α is a pivotal catalytic isoform of PI3K, and inhibition of P110α downregulates the expression of ZSCAN4 [19, 34]. Additionally, vitamin A was reported to upregulate the expression of ZSCAN4 and increase self-renewal in mESCs through the activation of the PI3K/AKT cascade [63, 64, 65]. All these data indicate that activation of the PI3K cascade can activate the expression of ZSCAN4 [19, 34]. Zscan4c also plays a crucial role in maintaining the pluripotency of ESCs. During long-term culture of ESCs, the potential to develop into an embryo declines gradually [66]. Hence, rejuvenation of ESCs is required, which can be achieved by manipulating ZSCAN4 expression. In ESCs, repeated activation of ZSCAN4 or exposure to ZSCAN4 protein increases the frequency of the ZSCAN4+ state and raises developmental potential, eventually improving the quality of ESCs [66]. These studies suggest that ZSCAN4 has a significant role in maintaining self-renewal and pluripotency in mESCs (Fig. 7).

One of the main reasons behind the transient expression of ZSCAN4 is that its continuous expression will lead to hyper-recombination of telomeres, which will eventually cause cellular senescence over time [67]. In ESCs, ZSCAN4 is involved in stabilizing the genome and elongating telomeres [6]. Telomeres are heterochromatin structures consisting of a repetitive stretch of 5'-TTAGGG-3' (in humans) and specific proteins that protect the chromosome ends from damage and aging [68]. An appropriate telomere length is indispensable for the self-renewal and pluripotency of ESCs [17]. The transient ZSCAN4+ state of ESCs is associated with rapid telomere extension by telomere recombination and with upregulation of genes that are specific to meiotic homologous recombination. The products of these genes co-localize with ZSCAN4 on telomeres. In ESCs, knockdown of ZSCAN4 causes telomere shortening, karyotypic abnormalities, spontaneous sister chromatid exchange (SCE), and reduced cell proliferation until cells reach senescence [6]. The impact of knockdown is observed gradually and ultimately leads to apoptosis and loss of cell proliferation after seven to eight passages. Alternative lengthening of telomeres (ALT) is a mechanism for telomere elongation that is independent of telomerase [69] and depends solely on homologous recombination [70]. Another study reported that Ddb1- and Cul4-associated factor 11 (Dcaf11) is required to activate ZSCAN4 in early embryos and ESCs [71]. Interestingly, Dcaf11 is involved in ALT-mediated telomere lengthening [71]. Hence, ZSCAN4 plays a pivotal role in telomere elongation, most likely through the ALT-mediated pathway.

Furthermore, in mESCs, ZSCAN4 inhibits DNA methylation at the global level to promote telomere elongation (Fig. 7) [17]. It directly represses Dnmt1 and Uhrf1, which are major components of the DNA methylation complex. Mechanistically, ZSCAN4 is required to recruit Dnmt1 and Uhrf1 and to promote the degradation of both components via the ubiquitination pathway. Blocking DNA demethylation prevents the telomere elongation that is directly linked to the expression of ZSCAN4 [17]. As stated before, mESCs undergo drastic epigenetic alterations to control transient activation and repression of specific heterochromatin regions, a process that might be an inherent part of a unique mechanism for the maintenance of genome stability [40]. Hence, ZSCAN4 plays a crucial role in telomere elongation and genome stability.

Intriguingly, the lengthening of telomeres by ZSCAN4 follows the ALT mechanism, which is independent of telomerase activity, as mentioned earlier [69]. It is reported that ZSCAN4 plays a pivotal role in the recruitment of other proteins (SPO11, DMC1, SMC1β, and TRF1) to chromatin to facilitate telomere recombination [6]. Furthermore, karyotypic abnormalities such as random chromosome fusions and deletions were observed solely due to the knockdown of ZSCAN4. In ESCs, ZSCAN4 decreases the chances of non-telomeric SCE and inhibits spontaneous SCE. Moreover, overexpression of ZSCAN4 induces homologous recombination of genes associated with meiosis. Further, SPO11, an enzyme that generates double-strand DNA breaks (DSBs), and DMC1, a RecA-homolog recombinase, are required for meiotic recombination [72, 73, 74]. It has been reported that ZSCAN4 foci, together with SPO11 and DMC1, are located on telomeres. In addition, ZSCAN4 foci and SPO11 are co-localized in the G2 phase of the cell cycle. Taken together, ZSCAN4 expression in ESC cultures helps prevent cell crisis and maintain a normal karyotype even after multiple cell divisions.

Similar to ESCs, parthenogenetic embryonic stem cells (pESCs) also showed telomere elongation with increased expression of ZSCAN4. pESCs are generated from parthenogenetically activated oocytes [75]. On the other hand, in mESC cultures, feeders such as mouse embryonic fibroblasts (MEFs) promote the expression of ZSCAN4 by the production of Fst1 and BMP4. This indicates a role for feeders in maintaining telomere length and genomic stability and, eventually, the self-renewal and pluripotency of ESCs [76].

Under feeder-free conditions, mouse ESCs express ZSCAN4 in the presence of LIF and serum. However, ZSCAN4 expression drops drastically around day four after the induction of differentiation. ZSCAN10 then regulates the expression of ZSCAN4 and several other genes crucial for differentiation in ESCs [43]. Both ZSCAN4 and ZSCAN10 share a common domain and regulate pluripotency in mouse and human ESCs [43, 77, 78]. It is reported that retinoic acid can simultaneously induce differentiation and upregulate ZSCAN4 expression, which contrasts with previous reports [79]. Taken together, the induction of differentiation in ESC culture can upregulate or downregulate the expression of ZSCAN4, depending on the differentiation-inducing agent used, and this requires further detailed investigation.

The identification of interacting partners can be beneficial in deciphering the function associated with ZSCAN4. Firstly, Lysine-specific histone demethylase 1A (LSD1) plays a crucial role in maintaining the right balance of the epigenetic profile, thereby enhancing the interaction between ZSCAN4 and other binding partners. Right before telomere elongation, heterochromatic marks at telomeres are erased by LSD1 [34, 77]. Also, C-terminal binding protein 2 (Ctbp2) interacts with ZSCAN4 through the Ctbp-binding PXDLS motifs that are well established in the ZSCAN4 family and thus can act as a transcriptional repressor [34, 77, 80]. In addition, Histidine triad nucleotide-binding protein 1 (HINT1) associates with ZSCAN4 and plays a crucial role in cell survival. Oxidative stress and DNA-damaging agents upregulate the expression of ZSCAN4. Thus, the interaction between HINT1 and ZSCAN4 protects the cell from damage, increasing cellular viability [19]. In ESCs, TBX3 is also involved in the upregulation of the ZSCAN4+ state. Also, it is reported that TBX3 and ZSCAN4 are associated with self-renewal and telomere maintenance in ESCs [81]. Hence, ZSCAN4 is reported to interact with many partners such as LSD1, Ctbp2, HINT1, and TBX3 to maintain the self-renewal of ESCs and cell viability.

Interestingly, the expression of ZSCAN4 is confined to a subpopulation of ESCs [82]. The expression pattern of ZSCAN4 inside a stem cell colony is uneven and appears towards the center of the colony. Furthermore, cells expressing ZSCAN4 tend to form clumps [82]. The cell location and its pattern inside a colony can provide a hint for better interpretation of cell differentiation, meta-stable profile of cells, and cell analysis. This will further help understand the biological significance of ZSCAN4+ ESCs exhibiting specific pluripotent-like features [82]. Furthermore, this setup can help decipher the role of ZSCAN4 in the process of morphogenesis. Thus, ZSCAN4 has various functions in ESCs, and more exploration is required to attain a deeper understanding of its functionality.

Lastly, Zscan4 is also involved in cell survival and responds to DNA damage in ESCs (Fig. 7). The addition of zeocin or cisplatin (DNA damage-inducing agents) is reported to increase the expression of Zscan4 in a dose-dependent fashion [34]. These two compounds induce double-stranded DNA breaks, leading to cell cycle arrest in the G2/M phase. In addition, expression of Zscan4 is increased in the late S-phase/early G2 phase of the cell cycle, pointing to a selective function for Zscan4 during the G2 checkpoint. Altogether, Zscan4 responds to DNA damage in ESCs and aids in cell survival.

Role of ZSCAN4 in iPSCs

Studies have reported that ZSCAN4 has a crucial role in reprogramming and maintaining iPSCs (Fig. 8). Similar to ESCs, ZSCAN4 is involved in stabilizing and maintaining the length of telomeres in iPSCs [7]. As mentioned earlier, a telomere is a short, repeating DNA sequence at the end of each chromosome having a sequence of 5’-TTAGGG-3' (in humans) [68]. The telomere helps in chromosomal stability and prevents the ends of the chromosome from fusing together [83]. Moreover, the telomere shortens with each replicative cycle because of the semi-conservative mechanism of replication [84]. As the telomeres shorten, the cells proceed towards senescence. Since stem cells can self-renew indefinitely, their telomere length has to be regulated [85]. Primarily, in iPSCs, the length of telomeres is regulated either by telomerase-dependent or -independent mechanisms [86]. Telomerase is a ribonucleoprotein that has two functional components, namely telomerase reverse transcriptase (tert) and telomerase RNA component (terc) [87, 88]. The telomerase-independent mechanism is called alternate lengthening of telomere (ALT), which involves recombination [89]. The transcription factor ZSCAN4 is reported to be one of the members involved in the ALT mechanism of telomere elongation [6, 7]. ZSCAN4 regulates the telomere by a recombination mechanism called telomere sister chromatid exchange (T-SCE) [6, 7]. Studies show that telomerase is dispensable for reprogramming but is essential for maintaining iPSCs [7, 90]. In accordance with this, even in the absence of telomerase, fibroblasts were reprogrammed to iPSCs successfully, with pluripotency characteristics [7]. Interestingly, in the absence of telomerase, the expression of ZSCAN4 was upregulated in iPSCs, indicating that the ALT mechanism was activated to regulate telomeres [7]. 
This activation of ZSCAN4 for telomere regulation can also be achieved by crotonylation of histones [90] or by Dcaf11, as mentioned earlier [71], which further improves the quality of iPSCs. Moreover, knockdown of telomerase and ZSCAN4 in iPSCs resulted in progressively shorter telomeres with increasing passages compared with wild-type iPSCs [7]. This telomere recombination by ZSCAN4 requires relatively relaxed chromatin and global DNA demethylation [7, 58]. Similar to mESCs, during reprogramming, global DNA demethylation is brought about by the downregulation of Dnmts following activation of ZSCAN4 and the MERVL transcriptional network [7, 58]. Moreover, studies show that ZSCAN4 interacts with Tet2 (a protein belonging to the TET family) and promotes DNA demethylation, thereby increasing ZSCAN4-mediated T-SCE [91, 92]. This ZSCAN4-Tet2 interaction is also found to regulate proteasome activity and the cellular metabolic machinery, shifting it from oxidative phosphorylation to glycolysis, which is crucial for somatic cell reprogramming to iPSCs [92, 93].

Fig. 8

Role of ZSCAN4 in iPSCs. Advantages of including ZSCAN4 in the reprogramming cocktail to generate iPSCs. Inclusion of ZSCAN4 along with the core reprogramming factors results in enhanced reprogramming efficiency and fewer single nucleotide polymorphisms in the resulting iPSCs

Another significant role of ZSCAN4 is enhancing the efficiency of reprogramming to iPSCs. Studies show that the inclusion of ZSCAN4 in the reprogramming cocktail, along with Yamanaka factors (OSKM), improved the reprogramming efficiency significantly [8, 9]. Also, treatment of MEFs with cell extracts of mouse ESCs overexpressing ZSCAN4 enhanced the efficiency of reprogramming [94]. Moreover, compared to OSKM-mediated iPSCs, ZSCAN4 led to increased genome stability, longer telomeres, better karyotypes, and higher-quality iPSCs, as deduced by tetraploid complementation assays (TCA) [8, 9]. Also, fewer single nucleotide variations were reported in iPSCs when ZSCAN4 was included in the reprogramming cocktail [95]. Telomere regulation is vital for generating quality iPSCs that give a higher percentage of live-born pups in TCA. The telomerase-dependent regulation mechanism is slower and requires many cell divisions [96]. The inclusion of ZSCAN4 enhances the quality of iPSCs by regulating telomere length rapidly, giving rise to a higher percentage of live-born pups in TCA [8]. Furthermore, during OSKM-mediated reprogramming, the expression of ZSCAN4 is observed in a fraction of cells only during the later stages of reprogramming [8, 9]. In contrast to other factors that require 8–10 days of forced expression, expression of ZSCAN4 is only needed during the initial stages of reprogramming (first 4 or 7 days) to generate iPSC colonies efficiently [9]. This activation of ZSCAN4 can also be brought about by the inclusion of small molecules in the reprogramming cocktail [97]. The oncogenic factor c-MYC in the Yamanaka cocktail can be replaced with ZSCAN4 without compromising the reprogramming efficiency [9]. Moreover, the inclusion of ZSCAN4 activated early embryonic genes, namely Pramel6, Trim43a, Trim43b, Tcstv1, and many more, during reprogramming [9]. An efficient regulatory system is essential at the gene and protein levels for this tight regulation to be possible.
At the gene level, ZSCAN4 is negatively regulated by Rif1 [67] and at the protein level by E3 ubiquitin ligase RNF20 [35]. Thus, ZSCAN4 assists in telomere regulation in a telomerase-independent manner and helps generate quality, genetically stable iPSCs during reprogramming with high efficiency.

Role of ZSCAN4 in Cancer

The role of ZSCAN4 in maintaining telomeres, which paves the way for the indefinite self-renewal capability of stem cells, is emphasized in earlier sections. Cancer cells are similar to stem cells in this respect. Therefore, shortly after the role of ZSCAN4 was identified, researchers were keen to deduce its role in the immortality of cancer cells. Various functions of ZSCAN4 in cancer are depicted in Fig. 9. The first study, in 2014, examined the expression of ZSCAN4 in telomerase-positive and ALT cancer cells and found that ZSCAN4 has a role in telomere maintenance of cancer cells in a telomerase-independent manner [41]. It has been further confirmed that ZSCAN4 has a pivotal role in telomere lengthening in non-seminoma testicular germ cell tumors [98], most likely through an indirect interaction with Trf1 [42]. Moreover, it has been found that ZSCAN4 interacts with two components of the shelterin complex, Rap1 and Trf1, to perform this function [41, 42]. Shelterin is a six-subunit protein complex that plays a significant role in telomere maintenance [83]. Furthermore, it has been found that the expression of ZSCAN4 is upregulated in response to DNA damage induced by genotoxic chemotherapeutic drugs in cancer cells [99]. This upregulation is also essential for sustaining the senescence-associated secretory phenotype, which does not favor the long-term survival of cancer patients [99]. In addition, ZSCAN4 alters the epigenetic state and facilitates chromatin remodeling in cancer stem cells (CSCs) to promote the CSC phenotype [10]. It also enables spheroid formation by CSCs and enhances the expression of some CSC markers (BMI1 and CD44), suggesting the use of ZSCAN4 as a potential marker for CSCs [10]. Moreover, ZSCAN4 also enhances the expression of certain pluripotency-associated genes in cancer cells, namely OCT4, SOX2, KLF4, and NANOG, by facilitating functional hyperacetylation of histone H3 [10].
All these studies suggest that ZSCAN4 has a prominent role in maintaining the tumorigenic population of CSCs and can be a likely therapeutic target in cancer.

Fig. 9

Role of ZSCAN4 in cancer. Various roles of ZSCAN4 in cancer cells. ZSCAN4 is involved in regulating telomere length in cancer by interacting with two components of the shelterin complex, Trf1 and Rap1. Apart from this, ZSCAN4 is also reported to enhance spheroid formation in cancer stem cells and to regulate the DNA damage response (the three-dimensional structures of Trf1, Rap1, and ZSCAN4 shown in the figure were obtained from the AlphaFold database)

Conclusion and Future Perspectives

The ZSCAN4 gene is expressed during the first wave of zygotic genome activation, and its expression is confined to mammals. It is a crucial regulator involved in gene activation waves during preimplantation. Therefore, it can serve as an invaluable resource to the scientific community endeavoring to understand the earliest stages of mammalian life. Further, the impact of ZSCAN4 on various developmental disorders should be investigated to strengthen our understanding. In mice, ZSCAN4 is expressed at the late two-cell stage, whereas in humans it is expressed during the four- to eight-cell stage of embryonic development. Moreover, its expression is transient and marked by an exceptionally high peak followed by a drastic decline, indicating that its expression is tightly regulated. Functionally, it has multidimensional roles in the regulation of transcription, translational inhibition, chromatin remodeling, reprogramming enhancement, pluripotency preservation, telomere maintenance, genomic stabilization, cell differentiation, and survival in stem cells. ZSCAN4 exhibits a typical mosaic expression in mouse ESCs. Although the role of mouse ZSCAN4 is well established, the significance of its expression in human cells needs further exploration to attain a deeper understanding of its functionality. This transcription factor also plays a vital role in reprogramming and the maintenance of iPSCs. Moreover, ZSCAN4 can act as a potent reprogramming enhancer. This will help the scientific community overcome reprogramming roadblocks [100] and generate clinical-grade iPSCs more efficiently from a variety of somatic cell sources [101, 102] using safe reprogramming methods [103, 104, 105].

Additionally, it can replace c-MYC during iPSC generation and improve the quality of the iPSCs thus formed. Furthermore, ZSCAN4 also plays a pivotal role in regulating cancer stemness. Hence, it can be considered a potential therapeutic target, and its regulation and turnover in cancer cells remain to be deciphered. Additionally, online bioinformatics tools can be used to predict its structure and explore its gene regulatory networks. These will help the research community determine its interacting partners and elucidate their biological roles. In this review, the role of ZSCAN4 in embryonic development, ESCs, iPSCs, and cancer has been highlighted, which should help the implementation of ZSCAN4 in various fields such as telomere biology, cellular reprogramming, stem cell biology, and cancer.