Introduction

Epigenetics is commonly defined as the study of genomic processes that regulate gene expression without changing the DNA sequence. Holliday hypothesized that epigenetic states are heritable through mitosis and/or meiosis, again without a change in DNA sequence. Meiosis can correct faulty DNA methylation, but certain patterns are still passed down to offspring [1]. In general, epigenetic events include DNA methylation, histone modification, the readout of histone marks, chromatin remodeling, and the effects of noncoding RNA. The epigenome cooperates with other regulatory factors, such as transcription factors and noncoding RNAs, to activate or suppress genes and thereby coordinate different biological activities. Cellular signaling networks and external stimuli can potentially alter epigenetic states, and these effects may be transient or long-lasting. Given the importance of epigenetics for cell function, a better understanding of both normal and pathological epigenetic processes can clarify how diseases, including cancer, arise and can inform future therapeutic approaches [2].

Cancer results from a complex interplay among accumulated genetic mutations, epigenetic alterations, and environmental factors. A wealth of research has been dedicated to unraveling the genomic profile of malignancies, ranging from oncogene-driven signaling pathways to mutation patterns across diverse cancer subtypes. Unlike genetic mutations, which change the DNA sequence, epigenetic modifications alter gene expression without permanently modifying the underlying genomic code. These epigenetic changes are particularly favored within cancer cells because they are reversible and rapidly adaptable, which also makes them a dynamic avenue for therapeutic intervention [3].

CTCF, formally named CCCTC-binding factor, is a versatile zinc finger protein present in nearly all vertebrate tissues. It was first recognized as a negative regulator of the c-Myc gene [4, 5]. CTCF’s functionality stems from its 11 zinc fingers, which enable it to bind a variety of evolutionarily conserved DNA sites [6] (see Fig. 1). While CTCF is distributed throughout the genome, it predominantly resides within intergenic regions [7]. It can also bind less conserved DNA sequences, which led to the discovery of cell-specific patterns of CTCF binding, frequently located within intronic regions [8]. As the most significant insulator protein in vertebrates, CTCF serves as a pivotal regulator of chromatin architecture, influencing an array of epigenetic and molecular processes [9]. Its architectural role enables CTCF to act as either a transcriptional repressor or activator, while also functioning as a chromatin insulator that directly mediates interactions between enhancers, silencers, and promoters [6, 10]. CTCF also plays pivotal roles in gene imprinting, X chromosome inactivation, and safeguarding unmethylated regions throughout the genome [11, 12]. Furthermore, CTCF orchestrates the creation and maintenance of topologically associating domains (TADs), which are genomic regions characterized by enhanced intradomain interactions [13]. CTCF also facilitates the formation of chromatin loops, a consequence of CTCF-bound sites interacting across different genomic locations within a TAD [14, 15]. Notably, CTCF-driven loop formation depends on the convergent orientation of its binding sites, as divergent sites lack this loop-generating capability [16].
CTCF also collaborates with cohesin to ensure the stable maintenance of these loops, albeit with distinct roles in orchestrating chromatin organization [17]. Beyond its fundamental role in genome architecture, CTCF’s involvement extends to disease pathogenesis. Aberrant CTCF binding has been implicated in various cancers, including leukemia [18], gastrointestinal cancer [19], lung cancer [20], cervical carcinoma [21], and others. Importantly, CTCF’s impact on disease transcends mere gene regulation; it serves as a guardian against DNA methylation and the propagation of repressive histone marks in tumor suppressor gene promoter regions. Loss of CTCF binding can result in epigenetic silencing, further emphasizing its critical role in maintaining genomic integrity and gene expression [22].
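
The orientation rule described above can be illustrated with a minimal sketch. It assumes a simplified model in which each CTCF motif has a position and a strand, and a pair counts as convergent when the upstream motif lies on the forward strand and the downstream motif on the reverse strand, so that the two motifs point toward each other. The `CtcfSite` class and the function names are hypothetical and introduced only for illustration; real loop calling relies on experimental contact data rather than on motif orientation alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CtcfSite:
    """A CTCF motif occurrence (illustrative fields, not a standard format)."""
    position: int  # genomic coordinate on a single chromosome
    strand: str    # '+' for a forward-strand motif, '-' for a reverse-strand motif


def orientation(a: CtcfSite, b: CtcfSite) -> str:
    """Classify a pair of sites by the relative orientation of their motifs."""
    left, right = sorted((a, b), key=lambda site: site.position)
    pair = left.strand + right.strand
    return {
        "+-": "convergent",  # motifs point toward each other
        "-+": "divergent",   # motifs point away from each other
        "++": "tandem",      # both motifs point the same way
        "--": "tandem",
    }[pair]


def can_anchor_loop(a: CtcfSite, b: CtcfSite) -> bool:
    """Simplifying assumption: only convergent pairs are treated as loop anchors."""
    return orientation(a, b) == "convergent"
```

For example, `can_anchor_loop(CtcfSite(100, '+'), CtcfSite(500, '-'))` is true under this model, whereas swapping the two strands gives a divergent pair that is rejected.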

Fig. 1. Schematic structure of CTCF

CTCF plays an important role in the formation and maintenance of chromatin architecture. CTCF binding sites are frequently found at chromatin domain boundaries, such as those separating active and repressive regions. By binding to these locations, CTCF maintains the three-dimensional structure of the genome, which helps regulate gene expression and insulate genomic regions [6]. CTCF has also been linked to the regulation of DNA methylation, a critical epigenetic modification. Studies have found CTCF binding sites to be particularly susceptible to changes in DNA methylation. Loss of CTCF binding can result in DNA hypermethylation at CTCF sites, which can have serious consequences for gene expression and cellular function [23]. Furthermore, CTCF has been linked to genomic imprinting and X-chromosome inactivation (XCI), both of which are necessary for normal development and gene regulation. By binding to imprinted regions and participating in the formation of chromatin loops, it helps establish and maintain parent-specific allele expression patterns [6].

CTCF acts as a boundary element by creating insulator regions that demarcate chromatin domains and prevent the spread of epigenetic modifications. This role is well established in the literature, including a study by Rahme et al. [24], which demonstrated its importance in defining the 3D structure of the genome and regulating gene expression by establishing boundaries between functionally distinct genomic regions.

Epigenetic mechanisms in cancer

DNA methylation, histone modifications, non-coding RNA regulation, and chromatin remodeling are among the epigenetic mechanisms involved in cancer [25] (Fig. 2). By regulating gene expression and modifying chromatin structure, these mechanisms play critical roles in the onset and progression of cancer.

Fig. 2. Epigenetic mechanisms in cancer

DNA methylation and demethylation are important regulators of gene expression. DNA methylation is the addition of a methyl group to DNA, particularly at cytosine residues in CpG dinucleotides. Demethylation, conversely, is the removal of methyl groups from DNA. DNA methylation patterns can have a significant impact on gene expression. In general, methylation at gene promoter regions is linked to gene repression or silencing. Methylation makes a gene less accessible to the transcriptional machinery, resulting in lower expression. This occurs because methylated DNA can recruit proteins that modify histones, which leads to chromatin condensation and restricted access for transcription factors [26]. Demethylation, on the other hand, can result in gene activation or overexpression. Active DNA demethylation mechanisms remove methyl groups from specific regions, enabling genes to be transcribed.
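
As a concrete illustration of where methylation occurs, the following sketch locates CpG dinucleotides in a sequence and reports what fraction of them is methylated. The function names, the example sequence, and the set of methylated positions are all hypothetical; in practice methylation calls come from assays such as bisulfite sequencing, and this sketch only shows the bookkeeping under that assumption.

```python
def cpg_positions(seq: str) -> list:
    """Return the 0-based index of the cytosine in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


def methylated_fraction(seq: str, methylated: set) -> float:
    """Fraction of CpG cytosines in `seq` whose index appears in `methylated`.

    `methylated` is a hypothetical set of 0-based positions reported as
    methylated; positions that are not CpG cytosines are ignored.
    """
    sites = cpg_positions(seq)
    if not sites:
        return 0.0
    return sum(1 for i in sites if i in methylated) / len(sites)


# Hypothetical promoter fragment with three CpG sites (indices 1, 4, and 7).
promoter = "ACGTCGGCG"
```

Here `cpg_positions(promoter)` finds three CpG sites, and `methylated_fraction(promoter, {1, 7})` reports that two of the three are methylated.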

DNA demethylation proceeds by several routes, including passive demethylation, in which methylation is lost because newly synthesized DNA is not remethylated, and active demethylation, which involves enzymatic activities [27]. DNA methylation and demethylation are notably important in developmental processes, tissue-specific gene regulation, and disease. DNA methylation patterns differ between cell types and contribute to cell differentiation and the establishment of cell identity. Aberrant DNA methylation patterns have been linked to a variety of disorders, including cancer, where they can contribute to tumor suppressor gene silencing or oncogene activation.
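
Passive demethylation can be pictured as replication-dependent dilution. The sketch below assumes a deliberately simple model that is not taken from the cited studies: at each cell division the parental strand keeps its methyl marks, while the newly synthesized strand is remethylated with probability `maintenance_efficiency`, so the average methylation level is multiplied by (1 + efficiency) / 2 per division. The function and parameter names are hypothetical.

```python
def methylation_after_divisions(initial: float,
                                maintenance_efficiency: float,
                                divisions: int) -> float:
    """Average methylation level after a number of cell divisions.

    Assumed model: each division multiplies the level by
    (1 + maintenance_efficiency) / 2, where the efficiency is the chance
    that the new strand is remethylated (1.0 = perfect maintenance,
    0.0 = no maintenance, i.e. purely passive loss).
    """
    if not 0.0 <= initial <= 1.0:
        raise ValueError("initial must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")

    level = initial
    for _ in range(divisions):
        level *= (1.0 + maintenance_efficiency) / 2.0
    return level
```

With no maintenance, a fully methylated site falls to one eighth of its starting level after three divisions, whereas perfect maintenance leaves the level unchanged. Active demethylation is not covered by this model, since it does not depend on replication.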

A fundamental characteristic of DNA methylation is its reversible and dynamic nature, which makes it a pivotal feature of epigenetic regulation and allows organisms to respond to both endogenous and external cues. The dynamic modulation of DNA methylation levels is mediated by several mechanisms [28]. In plants, the RNA-directed DNA methylation (RdDM) pathway establishes DNA methylation de novo. Once established, methylation is sustained by routes that depend on the sequence context in which it occurs. Methylation at CG and CHG sites (where H represents A, C, or T) is maintained by methyltransferase 1 (MET1) and chromomethylase 3 (CMT3), respectively, whereas CHH methylation is maintained through either the RdDM or the CMT2 pathway [29, 30]. Removal of DNA methylation occurs through two distinct mechanisms: passive and active. Where DNA methylation cannot be adequately maintained, it may diminish passively over multiple rounds of cell division. Conversely, enzymes can actively erase DNA methylation, ensuring the dynamic nature of this epigenetic modification [31, 32]. Gene expression is driven by complex mechanisms, including the binding of transcription factors to DNA and coordinated alterations in chromatin structure. Chromatin consists primarily of DNA wrapped around histone proteins in repeating units known as nucleosomes. Each nucleosome comprises two copies of each core histone (H2A, H2B, H3, and H4), each possessing an accessible amino-terminal tail rich in lysine and arginine residues. A pivotal avenue for gene regulation involves modifications to these histone proteins. These modifications serve as early indicators of epigenetic regulation, and one method for investigating them is chromatin immunoprecipitation (ChIP).
ChIP allows the chromatin structure surrounding a specific DNA sequence to be examined while monitoring DNA–protein interactions, shedding light on the regulation of gene expression. Using antibodies specific to a histone modification, this approach identifies and quantifies the genomic regions that carry that modification [33]. Histone modifications are important in chromatin remodeling [34]. Chromatin remodelers use the energy from ATP hydrolysis to reposition nucleosomes, evict them, or exchange canonical histones for histone variants [34]. Histone modifications have two major modes of action [35]. First, they can directly affect the overall structure and stability of chromatin. Acetylation, for example, neutralizes the positive charge of histone lysine residues and weakens their interaction with negatively charged DNA, resulting in a more open chromatin structure that is accessible to transcription factors and regulatory proteins [36]. Modifications such as methylation or phosphorylation, on the other hand, can result in a more compact chromatin structure that is less available for gene expression [36]. Second, histone modifications can serve as signals for the many proteins involved in chromatin remodeling. Specific modifications act as docking sites for proteins with specialized domains that recognize and bind modified histones [35]. These interactions can then promote the recruitment of chromatin remodelers and other regulatory factors, resulting in changes in nucleosome positioning and chromatin structure [35].

Evidence gathered over the last decade shows that long non-coding RNAs (lncRNAs) are widely expressed and have important functions in gene regulation. Recent research has begun to uncover how lncRNA biogenesis differs from that of mRNAs and how it is connected to their distinct subcellular localizations and activities. lncRNAs exhibit a diverse range of functions within the cell, affecting chromatin dynamics, nuclear body formation, cytoplasmic mRNA stability and translation, and signaling pathways. Their influence depends on their localization and on their specific interactions with DNA, RNA, and proteins. These multifaceted roles shape gene expression across various biological and pathophysiological scenarios, including neurological diseases, immunological responses, and cancer [37].

lncRNAs control gene expression through a multitude of mechanisms. They can regulate chromatin structure and function and modulate the transcriptional activity of both nearby and distant genes through interactions with DNA, RNA, and proteins. Furthermore, lncRNAs can affect RNA processing, stability, and translation. In terms of cellular organization, they contribute to the formation and regulation of organelles and nuclear condensates [37]. One intriguing aspect of lncRNA function is the ability to swiftly modulate gene expression through RNA-mediated chromatin alterations. Negatively charged RNA molecules can neutralize positively charged histone tails, resulting in chromatin de-compaction, a mechanism akin to a rapid gene expression switch. Both cis-acting and trans-acting nuclear lncRNAs interact with DNA, thereby reshaping the chromatin landscape. This influence can be indirect, mediated by their affinity for proteins capable of binding both RNA and DNA, or direct, as they bind DNA in a sequence-specific manner [38].

Cancer cells adapt readily to various survival strategies, most likely because they perceive signals differently than normal cells. They appear to continually sample, choose, and adapt signaling pathways to promote their growth. Based on the wealth of present data, it is now clear that numerous signaling pathways eventually converge, possibly both temporally and spatially, onto dynamic processes that are DNA template-dependent [39]. Given the intricate nature of the eukaryotic genome and its tightly packaged state, genome regulation involves a series of energy-dependent steps orchestrated by chromatin remodeler proteins. These remodelers serve as gatekeepers that determine the accessibility of nucleosomal DNA to regulatory factors, thereby enabling a vast range of essential biological functions. Consequently, cancer cells can manipulate their genome to sustain oncogenic phenotypes, often through the aberrant expression or epigenetic alteration of remodeler proteins. Oncogenic cells can also selectively harness the multi-subunit remodeler proteome to their advantage, facilitating the maintenance of their oncogenic traits [39].

The epigenetic mechanisms that are seen in different kinds of cancers are discussed below:

Breast cancer

Epigenetic mechanisms, such as DNA methylation and histone modifications, play a pivotal role in breast cancer. Some studies [40] highlight how DNA methylation regulates microtubule-associated tumor suppressor 1 and affects gene expression in breast cancer. Additionally, it is emphasized that miR-193a targets MLL1 mRNA, reducing H3K4me3 content of chromatin and hampering cell proliferation and viability [40, 41]. In breast cancer, the overexpression of Oct4, regulated by histone modifications, has been associated with disease progression [40]. These studies collectively underscore the significance of epigenetic alterations in the initiation and progression of breast cancer. One study [42] demonstrates that the collective action of ROS and the MAPK signaling pathway, particularly the ERK/Snail axis, leads to enhanced epigenetic silencing of specific genes, such as CDH1. Notably, the application of hydrogen peroxide exacerbates this effect by increasing the activity of DNA methyltransferases. As a result, the CDH1 gene is repressed, which is associated with cancer progression. This research underscores the intricate mechanisms through which oxidative stress and signaling pathways contribute to epigenetic modifications in breast cancer, potentially offering insights into the development of therapeutic strategies for this disease. Another study [43] comprehensively dissects how miRNAs play a crucial role in regulating the physiology and functions associated with breast cancer. It sheds light on the therapeutic potential of miRNAs in breast cancer treatment. The research offers a deeper understanding of the intricate regulatory networks involving miRNAs and their implications for therapeutic interventions in breast cancer, emphasizing the importance of miRNAs as potential targets for novel therapeutic strategies.

Prostate cancer

In prostate cancer, SOX2, a key oncogenic factor, is overexpressed due to histone modifications, preventing apoptosis, and promoting cell proliferation [44]. Epigenetic regulation of pluripotency inducer genes NANOG and SOX2 is another essential aspect of prostate cancer development [45]. The Hedgehog signaling pathway and its role in prostate cancer androgen independence have been explored, emphasizing the epigenetic components of this pathway [46]. Furthermore, Paederia foetida treatment has been shown to induce anticancer effects by modulating DNA methylation and pro-inflammatory cytokine gene expression in human prostate cancer [47].

Lung cancer

In addition to the role of DNA methylation in lung cancer, published work [46, 48] suggests that reversible methylation of arginine and lysine residues in nuclear histones plays a crucial role in human colon cancer. This implies that histone modifications such as methylation may have significant epigenetic roles in lung cancer as well, potentially affecting gene expression and contributing to cancer progression.

Colon cancer

Some studies [49] discuss PAX9 reactivation through inhibiting DNA methyltransferase in oral squamous cell carcinoma. Although this reference primarily focuses on oral cancer, it suggests the importance of DNA methylation as an epigenetic regulatory mechanism in cancer. This finding could potentially be relevant to colon cancer as well, where DNA methylation often plays a significant role in regulating gene expression.

A study [50] focuses on the role of histone modifications in controlling the expression of the clusterin gene. The authors also examined the effects of ectopically expressing a nuclear isoform of clusterin, which induced cell death. This suggests that clusterin, a gene associated with various cellular processes, is tightly regulated by epigenetic modifications in colon cancer, and that manipulating its expression may have implications for cancer therapy. Specific arginine and lysine methylation of histones also plays a major role in the progression and regulation of colon cancer [51].

Oral squamous cell carcinoma

A study [52], although primarily related to colon cancer, provides insights into epigenetic drift and histone modifications in cancer. Epigenetic drift is a phenomenon in which gradual changes in epigenetic marks, such as histone modifications, contribute to cancer progression. While this reference discusses colon cancer, the concept may have implications for other cancers, including oral squamous cell carcinoma. Epigenetic modifications, such as histone changes, are critical in the regulation of gene expression in cancer, including oral cancer.

Collectively, these studies highlight the significance of various epigenetic mechanisms in different types of cancer. These mechanisms include DNA methylation, histone modifications, and miRNA regulation, and natural compounds such as thymoquinone [53, 54] and Paederia foetida [47] can act on them to regulate gene expression and affect disease progression. Understanding and targeting these epigenetic modifications is essential for improving cancer diagnostics and developing novel therapeutic strategies. One study [55] explores the interplay between the microRNA miR-148a and the DNA methyltransferase DNMT1. It demonstrates that miR-148a antagonizes DNMT1 by downregulating DNMT1 mRNA levels, which in turn reduces cell proliferation and survival. The research suggests that miR-148a may act as a key regulator of DNMT1, offering insights into the epigenetic control of cell growth and survival that may have implications for cancer and other diseases. Another study [56] reveals that miR-193a targets MLL1 mRNA and significantly reduces its expression, leading to a substantial decrease in MLL1 protein levels. This downregulation of MLL1 markedly reduces chromatin levels of H3K4me3, an epigenetic mark associated with active gene transcription. Consequently, this disruption in epigenetic regulation inhibits cell proliferation and compromises cell viability. The research highlights the critical role of miR-193a in modulating MLL1 and, in turn, the epigenetic landscape of chromatin, shedding light on its potential implications for controlling cellular growth and survival.

The role of CTCF in epigenetic regulation

CCCTC-binding factor (CTCF) is an 82-kDa protein with 11 zinc fingers. It was initially discovered as a transcriptional repressor of the chicken c-myc gene, which encodes the c-Myc transcription factor. CTCF is highly conserved and widely expressed in eukaryotes. It comprises three distinct domains: an N-terminal domain, a C-terminal domain, and a central domain containing the 11 zinc fingers. The zinc finger domain is very well conserved, emphasizing its significance for CTCF activity, and CTCF uses these zinc fingers in combination to bind the genome. All three domains are subject to different post-translational modifications. Mammalian genomes contain between 55,000 and 65,000 CTCF binding sites, of which about 50% are intergenic, 35% are intragenic, and the remainder are promoter-proximal, indicating that CTCF may organize chromosomal architecture by binding at diverse locations. Additionally, CTCF stabilizes nuclear architecture through its association with the nuclear matrix [57].
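
The proportions quoted above translate into approximate site counts as follows. This is simple arithmetic on the figures given in the text (55,000 to 65,000 sites; about 50% intergenic and 35% intragenic), with the promoter-proximal share taken as the remaining 15%; the variable and function names are illustrative only.

```python
# Figures quoted in the text; the promoter-proximal share is the remainder.
TOTAL_SITES_RANGE = (55_000, 65_000)
PERCENT_BY_CLASS = {"intergenic": 50, "intragenic": 35}
PERCENT_BY_CLASS["promoter_proximal"] = 100 - sum(PERCENT_BY_CLASS.values())


def site_count_ranges() -> dict:
    """Approximate (low, high) number of CTCF sites in each genomic class."""
    low, high = TOTAL_SITES_RANGE
    return {
        name: (low * pct // 100, high * pct // 100)
        for name, pct in PERCENT_BY_CLASS.items()
    }


for name, (lo_count, hi_count) in site_count_ranges().items():
    print(f"{name}: roughly {lo_count:,} to {hi_count:,} sites")
```

Under these figures, promoter-proximal sites would number roughly 8,250 to 9,750, compared with about 27,500 to 32,500 intergenic sites.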

The central role of CTCF in mediating the interplay between nuclear organization and gene expression was discovered through a combination of advanced microscopy techniques and 3C-related (chromosome conformation capture) methodologies. Among vertebrates, CTCF stands out as the primary insulator protein. Initially identified as a transcription factor able to either activate or repress gene expression in heterologous reporter assays, CTCF was later found to have insulator-like properties. These include the ability to obstruct enhancer-promoter communication and to shield transgenes from chromosomal position effects caused by heterochromatin spreading, shedding light on its multifaceted role in genome regulation [58].

In many respects, CTCF is a remarkable transcription factor: it is widely expressed and essential. It was once thought to be only a transcriptional repressor but was later shown to also function as an activator. CTCF binds to tens of thousands of genomic locations, some of which are highly conserved and others tissue-specific. It can act as an insulator, repressor, or activator of transcription. CTCF binds to enhancers, gene promoters, and sites within gene bodies. It may recruit several other factors to chromatin, such as tissue-specific transcriptional activators, repressors, cohesin, and RNA polymerase II, and it creates chromatin loops. Most significantly, it possesses an insulator function: when placed between an enhancer and a gene promoter, it can disrupt their communication and hinder transcriptional activation [59].
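
The positional logic of enhancer blocking can be sketched in a few lines. The model is intentionally crude and is an assumption made for illustration: elements are reduced to single coordinates on one chromosome, and an enhancer is considered blocked whenever a bound insulator lies strictly between it and the promoter. The function name and coordinates are hypothetical.

```python
def is_enhancer_blocked(enhancer: int, promoter: int, insulators: list) -> bool:
    """True if any insulator position lies strictly between enhancer and promoter.

    Coordinates are positions on the same chromosome. Real insulation also
    depends on CTCF occupancy, orientation, and cohesin, which are ignored here.
    """
    start, end = sorted((enhancer, promoter))
    return any(start < position < end for position in insulators)
```

For example, an insulator at position 3,000 blocks an enhancer at 1,000 from a promoter at 5,000, while an insulator at 6,000 lies outside the interval and has no effect in this model.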

CTCF performs a substantial number of tasks. It acts as a barrier or boundary element between different chromatin domains, preventing the spread of chromatin modifications or regulatory signals from one domain to another, which helps maintain the integrity and independence of distinct regulatory regions [60]. CTCF can block the interaction between enhancers and promoters by forming a physical barrier. This prevents enhancer-promoter communication, influencing gene expression patterns and preventing inappropriate activation or repression of target genes [58]. CTCF plays a critical role in forming chromatin loops, which bring distant regulatory sequences, such as enhancers and promoters, closer together. These loops facilitate proper gene regulation by enabling enhancer-promoter interactions and shaping the 3D genome architecture [61] (Fig. 3). CTCF participates in genomic imprinting, the differential expression of specific genes according to their parental origin. CTCF binding at imprinted loci helps to establish and maintain allele-specific chromatin modifications and differential gene expression [14]. CTCF also organizes chromatin into distinct topological domains and prevents the spread of silencing signals [62]. Finally, CTCF plays a pivotal role in regulating V(D)J recombination, a crucial process at antigen receptor loci. This regulation hinges on the modulation of chromatin accessibility, which is closely associated with active histone modifications and transcription levels. CTCF governs not only the interactions between enhancers and promoters but also the compaction of the locus itself. Through these mechanisms, CTCF controls V(D)J recombination, helping to ensure the precision and effectiveness of the immune response [58].
The complex function of CTCF in chromatin organization is essential for controlling the genome’s three-dimensional structure. One of CTCF’s primary roles is establishing the boundaries between topologically associating domains (TADs) on chromosomes [58]. TADs are genomic regions whose sequences are more likely to interact with one another than with sequences outside the domain. CTCF promotes these interactions by binding to DNA regions referred to as CTCF binding sites. The DNA loops formed by CTCF help divide the genome into distinct TADs, which have been shown to be important for the functional regulation of gene expression. CTCF also mediates interactions between regulatory elements such as enhancers and promoters, bringing them into spatial proximity to facilitate regulatory interactions and precise gene expression [58]. CTCF has been linked to long-range chromatin interactions, X chromosome inactivation, and genomic imprinting [58]. It serves as a molecular scaffold, providing a framework for numerous genomic processes and ensuring that regulatory elements are arranged correctly in space.
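
The defining property of a TAD, that contacts are enriched within a domain relative to contacts across its boundary, can be made concrete with a toy calculation. The sketch below assumes a small symmetric contact matrix over genomic bins and a list of bin indices at which a new domain starts; both are hypothetical values chosen for illustration, not data from any study, and the function names are likewise assumptions.

```python
def domain_of(bin_index: int, boundaries: list) -> int:
    """Domain number of a bin, where `boundaries` lists the bins that start a new domain."""
    return sum(1 for b in boundaries if b <= bin_index)


def intra_domain_fraction(contacts: list, boundaries: list) -> float:
    """Share of all pairwise contacts that fall within a single domain.

    `contacts` is a symmetric matrix of contact counts between bins; only
    the upper triangle (i < j) is counted so each pair is used once.
    """
    intra = total = 0
    n = len(contacts)
    for i in range(n):
        for j in range(i + 1, n):
            total += contacts[i][j]
            if domain_of(i, boundaries) == domain_of(j, boundaries):
                intra += contacts[i][j]
    return intra / total if total else 0.0


# Hypothetical 4-bin contact map with a boundary before bin 2,
# giving two domains: bins {0, 1} and bins {2, 3}.
toy_contacts = [
    [0, 9, 1, 1],
    [9, 0, 1, 1],
    [1, 1, 0, 8],
    [1, 1, 8, 0],
]
toy_boundaries = [2]
```

In this toy map, 17 of the 21 counted contacts fall within a domain, which is the kind of enrichment that domain-calling methods look for. Removing the boundary from the list makes every contact intra-domain by definition, so the measure is only meaningful when compared across candidate boundary positions.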

Fig. 3. The CTCF complex plays an important role in genomic organization

Dysregulation of CTCF in cancer

Numerous studies have examined the deregulation of CTCF in cancer, which has been linked to oncogenic processes and abnormalities in chromatin structure [62]. Altered DNA binding is a key component of CTCF dysregulation in cancer. Cancers frequently display an oncogenic CTCF binding signature, which is associated with altered gene expression and chromatin architecture [62]. Dysregulated CTCF binding can impair chromosomal function and result in abnormal gene expression patterns. Both CTCF binding and chromosome conformation can be affected by DNA methylation, which is frequently altered in cancer [63]. DNA hypermethylation, a characteristic of many cancer types, decreases CTCF binding and compromises the insulation of nearby genomic regions.

In molecular genetics, genetic mutations and CTCF binding site aberrations are two separate but related issues. CTCF controls gene expression by binding to DNA regions known as CTCF binding sites. Aberrations in these sites, which can arise from a variety of genetic abnormalities, have been shown to affect disease development, specifically chromosomal instability and oncogenesis. One study found an enrichment of CTCF binding site (CBS) hotspot mutations in tumors exhibiting chromosomal instability [19]. These mutations are mostly found in cancer cells and frequently co-occur with nearby chromosomal abnormalities. This raises the possibility of a connection between CTCF binding site abnormalities and the onset or development of malignancies. Another study, in T-cell acute lymphoblastic leukemia (T-ALL), found that global DNA methylation, gene expression, CTCF chromatin binding, and topologically associating domain (TAD) formation patterns are not significantly affected by CTCF abnormalities [64]. However, T-ALL samples with CTCF abnormalities showed more enhancer-oncogene interactions. This implies that CTCF abnormalities in T-ALL may affect particular enhancer-gene interactions, thereby influencing the carcinogenic process.

Epigenetic modification of CTCF function refers to changes in DNA and histone proteins that affect how CTCF binds to DNA and performs its function. CTCF tightly regulates the spatial organization and transcription of genes [65]. By binding to particular DNA sequences known as insulator motifs, it serves as a barrier that prevents interactions between different genomic regions. Epigenetic changes can affect CTCF’s ability to bind its target sites and alter how it functions as a regulator. One such change is DNA methylation, the addition of a methyl group to DNA. CTCF binding sites exhibit tissue-specific DNA methylation patterns, and changes in DNA methylation can affect CTCF binding and gene expression [66]. Histone modifications, like DNA methylation, contribute to the modulation of CTCF function. DNA is wound around histone proteins to form the complex structure known as chromatin. Histones can undergo several chemical modifications, including acetylation, methylation, and phosphorylation, which alter how accessible DNA is to the transcriptional machinery [66]. Studies have sought to describe the histone modifications associated with CTCF enrichment. For instance, one study in sheep macrophages identified several core histone modifications linked to CTCF binding sites with the aim of building a database of histone modifications and CTCF-enriched boundaries [23].

CTCF-mediated gene expression in cancer

CTCF (CCCTC-binding factor) is an important regulatory protein that orchestrates gene expression by modulating the three-dimensional structure of chromatin. Chromatin is the complex of DNA and histone proteins that makes up chromosomes [60]. The spatial organization of chromatin within the nucleus is necessary for the proper regulation of gene expression. CTCF acts as a chromatin insulator, boundary element, and enhancer blocker, contributing to the establishment of specific chromatin interactions and controlling the accessibility of genes to the transcriptional machinery [60]. One of the main roles of CTCF is to act as a chromatin insulator. It can establish boundaries between distinct chromatin domains, preventing the spread of heterochromatin into active regions and vice versa. CTCF binds to specific DNA sequences called CTCF-binding sites (CBS) or insulator sequences and forms large loop structures that separate functionally distinct genomic regions. These loops arise from CTCF-mediated chromatin interactions (3D interactions) that restrict the communication between enhancers and promoters. The insulator function of CTCF is essential for maintaining cell-type-specific gene expression patterns [60]. CTCF also functions as an enhancer blocker. Enhancers are DNA sequences that can activate gene transcription when bound by transcription factors. They are frequently situated at a considerable distance from the genes they regulate, and CTCF helps prevent improper enhancer-promoter interactions. By acting as an enhancer blocker, CTCF ensures that enhancers interact only with the appropriate target gene promoters, thereby fine-tuning gene expression [67]. CTCF has also recently received attention in the context of disease. Dysregulation of CTCF has been implicated in various types of cancer.
Alterations in CTCF binding can lead to aberrant chromatin interactions, resulting in misexpression of oncogenes or tumor suppressor genes, contributing to tumorigenesis. CTCF has also been associated with several neurological disorders. For example, CTCF binding site mutations have been linked to the neurodevelopmental disorder Pitt-Hopkins syndrome. CTCF is involved in cardiovascular development and disease too. Altered CTCF-mediated chromatin interactions have been linked to cardiac dysfunction and other cardiovascular conditions. It has also been found that mutations in CTCF or disruption of CTCF-mediated chromatin interactions can lead to a range of genetic disorders and developmental abnormalities [68].
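The enhancer-blocking behavior described above can be illustrated with a deliberately simplified toy model. The Python sketch below assumes a one-dimensional genome in which an enhancer may contact a promoter only if no CTCF-binding site lies between them. The names `Element`, `is_blocked`, and `allowed_contacts`, and all coordinates, are hypothetical and chosen purely for illustration. The sketch ignores motif orientation, cohesin, and binding strength, so it is a conceptual aid rather than a model of real chromatin.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Element:
    """A regulatory element at a hypothetical genomic coordinate (bp)."""
    name: str
    position: int


def is_blocked(enhancer: Element, promoter: Element,
               ctcf_sites: Iterable[int]) -> bool:
    """Return True if any CTCF site lies strictly between the two elements.

    Toy assumption: a single bound CTCF site between an enhancer and a
    promoter is enough to insulate them from each other.
    """
    lo, hi = sorted((enhancer.position, promoter.position))
    return any(lo < site < hi for site in ctcf_sites)


def allowed_contacts(enhancers: List[Element], promoters: List[Element],
                     ctcf_sites: List[int]) -> List[Tuple[str, str]]:
    """List the enhancer-promoter pairs that are not separated by CTCF."""
    return [
        (e.name, p.name)
        for e in enhancers
        for p in promoters
        if not is_blocked(e, p, ctcf_sites)
    ]


if __name__ == "__main__":
    enhancers = [Element("E1", 1_000)]
    promoters = [Element("P1", 5_000), Element("P2", 20_000)]

    # With a CTCF site at 10 kb, E1 can reach P1 but not P2.
    print(allowed_contacts(enhancers, promoters, ctcf_sites=[10_000]))

    # Losing that site (for example through methylation or mutation)
    # lets E1 reach P2 as well in this toy model.
    print(allowed_contacts(enhancers, promoters, ctcf_sites=[]))
```

In this toy setting, removing the site at 10 kb is what allows the enhancer to reach the second promoter, mirroring the idea that loss of CTCF binding can expose a gene to an inappropriate enhancer.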

Aberrant CTCF binding is increasingly recognized as a contributing factor to the development and progression of cancer. Breast cancer is one of the most commonly occurring malignancies affecting women worldwide, and studies have shown that aberrant CTCF binding at specific genomic loci can disrupt the regulation of key oncogenes and tumor suppressor genes. For instance, one study revealed that altered CTCF binding at the 8q24 locus can enhance expression of the MYC oncogene, promoting breast cancer growth and metastasis [17]. In prostate cancer, differential CTCF binding patterns were identified in patients compared with healthy controls, indicating a potential involvement in disease progression [69]. In colorectal cancer cells, altered CTCF binding was shown to disrupt long-range chromatin interactions between enhancers and promoters, leading to dysregulation of critical cancer-related genes [70]. Aberrant CTCF binding is also associated with altered chromatin accessibility at pancreatic cancer risk loci, suggesting a potential role in disease susceptibility and progression [71]. In glioblastoma, altered CTCF binding patterns contribute to the dysregulation of oncogenes and tumor suppressors, highlighting the significance of CTCF in disease development [72]. Lung cancer is the leading cause of cancer-related death worldwide, and CTCF has emerged as a potential contributor to its pathogenesis: irregular CTCF binding can disrupt chromatin interactions, ultimately altering gene expression patterns in lung cancer cells [73].

CTCF as a diagnostic and prognostic marker

CTCF has emerged as a promising diagnostic and prognostic marker in cancer research. Its role in epigenetic regulation is increasingly recognized, and its potential as a biomarker for diagnosing cancer and predicting disease progression is being extensively investigated.

BORIS, a newly discovered member of the cancer-testis antigen family, is a paralogue of the transcription factor CTCF [74, 75] (Fig. 4). The detection of BORIS in a significant number of individuals diagnosed with breast cancer suggests that it could serve as a molecular biomarker for breast cancer. BORIS also controls cancer-testis genes: when expressed in healthy cells, it activates cancer-testis genes such as MAGE-A1, NY-ESO-1, and other related genes [76, 77]. BORIS contributes to the arrangement of chromatin structure by recruiting H3K4 methyltransferase to enhance the expression of the MYC and BRCA1 genes. It has also been found to hinder the expression of EIF3E, RSPO2, PTPRK, RSPO3, TADA2A, and CD4 [78, 79].

Fig. 4. A model for CTCF-BORIS functions in normal cells and cancer cells

There is currently a heightened focus on the discovery of biomarkers and their clinical applications, spurred by the completion of the Human Genome Project and advances in proteomics. A biomarker is a measurable characteristic of a biological system that can be used to assess the presence or progression of a disease or to predict the risk of developing a disease. Biomarkers can be found in tissues and in blood, urine, saliva, or other bodily fluids (Biomarkers in Risk Assessment: Validity and Validation, Environmental Health Criteria Series, No. 222, WHO). In cancer, and specifically in lung cancer, biomarker discovery is of paramount importance for early detection, personalized therapy recommendations, and continuous prognosis monitoring, primarily because of the high mortality of the disease [80]. One development in this pursuit is the identification of the kallikrein B1 (KLKB1) fragment, which supports the recent hypothesis that serum peptidomes can serve as valuable biomarkers for cancer diagnosis [81]. The discovery of this low molecular weight protein fragment hints at the existence of potentially more potent lung cancer biomarkers derived from cancer-specific enzymatic breakdown, opening promising avenues for early cancer detection [82]. DNA-based biomarkers have potential for lung cancer but lack adequate sensitivity, specificity, and reproducibility. Neuron-specific enolase (NSE) is promising as a post-therapy monitoring tool for small cell lung cancer (SCLC). EGFR mutation serves as a novel biomarker to diagnose EGFR mutation-induced non-small cell lung cancer (NSCLC) and to predict response to EGFR-targeted protein tyrosine kinase inhibitors [83].

CTCF plays a crucial role in controlling gene expression in vertebrates. This transcription regulator is capable of regulating a variety of genes, and it is involved in both epigenetics and a wide range of diseases. Initially discovered through its ability to bind to different regulatory sequences near the promoters of MYC oncogenes in chickens, mice, and humans, CTCF is a widely expressed nuclear protein with a DNA-binding domain consisting of 11 zinc fingers [84]. Studies indicate that CTCF is essential for proper cellular function, and it is remarkably conserved across species, from fruit flies to mice and humans. In vertebrates, CTCF is the primary protein involved in the formation of insulators, which are crucial for gene regulation [85]. These insulators have key roles in processes such as gene imprinting, where certain genes are expressed in a monoallelic manner [86, 87], and in X chromosome inactivation and escape from X-linked inactivation [88, 89]. CTCF remains the prominent protein implicated in these diverse regulatory functions. Super-enhancers have essential functions in regulating cell-type-specific genes and influencing the progression of human diseases. CTCF, a protein that suppresses gene expression and acts as a barrier between genes, is often found in the regions that separate super-enhancers or lie within them, and it is involved in interactions within the chromatin structure [90, 91]. In 2022, Zhang and colleagues introduced CLNN-loop, a deep-learning model designed to predict chromatin loops across various cell lines and pair types of CTCF-binding sites (CBS). The model combines multiple sequence-based features to improve predictive accuracy. In 2023, Xu and colleagues presented Deep Anchor, a deep-learning model designed to precisely delineate binding patterns for diverse CBS types. This model combines sequence features with spatial attributes and was reported to achieve state-of-the-art performance [92].

Poly(ADP-ribosyl)ation plays a pivotal role in DNA repair and apoptosis regulation. CTCF, a chromatin insulator protein, predominantly engages with the maternal allele of the H19 imprinting control region (ICR). Experiments examining the relationship between poly(ADP-ribosyl)ated proteins and the H19 ICR, using CTCF target sites bearing specific point mutations, showed that the H19 ICR carried a discernible poly(ADP-ribosyl)ation mark only when the wild-type allele was inherited from the mother. This indicates that the mark depends on functional CTCF target sites [93]. CTCF also interacts with enhancer RNAs as well as genomic DNA, and its binding to enhancer RNAs increases upon estrogen stimulation in breast cancer cells [94]. Notably, 5-carboxylcytosine (5caC), a TET-catalyzed derivative of 5-methylcytosine, emerges as a potential driver of novel CTCF binding within genomic DNA [95]. Comparisons with chromatin immunoprecipitation-sequencing data further reveal that sites of genomic cohesin and CTCF enrichment can remain unoccupied in individual cells at any given time, suggesting that cohesin and CTCF might not always colocalize at identical genomic sites [96].

Studies combining CRISPR DNA-fragment editing with chromosome conformation capture techniques have shown that CTCF sites can function as enhancer-blocking insulators. These sites do so by forming distinct directional chromatin loops, regardless of whether the enhancers themselves contain CTCF sites, which highlights the multifaceted and dynamic nature of CTCF-mediated genomic regulation [96]. The distinct and collective roles of CTCF and CTCFL in organizing chromosomes and controlling gene activity have significant implications for understanding how their simultaneous expression disrupts gene regulation in cancer. Deficiencies of CTCF and cohesin produce phenotypes characterized mainly by delayed development and intellectual disability, which emphasizes the crucial role of architectural proteins, especially in neurodevelopmental processes.

Therapeutic targeting of CTCF in cancer

Previous work has shown that certain CTCF-bound DNA sequences containing CpG dinucleotides can be regulated through methylation in response to various biological or environmental signals [10]. In cortical neurons, the methylation state of CpG dinucleotides in the CTCF-binding sites of BDNF promoter IV can be influenced by intracellular NAD levels. CTCF binding at the BDNF gene region plays a crucial role in the normal transcription of BDNF, which aligns with other research suggesting that the loss of CTCF leads to the formation of an inactive chromatin structure, disruption of long-range chromatin interactions, and transcriptional suppression [97, 98]. Gastric cancer (GC) ranks third in cancer-related fatalities globally. In GC cells, inhibition of PD-L1 or CTCF effectively hindered the drug resistance induced by gastric cancer mesenchymal stem cells (GCMSC) and reduced cell stemness. The study describes a mechanism in which GCMSC-conditioned medium (GCMSC-CM) fosters chemoresistance in GC by increasing the expression of CTCF-PD-L1, providing evidence for targeting the CTCF-PD-L1 signaling pathway to combat resistance in clinical settings [99]. Nup93 and CTCF have distinct roles in controlling the expression of the HOXA gene locus in different regions. Nup93 acts as a stable support structure that helps anchor the chromatin loop that may be created by CTCF. This interaction plays a crucial role in regulating the temporal organization and function of HOXA during cellular differentiation [100].

Hepatocellular carcinoma (HCC) is a prevalent malignancy and ranks as the third leading cause of cancer-related mortality worldwide [101]. CTCF has been consistently detected in HCC biopsies, HCC cell lines, and a liver adenocarcinoma cell line, while remaining undetectable in normal liver tissues. CTCF plays a significant role in driving HCC cell growth and metastasis by directly regulating the expression of critical genes such as telomerase reverse transcriptase (TERT) and forkhead box protein M1 (FOXM1) [102]. Emerging research indicates that certain CTCF-binding sites near enhancers act as anchors that recruit co-activators to facilitate gene transcription, a potential mechanism underlying CTCF’s function in HCC that is distinct from its classical role in forming insulated neighbourhood loops [58]. Interestingly, while inhibition of FOXM1 represses genes associated with epithelial-mesenchymal transition (EMT), CTCF depletion has no discernible impact on the expression of these genes, and terminal deoxynucleotidyl transferase nick-end labelling (TUNEL) staining likewise revealed no significant differences. Rather than directly influencing the expression of EMT-related genes, CTCF likely regulates the motility and invasiveness of HCC cells by modulating genes involved in the organization of actin stress fibres [103]. CTCF also activates ABCG2, a gene implicated in clinical multidrug resistance (MDR) in colorectal cancer (CRC), which adds another layer to its multifaceted role [103]. Gene set enrichment analysis (GSEA) has uncovered a notable association between CTCF expression and the enrichment of Hedgehog pathway-related gene set signatures. Overexpression of CTCF correlates with elevated levels of GLI1, Shh, PTCH1, and PTCH2, while silencing CTCF leads to decreased expression of these genes. Studies suggest that P53 acts as an inhibitor of the Hedgehog signaling pathway, whereas CTCF suppresses P53 expression. Nuclear extract assays have revealed that repression of P53 enhances the CTCF-induced nuclear accumulation of GLI1 [104]. In a different context, the dCas9-DNMT3A epigenetic modulator system can disrupt the binding of HIF1α or CTCF under hypoxic conditions. This disruption, in turn, interrupts the HIF1α-CTCF-COL5A1exon64A axis, ultimately reducing the potential for EMT in breast cancer cells [105].

CTCF is an important player in the epigenetic regulation of cancer, but there are still many challenges to understanding its role in the disease. One challenge is that CTCF is involved in a wide range of cellular processes, which makes it difficult to determine its specific contributions to cancer. Another challenge is that CTCF binding is often cell-type specific, making it difficult to generalize findings from one cell type to another. Despite these challenges, there has been remarkable progress in understanding the role of CTCF in cancer. For example, it is now known that CTCF mutations can lead to cancer and that CTCF can be targeted by cancer therapies. In addition, CTCF has been demonstrated to be involved in numerous cancer-associated mechanisms, including DNA methylation, histone modification, and transcription. Future research on CTCF in cancer is likely to focus on several key areas. First, researchers will need to further investigate the specific mechanisms by which CTCF regulates cancer. Second, researchers will need to develop more effective ways to target CTCF in cancer therapies. Finally, researchers will need to understand how CTCF interacts with other epigenetic factors in cancer [62, 106, 107].

Conclusion

CTCF is an essential epigenetic regulator in cancer and functions as a critical mediator of three-dimensional chromatin architecture, creating boundaries that prevent the spread of epigenetic marks and the inappropriate interaction of enhancers with promoters. By governing these higher-order chromatin structures, it influences the formation of topologically associated domains (TADs), which are crucial for proper gene regulation and the prevention of oncogenic alterations. Gilbert et al. [24] highlight the impact of CTCF’s involvement in chromatin organization, showing how mutations or epigenetic changes in CTCF binding sites can lead to aberrant gene expression patterns and potentially contribute to cancer development, particularly in gliomas. Their study also examines the various roles of CTCF in genome regulation, emphasizing its importance in maintaining proper gene expression and genome stability. The present review discusses how CTCF’s insulator and enhancer-blocking functions are crucial for preventing the activation of oncogenes or the silencing of tumor suppressor genes, thereby influencing cancer development. By dissecting the roles and mechanisms of CTCF, we gain valuable insights into the epigenetic alterations that drive cancer, potentially opening new avenues for targeted therapies and personalized treatment strategies. BORIS, a gene expressed in multiple cancers, qualifies as a cancer-testis gene, underscoring its involvement in cancer-related processes. Conditional expression of BORIS in normal fibroblasts selectively activates cancer-testis genes [75]. This was substantiated by quantitative reverse transcription-PCR analysis, which showed a robust and synchronized induction of BORIS and NY-ESO-1 expression in lung cancer cells but not in normal human bronchial epithelial cells. These observations were made under clinically relevant conditions involving exposure to agents such as 5-aza-2’-deoxycytidine (5-azadC), Depsipeptide FK228 (DP), or sequential 5-azadC/DP treatment [75]. Furthermore, CTCF expression in breast cancer cells has been correlated with resistance to apoptosis, raising the possibility of therapeutic strategies centered on selectively reducing CTCF levels in these cells [108]. In another context, cohesin has been found to function as a tissue-specific transcriptional regulator independently of CTCF binding, manifesting differently in various cell types. Notably, the compound SFN significantly inhibits the viability and proliferation of breast cancer cells in vitro without notable effects on normal breast cells [109]. The significant role of CTCFL/BORIS as an Epi-driver gene in endometrial cancer suggests potential implications for future vaccine development. Moreover, CTCF hemizygous knockout mice have shown heightened susceptibility to spontaneous and chemically induced cancer across a wide spectrum of tissues. The presence of frequent mutations at CTCF binding sites (CBSs) in cancers characterized by a mutational signature dominated by AT base pair mutations further underscores the intricate role of CTCF in tumorigenesis [110, 111, 112]. CTCF’s involvement in many cellular processes, including genomic imprinting, chromatin architecture, histone modifications, and gene transcription, underscores its significance in cancer development and progression. By organizing the genome into distinct chromatin domains and modulating gene expression, CTCF maintains genomic integrity while influencing cancer-related pathways. Moreover, the dysregulation of CTCF in cancer has implications for clinical practice, as it holds promise as a diagnostic biomarker and therapeutic target in various cancer types. Its tumor-suppressive function, exerted through the regulation of key transcriptional regulators, further emphasizes its potential as a target for personalized, precision-based cancer therapy. A deeper understanding of CTCF’s role in epigenetic control opens up avenues for novel targeted therapeutics and advances in cancer management.