1 Introduction

Many unusual features of transcriptional regulation have been demonstrated in the early-branching protozoan parasite Entamoeba, including the following: (1) an atypical RNA polymerase that is resistant to alpha-amanitin [1]; (2) chromatin that does not condense during mitosis; (3) histones H3 and H4 with variable N-terminal tails, and an unusual TATA box upstream of the initiator (Inr) region [2,3,4,5]; (4) very short untranslated regions (UTRs) [3, 6,7,8]; (5) a GAAC element (AATGAACT) or GAAC-like element found at different locations in the core promoter [6, 8,9,10]; and (6) an Inr element (AAAAATTCA) adjacent to the transcription start site [6, 10]. Furthermore, the putative TATA-binding protein of E. histolytica (EhTBP) shows notable sequence divergence from the TATA-binding proteins of higher eukaryotes [11]. Taken together, these observations suggest that transcriptional regulation in Entamoeba is controlled by unusual mechanisms. The core promoters of Entamoeba consist of three elements: (a) a putative TATA element (GTATTTAAA) approximately 30 nt upstream of the transcription start site, (b) a GAAC element (AATGAACT) at variable locations in the core promoter, and (c) an Inr element (AAAAATTCA) adjacent to the transcription start site [6, 10, 12]. A GAAC-like motif (EiCPM-GL; GAACTACAAA) that shows high similarity to the GAAC element has also been identified in E. invadens, which causes a similar diarrheal disease in reptiles.
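
As a rough illustration of how the three core promoter elements listed above could be located relative to a transcription start site, the following Python sketch scans a sequence for exact matches to the consensus motifs. The function name, the toy sequence, the assumed position of the start site, and the exact-match criterion are all assumptions made for illustration; real promoters frequently carry degenerate variants, and this is not the procedure used in the cited studies.

```python
# Consensus core promoter elements as quoted in the text.
CORE_MOTIFS = {
    "TATA": "GTATTTAAA",
    "GAAC": "AATGAACT",
    "Inr": "AAAAATTCA",
}


def find_core_elements(promoter, tss_index):
    """Return {element: [start positions relative to the TSS]} for exact matches.

    promoter  -- DNA string, 5'->3' (hypothetical input)
    tss_index -- 0-based index of the transcription start site in `promoter`
    Negative positions are upstream of the TSS.
    """
    seq = promoter.upper()
    hits = {}
    for name, motif in CORE_MOTIFS.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start - tss_index)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        hits[name] = positions
    return hits


# Toy promoter assembled for illustration only (not a real Entamoeba sequence).
toy = "CCGG" + "GTATTTAAA" + "C" * 8 + "AATGAACT" + "C" * 5 + "AAAAATTCA" + "GGCC"
tss = toy.index("AAAAATTCA") + 4  # assumption: the TSS lies inside the Inr
print(find_core_elements(toy, tss))
```

An exact-match scan keeps the example short; a position weight matrix would be the more usual choice when motif variants need to be tolerated.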

For a long time, the lack of genetic manipulation tools for Entamoeba was the main hurdle to identifying and characterizing transcription factors in this parasite. The transfection protocols developed by Nickel and Tannich to introduce plasmid DNA into Entamoeba gave a new dimension to the study and characterization of transcription factors [13,14,15]. The development of several Entamoeba vectors with reporter genes helped in the characterization of several Entamoeba cis-regulatory elements and of the core promoter. Additionally, a putative TATA box was identified in this parasite 30 nt upstream of the transcription start site, along with an Inr element (adjacent to the transcription initiation site) and a GAAC element [6]. Among protozoan systems, this GAAC element is unique and is capable of controlling transcription initiation independently of either the TATA box or the Inr element [16]. Further, in silico analysis of gene promoters, combined with biochemical approaches, identified a few more TFs in this parasite (e.g., EhCudA, HRM-BP, ERM-BP), which were characterized through deletion and replacement analysis.

1.1 Entamoeba Genome Sequencing and Transcriptomic Data Improve the Field

Sequencing of the E. histolytica genome was an important advance in understanding Entamoeba biology, and Entamoeba genomic features were further refined by reassembly of the genome [17, 18]. The genome of Entamoeba is predicted to be ~20 Mbp in size, with the following characteristics relevant to gene structure and transcription: (1) ~8200 protein-coding genes with a median gene length of 1260 bp; (2) a small number of introns (~24% of genes carry introns); (3) a unique RNA polymerase II with several special features, including a highly variant α-amanitin-binding region, which explains why this organism is resistant to this drug; (4) three genes encoding TATA-binding proteins [19, 20]; (5) a greatly expanded set of Myb domain-containing proteins [21]; (6) two classes of histone-modifying proteins, histone acetyltransferases and histone deacetylases [22]; (7) no identified proteins containing demethylase domains; and (8) one DNA (cytosine-5) methyltransferase [23].

Entamoeba genome sequencing and annotation provided a starting platform for many studies in this parasite. For example, 32 Myb domain-containing proteins were identified by comparative in silico analysis and further classified into three families [21]. Family I proteins contain two Myb domains and structurally resemble the plant Myb domain proteins. Moreover, the individual domains of Entamoeba Myb proteins share closest homology with human c-Myb. Families II and III, on the other hand, both comprise proteins with a single Myb domain.

Despite the efforts of many groups, no method has yet been developed to study encystation in E. histolytica. E. invadens, a closely related parasite of reptiles, has therefore been developed as a model for the study of stage conversion.

The genome sequences of both E. histolytica and E. invadens are extremely repetitive, and it appears that only 50% of the genome is accounted for by genic and intergenic sequence. The E. invadens genome contains 11,549 predicted genes compared with 8306 in E. histolytica. Genome analysis showed that gene lengths in E. histolytica and E. invadens are very similar; however, intergenic regions in E. histolytica tend to be shorter than those in E. invadens.

In E. invadens, 9865 of the 11,549 predicted genes showed a BLASTP hit (E-value < 10^-5) to 7216 of the 8306 predicted E. histolytica genes, and among those 5227 are putative orthologs. Alignment of orthologs showed an average amino acid identity of around 69%, indicating that these species are distantly related. Of the E. invadens genes that do not have orthologs in E. histolytica, 77% (4815/6218) have appreciable RNA-seq support, compared with 98% (5206/5331) of the genes shared with E. histolytica [24]. This result indicates that a fraction of these genes may be false-positive predictions; it is also consistent with many of these genes not being constitutively expressed.
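
The two RNA-seq support percentages quoted above can be checked directly from the counts given in the text. The short Python sketch below does only that arithmetic; the helper name `percent` is introduced here for illustration.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts exactly as quoted in the text.
no_ortholog_support = percent(4815, 6218)  # E. invadens genes without orthologs
shared_support = percent(5206, 5331)       # genes shared with E. histolytica

print(f"{no_ortholog_support:.1f}% vs {shared_support:.1f}%")  # 77.4% vs 97.7%
```

Rounded to whole numbers, these values give the 77% and 98% reported in the text.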

To assess the conservation of gene order between these two Entamoeba species, all collinear gene pairs that were adjacent in both E. histolytica and E. invadens were analyzed. This analysis showed that only 561 genes (out of 5227 in total) preserved their neighboring gene in both species. Hence, there has clearly been extensive genome rearrangement between these two species, even though most biological processes are conserved.
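
One simple way to count genes that keep the same neighbor in two genomes is sketched below in Python. This is a minimal sketch under stated assumptions, not the method of the cited analysis: the function name, the input format (ordered gene lists plus a one-to-one ortholog map), and the toy data are all hypothetical, and scaffold boundaries and strand are ignored.

```python
def count_conserved_neighbors(order_a, order_b, orthologs):
    """Count species-A genes with an adjacent gene whose ortholog is also
    adjacent to their own ortholog in species B.

    order_a, order_b -- lists of gene IDs in genomic order (hypothetical input)
    orthologs        -- dict mapping species-A gene IDs to species-B gene IDs
    """
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    conserved = set()
    # Walk over every pair of adjacent genes in species A.
    for left, right in zip(order_a, order_a[1:]):
        if left not in orthologs or right not in orthologs:
            continue
        b_left, b_right = orthologs[left], orthologs[right]
        if b_left in pos_b and b_right in pos_b:
            # Adjacent in species B as well, in either orientation.
            if abs(pos_b[b_left] - pos_b[b_right]) == 1:
                conserved.update((left, right))
    return len(conserved)


# Toy data for illustration only.
order_a = ["a1", "a2", "a3", "a4", "a5"]
order_b = ["b2", "b1", "b5", "b3", "b4"]
orthologs = {f"a{i}": f"b{i}" for i in range(1, 6)}
print(count_conserved_neighbors(order_a, order_b, orthologs))
```

In the toy data, the pairs a1/a2 and a3/a4 remain adjacent in the second genome while a5 has been moved away from a4, so four of the five genes retain a neighbor.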

1.2 Gene Expression Profiling and Transcriptional Regulation

Expression profiling with advanced technologies such as microarrays and RNA-seq has revolutionized the understanding of transcriptional regulation on a genome-wide scale [24]. These approaches have been used extensively to study Entamoeba gene expression under different conditions as well as throughout the different stages of development [24]. Several types of microarray platforms have been developed for Entamoeba, including arrays generated from genomic DNA, short oligonucleotides, and long oligonucleotides. These tools help in measuring transcript abundance at different developmental time points as well as during stress and host invasion. Moreover, many factors responsible for virulence and pathogenicity have been identified by comparative gene expression studies, that is, genes that are upregulated in virulent strains but downregulated in non-virulent Entamoeba. Altogether, these findings have provided insights into the molecular aspects of important features of amebic biology, e.g., stage conversion and pathogenic potential, and give researchers a first indication of prospective novel drug targets against amebic disease.

2 Transcription Factors in Cellular Function

The fundamental step in gene expression is transcription, in which an mRNA is synthesized from a DNA template; it is followed by translation, which strings amino acids together to make a protein. Developmental studies have shown upregulation and downregulation of sets of transcripts during the different stages of development as well as under different growth conditions. Transcription is controlled by the orchestrated function of several proteins; for example, proteins that bind DNA may be involved in the regulation of gene expression. Most of our knowledge of the basic elements of transcriptional regulation comes from early work on prokaryotic systems, where genes are arranged in sets of contiguous genes that comprise regulatory sequences and structural genes. A classic example is the lactose (lac) operon of E. coli. Transcription in eukaryotes is much more complex than in prokaryotes. First, prokaryotes utilize only one RNA polymerase, whereas eukaryotes have three different RNA polymerases: I, II, and III. Second, eukaryotic RNA polymerases require additional proteins called general transcription factors (TFs) to position them at the correct start site. Prokaryotic RNA polymerases also require accessory polypeptides, called sigma factors (σ), but these are considered subunits of the RNA polymerase. In eukaryotes, by contrast, a large, multi-subunit transcription initiation complex is formed. For example, RNA polymerase II requires a multi-subunit complex of seven general transcription factors to constitute the initiation complex, and each of the subunits must be added in an orchestrated way.

Transcription factors normally have three structural features: a domain that binds to DNA, a transcription-activating domain, and a domain that binds to a ligand. The DNA-binding domain binds to a specific DNA sequence through the formation of hydrogen, ionic, and hydrophobic bonds, although the particular combination and spatial distribution of such interactions are distinctive for each sequence. In silico analysis of many DNA-binding proteins guided the identification of a number of highly conserved DNA-binding structural motifs; these are (1) HTH (helix-turn-helix) motif, (2) ZnF (zinc finger) motif, (3) HLH (helix-loop-helix) motif, (4) leucine zipper motif, and (5) basic zipper motif.

Cellular responses in both prokaryotes and eukaryotes consist of a cascade of events involving many intracellular signaling pathways (e.g., PKA, MAPKs, JAKs, PKCs) that control the fine-tuning of gene regulation by many transcription factors. Transcription factors in bacteria are generally classified by comparison of their amino acid sequences with prototypic members of families of DNA-binding proteins, such as the LysR-like and AraC-like protein families. TFs are often classified based on the structural motifs that constitute their binding domains, for example, TBP (TATA-binding protein), TBP-associated factors (TAFs), and the more recently identified p300/CBP coactivator family. Several families of TFs exist, each with characteristic structural and functional features. Examples of such families are helix-turn-helix (e.g., Oct1), helix-loop-helix (e.g., E2A), zinc finger (e.g., GATA proteins, TFIIIA), leucine zipper (e.g., CREB, AP-1), and beta-sheet motif (e.g., nuclear factor-κB) [25].

In eukaryotes, a class of transcription factors called general transcription factors (GTFs), which includes TFIIA, TFIIB, TFIID, TFIIE, and TFIIH, is involved in basal transcription. Activation of other transcription factors, for example p53, NF-κB (nuclear factor-κB), AP-1 (activator protein-1), Nrf2 (nuclear factor erythroid 2-related factor 2), and CREB (cAMP-responsive element-binding protein), is associated with various cellular functions; for instance, p53 and NF-κB are involved in the cellular damage response. The NF-κB family plays critical roles in immunity, inflammation, differentiation, cell proliferation, and survival [26]. AP-2 family transcription factors are evolutionarily conserved, bind to the DNA consensus sequence GCCNNNGGC, and upregulate target gene expression. In mammals, four different isoforms of AP-2 have been identified, termed AP-2 α, β, γ, and δ. In Plasmodium, studies have identified a role for the ApiAP2 transcription factor PfAP2-EXP2 in controlling gene expression during the intraerythrocytic developmental cycle of the parasite. AP-1, on the other hand, participates in the control of proliferation, senescence, differentiation, and apoptosis [27]. Sp1 belongs to a family of transcription factors that also includes Sp2, Sp3, and Sp4 and plays a role in DNA repair. CREB is a phosphorylation-dependent nuclear transcription factor that is involved in important cellular functions including apoptosis and cell proliferation. The cAMP-CRP protein is considered to lie between the conventional transcription regulators and histone-like proteins, and it can bind specifically to a consensus DNA sequence. Another TF is FOXO3a, a forkhead transcription factor that is a member of the FOXO subfamily and mediates a variety of cellular processes including proliferation, cell cycle progression, DNA damage response, and apoptosis [28].

Another important TF is E2F, which is activated by E1A, a viral oncoprotein needed for adenovirus gene expression. E2F transcription factors are recognized as key players in controlling the cell cycle, transformation, and differentiation, and the E2F/pRB pathway has been found to act as a key regulator of the cell cycle and development. Quite a few important TFs and DNA motifs have been characterized in protozoan parasites. For instance, a member of the HMGB family was identified in Entamoeba histolytica, some Myb family members were characterized in Trichomonas vaginalis, and a cell cycle-dependent ApiAP2 transcription factor, TgAP2IX-5, was found in Toxoplasma gondii [29]. The transcription factors identified so far in Entamoeba are listed in Table 1, and their functions are depicted schematically in Fig. 1. However, the biological role of many TFs in this parasite is still poorly understood, and further characterization is needed.

Table 1 Entamoeba transcription factors, their representative DNA-binding motifs, and information gained
Fig. 1 Transcription factors and their roles in Entamoeba (flow diagram: Entamoeba; nucleus; promoter with transcription factors; target gene, transcription; mRNA, translation; protein; pathogenicity; development and stress response; cellular functions)

2.1 TATA-Box-Binding Protein

In the past two decades, important improvements in molecular biology techniques have expanded our understanding of transcriptional regulation in E. histolytica. Several groups have identified a number of TFs and the core promoter region in Entamoeba. However, very little is known about the transcription machinery and especially about transcriptional regulation during the development of this parasite.

In the late 1990s and early 2000s, the approaches used by different groups to identify transcription factors in this parasite were mainly based on comparative amino acid sequence analysis of known transcription factors from other systems [30], yeast one-hybrid assays [31], or deletion or replacement analysis of consensus motifs in the promoter region [31, 32]. Among the earlier approaches, comparative amino acid sequence analysis against the TATA-box-binding protein of Acanthamoeba castellanii identified the Entamoeba TATA-box-binding protein (EhTBP) [30]. EhTBP is less selective than the TBPs of higher eukaryotes and binds a wide variety of E. histolytica TATA-box sequences [11, 19]. Later on, genome sequencing, gene expression profiling, and proteomics approaches advanced the study of transcriptional networks and helped identify novel transcription factors [12, 20, 21, 33,34,35,36]. Subsequently, the sequencing and annotation of the Entamoeba genome led to the identification of two more amoebic TATA-binding proteins (TBPs) [20]. The TBP and TRF1 transcription factors of E. histolytica are GAAC-box-binding proteins that show differential expression under stress and during the interaction of Entamoeba with mammalian cells. However, the biological role of the two new TBPs is yet to be determined.

2.2 EhCudA

The transcription factor EhCudA was identified by a comparative in silico approach using Dictyostelium CudA as a query [34]. In Dictyostelium, this protein is necessary for pre-spore-specific gene expression, and it shows significant homology to the Entamoeba protein [34]. Yamada et al. expressed CudA in bacteria and used the recombinant protein to identify the DNA-binding motif AGAATTTTCT, which interacts specifically with CudA in vitro; however, the function of Entamoeba CudA is yet to be characterized [34].

2.3 EhEBP1 and EhEBP2

Two enhancer-binding proteins (EhEBP1 and EhEBP2) that specifically bind to the URE4-binding domain were discovered by using nuclear extracts from amoebae and DNA affinity chromatography followed by mass spectrometry [31]. Some unique features were reported for these TFs: both EhEBP1 and EhEBP2 contain an RNA recognition motif (RRM); however, no recognizable DNA-binding domain was identified.

2.4 EhPC4

Analysis of genome-wide microarray data from virulent trophozoites isolated from hamster liver abscesses identified a transcription factor, EhPC4 (E. histolytica positive cofactor 4), which is significantly upregulated during infection [37]. The authors reported a potential role for EhPC4 in liver abscess formation through control of the expression of vital genes involved in cytoskeleton dynamics, cell migration, and invasion [37]. EhPC4 also plays an important role in regulating DNA replication and genome stability [38].

2.5 Ehp53

A p53-like E. histolytica protein (Ehp53) was identified that binds to the human p53-binding consensus DNA sequence, as confirmed with human p53 antibodies [39]. It has been reported that a monoclonal antibody against human p53 recognizes recombinant Ehp53, suggesting that this protein may be evolutionarily conserved in Entamoeba [39]. In mammalian cells, p53 takes part in several cellular processes such as cell cycle regulation, DNA repair, prevention of uncontrolled cell division, and apoptosis; however, the function of Ehp53 is yet to be characterized.

2.6 HMGB1

The TF HMGB1 (high mobility group box protein 1) was identified by analysis of genome-wide transcriptome data obtained during Entamoeba colonization and invasion of the intestine [40]. HMGB proteins can bind a diverse range of DNA structures in a conformation-dependent way, including stem-loops, palindromes, four-way junctions, B-Z junctions, and even single-stranded or cruciform DNA [41]. This protein contains one or more units of the HMG box DNA-binding motif, and it has been observed that it can increase DNA binding in a sequence-specific manner. It is involved in many important cellular functions, e.g., transcription, recombination, and repair. A recent study showed that contact between Entamoeba and macrophages induced the secretion of HMGB1, which functions as a pro-inflammatory cytokine and can also act as a chemoattractant during Entamoeba infection [42].

3 TF in Development and Stress Response

Development requires the careful orchestration of many biological events in order to generate an entire multicellular organism, and in unicellular organisms this orchestration is equally important throughout the different stages of development. Many transcription factors (TFs) involved in development are evolutionarily conserved from yeast to humans. For example, four TF families play a determining role and have been characterized extensively both in embryonic development and in cancer: (1) GATA, (2) the high mobility group box (HMG), (3) paired box (PAX), and (4) basic helix-loop-helix (bHLH) [43,44,45]. Living organisms constantly face diverse types of physiological and environmental stress. To survive the detrimental consequences of stress or to protect against further exposure to the same or other forms of stress, cells have evolved rapid molecular responses to repair the damage. TFs play an important role by upregulating or downregulating sets of genes, which makes the organism more resistant to adverse conditions, and many stress-controlled transcription factors have been discovered and characterized in different systems [46, 47].

Transcription factor activation is a complicated process that may involve numerous signal transduction pathways, including several kinases, e.g., PKA, MAPKs, JAKs, and PKCs, which are activated by cell-surface receptors [48]. Major TF families, including WRKY, MYB, NAC, and AP2/ERF, are important regulators of diverse genes associated with various stressors. WRKY, one of the most well-studied plant TF families, regulates a wide range of developmental, physiological, and metabolic activities and has been recognized as a major group of transcription factors in many plant species. WRKY proteins function as activators, repressors, or corepressors of essential pathways, such as the generation of alkaloids, terpenes, and other specialized metabolites, and have been shown to be significant in the activation of diverse immune response pathways, making them important in biotic stress. WRKY transcription factors were found to be useful in relieving stress produced by biotic or abiotic agents via self-regulation or hormone-mediated signal transduction pathways. In Entamoeba, a few transcription factors have been identified that control the expression of genes pertinent to several important facets of Entamoeba biology, including stage conversion and oxidative stress (Table 1 and Fig. 1) [35, 49].

3.1 HRM-BP

A novel H2O2 stress-responsive motif (HRM) was identified by in silico analysis of the promoter sequences of genes that are upregulated under H2O2 stress, and the transcription factor HRM-binding protein (HRM-BP) was identified by biochemical analysis and mass spectrometry [35]. The interaction of HRM-BP with the HRM motif is very specific, and altering HRM-BP expression in Entamoeba, either by silencing or by overexpression, changed the basal expression of H2O2-responsive genes [35].

3.2 EhMyb and EhMyb-dr

A set of 32 Myb domain-containing proteins was identified in E. histolytica by using human c-Myb protein sequences as the query [21]. Electrophoretic mobility shift assays (EMSA) using recombinant Entamoeba Myb10 (family I) showed that this protein could bind the canonical Myb-binding motif (TAACGG) reported in other eukaryotes. In Entamoeba, the EhHSP70 gene promoter contains a Myb DNA-binding motif, suggesting an important role for Myb in heat shock gene expression during the stress response.

Studies in recent years have shown that transcriptional control makes an important contribution to stage conversion in Entamoeba, and three transcription factors have been identified. The Myb transcription factor EhMyb-dr is a SHAQKY family Myb protein that binds to the hexanucleotide motif CCCCCC; upregulating this protein in E. histolytica trophozoites results in a transcriptional profile that closely resembles the transcriptome of amoebic encystation [50]. The interaction between EhMyb-dr and this DNA sequence was confirmed by EMSA as well as by chromatin immunoprecipitation (ChIP) analysis, and it is evident that EhMyb-dr regulates a set of cyst-specific genes [50].

3.3 ERM-BP

An encystation regulatory motif (ERM), the heptanucleotide sequence CAACAAA, was identified in the promoters of 131 cyst-specific genes in E. invadens, which is used as a model system for developmental studies. Electrophoretic mobility shift assays showed specific binding by Entamoeba cyst protein only, and not by trophozoite protein, suggesting that the protein that binds to ERM may be expressed specifically in cysts. ERM-binding protein (ERM-BP) was identified by electrophoretic mobility shift assay followed by mass spectrometry. The metabolic cofactor NAD+ positively regulates the binding of recombinant ERM-BP to ERM, and downregulation of ERM-BP significantly decreased encystation efficiency and produced ghost-like abnormal cysts with defective cyst walls, suggesting that ERM-BP plays an important role in encystation [51]. ERM-BP is conserved among other Entamoeba species, and upregulating ERM-BP in E. histolytica (EHI_146360) produced quadrinucleate cyst-like structures and made the parasite more resistant to heat stress, supporting the idea that the heat stress response and encystation may overlap and share common signaling pathways [52, 53].

3.4 NF-Y (Nuclear Factor Complex)

Nuclear factor complex (NF-Y) is made up of three subunits, namely, NF-YA, NF-YB, and NF-YC, that bind very specifically to the pentanucleotide motif CCAAT, and this TF complex is conserved throughout evolution [54]. NF-Y plays crucial roles in higher eukaryotes, controlling many cellular processes (e.g., cell cycle regulation, development, response to growth, stress, DNA damage, and apoptosis) by regulating the expression of genes whose promoters contain the CCAAT motif [54].

In E. invadens, the expression of NF-YA is constitutive, whereas NF-YB and NF-YC are expressed during encystation. Silencing of the NF-YC subunit in Entamoeba significantly reduced the DNA-binding ability of the NF-Y complex and also reduced encystation efficiency [54].

4 Transcription Factors in Pathogenicity, Virulence, Drug Resistance, and Phagocytosis

Transcription factors (TFs) are central components that play a critical role in gene expression. A small change in TF expression or specificity can alter the entire gene expression program. During infection, pathogenic organisms upregulate or downregulate many genes under the control of their TFs, which helps them adapt to the host- or tissue-specific environment and to various physiological changes and helps activate virulence and pathogenicity. A major aim of identifying TFs in a pathogenic organism is to block its virulence factors. To develop in-depth knowledge of host-pathogen interactions, it is necessary to identify the signal exchange mechanisms involved, which will help to identify virulence factors and predict the outcome of infection. Until recently, very little was known about the transcriptional switches that help cells adjust in response to immune signals and infection. In 2016, Gray et al. reported that the Fcγ receptor helps the transcription factor TFEB enhance lysosome-based degradation and killing of bacteria [55]. This work revealed that IgG immune complexes instruct macrophages to become "super killers" by upregulating lysosome activity through a transcriptional circuit. It is evident that in Entamoeba, pathogenesis, virulence, and development are controlled transcriptionally.

4.1 URE3-BP

The upstream regulatory element 3 (URE3) DNA sequence motif TATTCTATT was first discovered in the promoter region of hgl5, the gene encoding the heavy chain subunit of the lectin in E. histolytica, and was later also found in the promoter of the ferredoxin 1 (fdx1) gene [6, 40, 49, 56, 57]. The URE3-binding protein (URE3-BP) was identified through a yeast one-hybrid screen using URE3 as bait [30]. Mutation of the URE3 motif in the hgl5 promoter was reported to increase promoter activity, whereas mutation of the URE3 motif in the fdx1 promoter decreased reporter gene activity twofold, suggesting that URE3 can act as either a negative or a positive regulator of gene expression [57]. URE3-BP contains two calcium-binding motifs (EF hands) and detaches from URE3 DNA in the presence of higher levels of calcium, suggesting that calcium acts as a negative regulator [58, 59]. URE3-BP is thus regulated by calcium and controls the expression of two virulence genes in Entamoeba, those encoding the Gal/GalNAc lectin and ferredoxin. It has also been reported that upregulation of URE3-BP changes trophozoite morphology and boosts parasite invasion of organs such as the colon and liver, suggesting that URE3-BP plays a salient role in Entamoeba virulence [60].

4.2 EhGATA

The GATA transcription factors are conserved members of the zinc finger DNA-binding domain (ZFBD) family and recognize the consensus DNA sequence (A/T)GATA(A/G). TFs of this ZFBD superfamily modulate a wide range of cellular functions.
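
A degenerate consensus such as (A/T)GATA(A/G) translates directly into a regular expression. The Python sketch below, offered only as an illustration, searches both strands of a sequence for this consensus; the function names and the toy sequence are assumptions, and a plain consensus match says nothing about whether a site is actually bound in vivo.

```python
import re

# (A/T)GATA(A/G) as a regular expression; the lookahead lets overlapping
# sites be reported as well.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return (strand, 0-based forward-strand start, site) for both strands."""
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in GATA_SITE.finditer(seq)]
    for m in GATA_SITE.finditer(reverse_complement(seq)):
        site = m.group(1)
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - m.start() - len(site)
        hits.append(("-", start, site))
    return hits


# Toy sequence for illustration only: one site on each strand.
print(find_gata_sites("CCAGATAACCTTATCTGG"))
```

Scanning the reverse complement matters here because the consensus is not palindromic, so a site on the opposite strand appears as (C/T)TATC(A/T) in the forward sequence.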

In 2020, Huerta et al. used bioinformatic analysis to report the existence of a single gata gene in E. histolytica (Ehgata), whose GATA domain showed 80% similarity to that of the human GATA protein [61]. Ehgata codes for a noncanonical EhGATA transcription factor that contains an AT-hook motif and only one zinc finger DNA-binding domain. Bioinformatic prediction showed the presence of GATA-binding sequences in over 1600 gene promoters in the Entamoeba genome [61]. Electrophoretic mobility shift assays with bacterially expressed and purified EhGATA protein, as well as with trophozoite nuclear extracts, showed binding to the consensus GATA DNA sequence. Moreover, Huerta et al. showed that EhGATA specifically binds to the promoters of the Ehadh and Ehvps32 genes in vivo and controls EhADH and EhVps32 gene expression in the course of phagocytosis. Additionally, overexpression of EhGATA in trophozoites produced significant changes in morphology, cell proliferation, adherence efficiency, and rate of phagocytosis. These findings suggest that EhGATA is capable of binding DNA and fine-tuning the expression of several genes involved in cell proliferation, adhesion to surfaces, and phagocytosis [61].

4.3 EhHSTF

When bacteria or other organisms are exposed to a drug or antibiotic, they alter their cellular mechanisms to survive. Continuous and excessive exposure to any drug can lead to the rise of a drug-resistant population of cells. For decades, the main drug of choice against amebiasis has been metronidazole, but the emergence of drug resistance (DR) in most pathogens raises the alarming prospect that DR will become a major public health problem worldwide. It has been reported that silencing of the methionine γ-lyase (EhMGL) gene resulted in resistance to trifluoromethionine, revealing a novel mechanism of drug resistance in E. histolytica.

In Entamoeba, it has been observed that emetine stress induces the expression of the multidrug resistance gene EhPgp5 [62]. Bello et al. showed that the transcription factor EhHSTF7 recognizes the 5′-GAA-3′ motif within the heat shock element of the EhPgp5 gene and is involved in its transcriptional activation [62].

5 Summary and Conclusions

The most constructive way to understand the functions of different genes in an organism is by genetic manipulation, and most genetic analyses rely on altering expression at the transcript level. In Entamoeba, gene expression and its fine-tuning by transcriptional controls are still not well understood, and the majority of genes or proteins are hypothetical. Recent advances in RNA-seq analysis, proteome analysis, and gene editing by CRISPR/Cas9 open the way to analyzing expression patterns during different stages of development and under stress conditions, as well as differential gene expression between pathogenic and nonpathogenic strains. For example, 57 genes are reported to be upregulated in response to H2O2 exposure during oxidative stress, and the expression of these genes is controlled by the transcription factor HRM-BP. The Myb domain protein EhMyb-dr binds to the CCCCCC motif and upregulates a set of genes during encystation, and another transcription factor, ERM-BP, binds to the CAACAAA motif, which was identified in the promoters of 131 upregulated cyst-specific genes. TFs that bind to cis-regulatory sequences can regulate transcription either positively or negatively. In Entamoeba, the URE3 motif and its binding protein URE3-BP are reported to act in both ways: mutation of URE3 increases activity of the lectin heavy chain (hgl5) promoter, indicating negative regulation, but decreases activity of the ferredoxin 1 promoter, indicating positive regulation. However, only a few TFs have been characterized in this parasite so far, and there is a clear need to extend this line of research for a better understanding of the many unexplored areas of amoebic biology.