Introduction

Geminiviruses are a group of non-enveloped plant viruses with small circular single-stranded DNA (ssDNA) genomes of 2.5-5.2 kb. The capsid is composed of coat protein subunits that form twinned incomplete icosahedral particles, giving rise to the name geminivirus [1]. Geminiviruses infect economically important crops worldwide, causing severe agricultural losses [2, 3]. They are transmitted by insect vectors such as whiteflies, treehoppers, leafhoppers, and aphids [4]. The family Geminiviridae is divided into 14 genera, namely Becurtovirus, Begomovirus, Capulavirus, Citlodavirus, Curtovirus, Eragrovirus, Grablovirus, Maldovirus, Mastrevirus, Mulcrilevirus, Opunvirus, Topilevirus, Topocuvirus, and Turncurtovirus, based on the insect vectors, genome organization, and the host range of their members [4,5,6]. At present, the family Geminiviridae includes more than 500 species [4]. The members of all of the genera of the family Geminiviridae except Begomovirus have a single genomic component, while the genus Begomovirus includes viruses with either a single genomic component, referred to as monopartite viruses, or two genomic components (DNA-A and DNA-B), referred to as bipartite viruses (Fig. 1) [2]. The genome of a monopartite begomovirus is equivalent to DNA-A of a bipartite begomovirus. Begomoviruses containing a monopartite genome are widespread in Old World countries in Africa, Asia, Australia, and Europe, whereas bipartite begomoviruses are mostly found in the New World, with some exceptions [7]. The genomic components DNA-A and DNA-B contain ORFs in a bidirectional orientation. The origin of replication (ori), a stem-loop region containing a nonanucleotide sequence (5’-TAATATTAC-3’), and the bidirectional promoter are located in the intergenic regions (IRs) of DNA-A and DNA-B [8]. The DNA-A component of the genome contains seven ORFs, two in the viral sense and five in the complementary sense.
Genes oriented in the complementary sense (anticlockwise), namely AC1, AC2, AC3, AC4, and AC5, are responsible for replication and expression of the other viral genes. The AV1 (coding for the coat protein, CP) and AV2 genes are in the viral sense (clockwise). The DNA-B component has two genes: BC1 (in the complementary sense, coding for a viral movement protein) and BV1 (in the viral sense, coding for a nuclear shuttle protein). Here, we review our current understanding of the role geminivirus proteins play in pathogenesis and how their interaction with host factors subverts the host defense. An overview of the functions of the ORFs is presented in Fig. 2.
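The bidirectional ORF layout described above can be illustrated with a minimal sketch that scans both strands of a circular sequence for ATG-to-stop ORFs. This is not the ORF Finder algorithm; the function names, the `min_codons` threshold, and the toy sequence are assumptions made for illustration only, and coordinates on the complementary strand are reported relative to the reverse complement.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 30) -> list[tuple[str, int, int]]:
    """Return (strand, start, length_nt) for ATG...stop ORFs on a circular genome.

    start is 0-based on the scanned strand; length_nt includes the stop codon.
    min_codons counts sense codons (start codon included, stop excluded).
    When several ATGs share one stop codon, only the longest ORF is kept.
    """
    seq = genome.upper()
    n = len(seq)
    best: dict[tuple[str, int], tuple[int, int]] = {}
    for strand, s in (("viral", seq), ("complementary", reverse_complement(seq))):
        doubled = s + s  # lets an ORF run across the linearisation point
        for start in range(n):
            if doubled[start:start + 3] != "ATG":
                continue
            # Walk codon by codon, never further than one full turn of the circle.
            for pos in range(start + 3, start + n - 2, 3):
                if doubled[pos:pos + 3] in STOPS:
                    length = pos + 3 - start
                    key = (strand, (pos + 3) % n)
                    if length // 3 - 1 >= min_codons and length > best.get(key, (0, 0))[1]:
                        best[key] = (start, length)
                    break
    return sorted((strand, start, length) for (strand, _), (start, length) in best.items())


# Toy circular sequence (hypothetical, not a real geminivirus genome).
toy_genome = "ATG" + "GCT" * 5 + "TAA" + "CCCC"
print(find_orfs(toy_genome, min_codons=5))
```

Doubling the sequence is a simple way to respect circularity; a real annotation would additionally use the appropriate genetic code and curated ORF boundaries.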

Fig. 1 

Genome organization of ssDNA plant viruses belonging to the family Geminiviridae. The yellow circle at the top represents the origin of replication. All the ORFs are labeled and color-coded. AC1/C1, replication-associated protein (Rep); AC2/C2, transcriptional activator protein (TrAP); AC3/C3, replication enhancer protein (REn); AV1/V1, coat protein; AV2/V2, pre-coat protein; BC1, movement protein (MP); BV1, nuclear shuttle protein (NSP). CR, SCR, and SIR refer to the common region, satellite conserved region, and short intergenic region, respectively. Recently, many small ORFs in the monopartite virus TYLCV and βV1 in a betasatellite were identified. ORFs of members of five recently created genera were identified using the ORF Finder programme (https://www.ncbi.nlm.nih.gov/orffinder/) and are appropriately positioned.

Fig. 2

Functional summary of viral proteins associated with the geminivirus disease complexes. A pictorial representation of the components of geminivirus disease complexes and the effect of their encoded proteins on the host is shown. The core genomic components of geminivirus (helper virus or DNA-A, DNA-B) and the well-characterized satellite molecules DNA-α and DNA-β are indicated in different colors at the center. The important roles played by each viral protein and the corresponding genetic factors involved in host manipulation are indicated at the periphery with colors corresponding to their genetic origin.

DNA-A

AV1/V1

The AV1/V1 ORF encodes the only structural protein, the coat protein (CP), which is involved in the assembly and packaging of the geminivirus genome [9,10,11]. Although the CP is not required for geminivirus replication, deletion of the CP gene results in a reduction in viral ssDNA accumulation [12, 13]. In some instances, the deletion has also been observed to increase the accumulation of dsDNA [13]. The decrease in ssDNA could be related to the role of the CP in packaging ssDNA: without the CP, the ssDNA is exposed to nucleases and degraded. The increase in dsDNA hints at a role for the CP in the conversion of dsDNA to ssDNA [12, 13].

Since the monopartite viruses lack DNA-B, which encodes proteins that facilitate nucleocytoplasmic shuttling and inter- and intracellular movement for systemic spread, the other ORFs complement the movement function of DNA-B in monopartite viruses. The CP serves as a nucleocytoplasmic shuttling protein in the case of mastreviruses [14, 15] and monopartite begomoviruses [16, 17]. Interestingly, the property of nucleocytoplasmic shuttling is conserved in bipartite begomoviruses as well [18, 19]. The CP is also involved in the nucleocytoplasmic shuttling of the viral DNA with the help of the nuclear localization signals (NLSs) present in its N-terminal, C-terminal, and central regions [10, 15, 20]. Analysis of the CP from tomato leaf curl Java virus (ToLCJV) revealed that arginine-rich stretches of amino acids at positions 16 to 20 (KVRRR) and 52 to 55 (RKPR) in the N-terminal region are essential for NLS activity and that a hydrophobic stretch in the C-terminal region from residues 245 to 250 (LKIRIY) is important for nuclear export signal (NES) activity [21]. Furthermore, CP interacts with the ssDNA and dsDNA forms of the geminivirus genome via its N-terminal domain [22]. The nucleocytoplasmic shuttling of the viral DNA is also facilitated by the interaction of CP with host proteins involved in nucleocytoplasmic trafficking, such as importin α and karyopherin α1 [16, 23]. In addition to the nuclear shuttling of the viral DNA, the CP of geminiviruses also helps in the systemic movement of viral DNA [12, 13, 15]. It is noteworthy that monopartite viruses require CP for systemic movement, while bipartite begomoviruses can exhibit systemic movement in the absence of the CP because this function is carried out by proteins encoded by DNA-B. However, this CP-independent movement depends on the host in some cases.
For example, in the case of tomato golden mosaic virus (TGMV), the virus without CP can still exhibit systemic movement in Nicotiana benthamiana, but not in Nicotiana tabacum or Datura stramonium, which suggests the possible involvement of host factors in the movement of the virus [24]. Studies have shown, however, that the coat proteins of bipartite begomoviruses are not required for cell-cell movement but can enhance the efficiency of cell-cell movement and systemic spread [18, 24].
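The motif coordinates reported for the ToLCJV CP can be checked against a protein sequence with a few lines of code. The sketch below is illustrative only: the motif table is taken from the positions quoted above, whereas the placeholder sequence, its length of 257 residues, the filler residue, and all function names are assumptions and do not represent the real ToLCJV CP.

```python
# Motif positions (1-based, inclusive) as quoted in the text for the ToLCJV CP.
REPORTED_MOTIFS = {
    "NLS-1": (16, 20, "KVRRR"),
    "NLS-2": (52, 55, "RKPR"),
    "NES": (245, 250, "LKIRIY"),
}


def motif_at(protein: str, start: int, end: int, motif: str) -> bool:
    """True if residues start..end (1-based, inclusive) spell out the motif."""
    return protein[start - 1:end] == motif


def check_motifs(protein: str) -> dict[str, bool]:
    """Report, for each tabulated motif, whether it sits at its expected position."""
    return {name: motif_at(protein, s, e, m) for name, (s, e, m) in REPORTED_MOTIFS.items()}


def build_placeholder_cp(length: int = 257, filler: str = "G") -> str:
    """Build a synthetic sequence (NOT the real CP) carrying the motifs in place."""
    residues = [filler] * length
    for start, end, motif in REPORTED_MOTIFS.values():
        residues[start - 1:end] = motif  # same-length slice assignment
    return "".join(residues)


placeholder = build_placeholder_cp()
print(check_motifs(placeholder))

# Replacing the arginines of the first motif with alanines breaks only that motif.
mutant = placeholder[:15] + "KVAAA" + placeholder[20:]
print(check_motifs(mutant))
```

With a real CP sequence retrieved from a database, `check_motifs` could be applied directly; a mismatch would simply indicate that the isolate differs from the one analysed in the cited study.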

The CP is required for vector-mediated transmission of geminiviruses, and it is known to interact with proteins of the vectors and their endosymbionts (Table 1). Mutational studies have demonstrated the importance of the CP and identified amino acids at positions 129, 130, and 134 as essential for whitefly-mediated transmission and assembly of virion particles in the case of tomato yellow leaf curl Sardinia virus (TYLCSV) [25, 26]. Amino acids at positions 124, 149, and 174 of the abutilon mosaic virus (AbMV) CP are important for whitefly transmission, and mutations at these positions can enhance vector-mediated transmission [27]. Residues in the CP also determine the specificity of the virus for insect vectors, and altering these residues affects vector specificity. For example, a T147S substitution appears to be responsible for differences in whitefly transmissibility between the Asia-I and Asia-II-1 strains of squash leaf curl China virus (SLCCNV) [28]. Sometimes, viruses manipulate cellular events in the whitefly. For example, the CP of tomato yellow leaf curl virus (TYLCV) induces apoptosis in whiteflies. Interestingly, suppressing apoptosis reduces the viral titer in whiteflies, while its activation increases the viral titer. The mechanism by which this occurs is not known [29].

Table 1 Factors from the vector that interact with the geminiviral coat protein

AV2/V2

The AV2/V2 gene encodes a pre-coat protein (also known as the AV2 protein). The 5’ end of the AV2/V2 ORF overlaps with the 3’ end of the CP ORF. The AV2 protein of many geminiviruses acts as a pathogenicity determinant. In East African cassava mosaic Cameroon virus (EACMCV), pathogenicity depends on a conserved protein kinase C domain present in AV2 [40]. Mutation of the AV2 protein causes a reduction in the accumulation of ssDNA and dsDNA in plants, but not in protoplasts, indicating that mutation of AV2 impairs the movement of the virus. Furthermore, a localization study using an AV2-GFP fusion confirmed the role of the AV2 protein in geminivirus movement [13, 41]. In TYLCV, the V2 protein, together with exportin α, plays a role in the nuclear export of the coat protein (V1) and promotes the redistribution of V1 to the perinuclear region. Plants infected with TYLCV carrying a substitution mutation in V2 (V2C85S) that abolishes interaction with V1 exhibited delayed and mild symptoms compared to plants infected with wild-type virus [42]. The AV2 protein also inhibits the hypersensitive response (HR) by interacting with host factors. For example, TYLCV V2 inhibits HR by interacting with and inhibiting the activity of CYP1, a papain-like cysteine protease [43].

AV2 also acts as a suppressor of both transcriptional gene silencing (TGS) and post-transcriptional gene silencing (PTGS). TYLCV V2 suppresses PTGS by interacting with SlSGS3 (suppressor of gene silencing 3) from tomato, whose Arabidopsis thaliana homolog (AtSGS3) has already been reported to be involved in the RNA-silencing pathway [44, 45]. TYLCV V2 also suppresses TGS by interacting with histone deacetylase 6 (HDA6). In this case, TYLCV V2 prevents the recruitment of the DNA methyltransferase MET1 by HDA6, resulting in hypomethylation of the viral DNA and the successful establishment of infection [46]. The V2 protein of the same virus also disrupts the de novo methylation of viral DNA in Cajal bodies by interacting with AGO4 [47]. In another example, V2 from cotton leaf curl Multan virus (CLCuMuV) interacts with N. benthamiana AGO4 and suppresses RNA-directed DNA methylation (RdDM)-mediated TGS [48]. The V2 from tomato yellow leaf curl China virus (TYLCCNV) suppresses silencing by binding to the 21-nt siRNA duplex and the 24-nt single-stranded siRNA [49]. TGS mediated by hypermethylation of DNA in the viral promoter, on the other hand, has been implicated in the ‘recovery’ of the host plant from geminivirus infection. Tomato leaf curl New Delhi virus (ToLCNDV) AV2 blocks RDR1-mediated host recovery in tobacco plants, whereas tomato leaf curl Gujarat virus (ToLCGV) AV2 does not affect host recovery [50]. It would be interesting to study this recovery phenomenon in more detail.

AC1/C1

AC1/C1 encodes a replication-associated protein (Rep) that is highly conserved among geminiviruses. Rep is an early protein that is produced following entry into the host cell, and it plays an important role in the replication and transcription of the other viral genes. Rep contains three important domains: an N-terminal domain, a central domain, and a C-terminal domain. The N-terminal domain is essential for DNA binding and DNA nicking activity, the central domain, also referred to as the oligomerization domain, is involved in oligomerization of Rep, and the C-terminal domain is involved in ATPase activity and contains the Walker A and Walker B motifs essential for ATPase activity [51, 52].
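As an illustration of how the Walker A motif mentioned above can be located in a protein sequence, the following sketch searches for the widely used consensus G-x(4)-G-K-[S/T]. The consensus is the general P-loop pattern rather than one taken from the studies cited here, the peptide is synthetic and is not a real Rep sequence, and the function name is hypothetical. The Walker B motif is not covered.

```python
import re

# General Walker A (P-loop) consensus: G, any four residues, G, K, then S or T.
# A lookahead is used so that overlapping candidate matches are all reported.
WALKER_A = re.compile(r"(?=(G.{4}GK[ST]))")


def find_walker_a(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each Walker A-like hit."""
    return [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(protein.upper())]


# Synthetic peptide used only to exercise the pattern (not a real Rep sequence).
peptide = "MAAAGPSRTGKTLLE"
print(find_walker_a(peptide))

# Substituting the motif lysine with alanine removes the match.
lysine_mutant = peptide.replace("GKT", "GAT")
print(find_walker_a(lysine_mutant))
```

A pattern match of this kind only flags candidate motifs; whether a given site binds ATP has to be established experimentally.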

Role of Rep in replication initiation, cleavage, and ligation

Rep interacts with many host factors, including retinoblastoma-related protein (RBR), a regulator of cell division; proliferating cell nuclear antigen (PCNA), a DNA clamp and processivity factor of DNA polymerase; replication protein A (RPA), a single-stranded nucleic acid binding protein; replication factor C (RFC), a DNA clamp loader; radiation-activated DNA repair proteins (RAD) such as RAD51 and RAD54, which are indispensable for recombination; and minichromosome maintenance protein 2 (MCM2), a component of the pre-replication complex, all of which help in viral replication [53,54,55]. To initiate replication, Rep binds at two sites in the common region (CR) of the viral genome in a sequence-specific manner. One site lies in the iteron sequences, and the other lies in the nonanucleotide sequence. In the case of TGMV and squash leaf curl virus (SqLCV), the CR is a 200-nt region at the origin of replication (ori) that contains a 60-nt stem-loop structure that is essential for replication [56]. Iterons are iterative elements in the CR found between the transcription start site and the TATA box of AC1. They can be 8-12 nucleotides in length and can vary among different geminiviruses [57]. In the case of TGMV, Rep binds to a 13-bp region containing 5-bp repeat motifs separated by a 3-bp spacer to initiate replication. Interestingly, the repeat at the 3’ end is necessary for replication, while the repeat at the 5’ end might enhance the replication efficiency of geminiviruses [58]. Similar iterons were also identified in the TYLCV genome, and direct interaction of Rep with those regions has been demonstrated [59]. The Rep protein also binds to a hairpin region at the ori site to make a nick between the seventh and eighth residues of the conserved nonanucleotide sequence (5’-TAATATTAC-3’) in the sense strand to initiate replication.
After completion of one round of rolling-circle replication (RCR), Rep cleaves a phosphodiester bond, leading to the ligation of the ends of the newly formed nascent DNA. The N-terminus of TYLCV Rep catalyzes the cleavage and ligation of the viral DNA in vitro [60]. Site-directed mutation in the oligomerization domain of TGMV Rep resulted in reduced levels of viral replication. Furthermore, a mutation in the oligomerization domain of TGMV Rep reduced viral replication tenfold compared to the wild type. This indicates that, in addition to the N-terminal DNA binding domain, the central oligomerization domain of Rep might play a role in viral DNA replication [61]. Geminiviruses replicate inside the nucleus, and Rep contains a nuclear localization signal (NLS). Removal of residues 1-120 of the Rep protein of tomato chlorotic mottle virus (TCMV) results in reduced nuclear accumulation of Rep. Similarly, mutation of the N-terminal residues of African cassava mosaic virus (ACMV) Rep significantly reduces its nuclear import. A recent study has suggested that lysine residues K67, K77, and K101 of TYLCV Rep are important for its nuclear localization [62]. These lysine residues have been shown to interact with E2 SUMO-conjugating enzyme 1 (SCE1), which mediates the sumoylation of lysine residues of the host cell cycle factors PCNA and RBR [63]. This study provides insight into how Rep controls the host cell machinery to facilitate viral replication.
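The position of the Rep nick site relative to the nonanucleotide can be made concrete with a short sketch that locates 5’-TAATATTAC-3’ in a circular sequence and reports the position between its seventh and eighth residues. The function name, the 0-based coordinate convention, and the toy sequences are assumptions for illustration; real genomes would be read from a sequence file.

```python
NONANUCLEOTIDE = "TAATATTAC"  # conserved motif quoted in the text


def find_nick_sites(genome: str, motif: str = NONANUCLEOTIDE,
                    nick_after: int = 7) -> list[tuple[int, int]]:
    """Return (motif_start, nick_position) pairs for a circular sequence.

    Coordinates are 0-based. nick_position is the index of the first base
    downstream of the nick, i.e. the nick lies between motif residues
    nick_after and nick_after + 1 (7 and 8 by default).
    """
    seq = genome.upper()
    n = len(seq)
    # Append the first len(motif) - 1 bases so that a motif spanning the
    # linearisation point of the circular genome is still found.
    extended = seq + seq[:len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1 and start < n:
        hits.append((start, (start + nick_after) % n))
        start = extended.find(motif, start + 1)
    return hits


# Toy sequences (hypothetical): one with the motif in the middle and one in
# which the motif spans the point where the circle was linearised.
linear_like = "GGGTAATATTACCC"
wrapped = "ATTACGGCCGGCTAAT"
print(find_nick_sites(linear_like))
print(find_nick_sites(wrapped))
```

Only the sense strand is searched here, in line with the description of the nick being made in the sense strand.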

The C-terminal portion of Rep possesses DNA helicase and ATPase activity, and based on domain homology, Rep has been classified as a member of superfamily 3 (SF3). However, Rep differs from other SF3 helicases in lacking an arginine finger domain and in its oligomerization properties [64]. Geminiviral Rep proteins can oligomerize to form complexes ranging from hexamers to dodecamers [65]. ToLCGV Rep oligomers have been shown to be stable even at high salt and low protein concentrations. The mutation K227A at the C-terminus of ToLCGV Rep abolishes its ability to bind to ATP and ssDNA, which is a further indication of the structural and functional similarity of geminiviral Rep proteins to SF3 helicases [66]. A recent study has identified conserved amino acids in the B’ motif in the C-terminus of ToLCNDV Rep that are necessary for replication of ToLCNDV. Mutation of these conserved residues negatively impacted the replication of the virus in planta. Interestingly, the mutations affected the helicase activity of Rep without affecting other activities associated with the C-terminal region of the protein, such as ATPase activity and ssDNA binding activity, stressing the significance of helicase activity in geminiviral DNA replication [67].

Role of Rep in geminivirus transcription

Rep is a multifunctional protein that regulates the transcription of Rep and BC1 in addition to its principal role in replication [68]. In TGMV DNA-A, the regulatory Rep-binding site essential for replication initiation is closely associated with a sequence that is important for transcription of Rep itself. The binding of Rep to this site negatively regulates Rep transcription, which could act as a switch from the expression of early genes to the activation of late genes involved in cell-to-cell movement. Although the BC1 promoter of TGMV DNA-B shares a homologous sequence, BC1 transcription remains unaffected [68]. In contrast, Rep represses the active transcription of both Rep and BC1 in ACMV. However, the phenomenon of Rep autoregulation is not observed in members of the genus Curtovirus such as beet curly top virus (BCTV) and beet severe curly top virus (BSCTV) [69]. Interestingly, Rep enhances coat protein expression in the mastrevirus wheat dwarf virus (WDV) [70]. The binding site for autoregulation of the Rep transcript is in a conserved iteron that lies between the transcription start site and the TATA box [71]. Interestingly, the CaMV 35S promoter, when it contains the sequence of the Rep binding site, is also repressed by Rep [72]. It has been observed that the functions of Rep in geminiviral replication and in transcript autoregulation are independent of each other [72]. In the case of TYLCSV C1, a highly conserved RGG motif at positions 124-126 in the N-terminal region has been reported to be associated with autoregulation [73]. The significance of autoregulation can be attributed to the expression of the AC2 and AC3 genes, since their transcription start site lies within the coding region of AC1 [69]. It is important to note that AC2 is vital for suppressing host defenses and for expression of late genes, and its role will be discussed further in the following sections.

Role of Rep in stimulation of viral transcription

Chilli leaf curl virus (ChiLCV) DNA forms a minichromosome-like structure by interacting with histone proteins, and ChiLCV Rep hijacks the host ubiquitin machinery to enhance viral gene transcription [74]. ChiLCV Rep interacts with ubiquitin-conjugating enzyme 2 (NbUBC2) and histone monoubiquitination 1 (NbHUB1) in the nucleus of the host cell [74]. Furthermore, ChiLCV Rep re-localizes NbUBC2 from the cytoplasm to the nucleoplasm. This results in an increase in monoubiquitination of histone 2B (H2B) and trimethylation of histone 3 at lysine 4 (H3K4me3) and subsequently stimulates geminivirus transcription [74].

Role of Rep in pathogenesis

Rep also acts as a silencing suppressor. Rep represses the expression of plant maintenance DNA methyltransferases such as methyltransferase 1 (MET1) and chromomethylase 3 (CMT3), thereby interfering with the plant methylation cycle and reducing CG methylation of both the viral genome and host-defense-related genes [75]. Rep is also a target of the plant immune system: ATG8h, which is important for autophagy, interacts with tomato leaf curl Yunnan virus (TLCYnV) C1 and translocates it from the nucleus to the cytoplasm via exportin 1 to induce autophagy [76]. ChiLCV Rep can also aid in pathogenesis by relocalizing phosphatidylinositol 4-kinase (PI4K), a positive regulator of pathogenesis, into the nucleus [77]. In a recent study, a 7-amino-acid stretch was identified at the C-terminus of Rep of Sri Lankan cassava mosaic virus-Columbia (SLCMV-Col) that is essential for the accumulation of Rep and is a determinant of the higher virulence of SLCMV-Col when compared to the weaker SLCMV-HN7 strain. Interestingly, the same 7-amino-acid stretch also enhanced the triggering of salicylic acid signaling against SLCMV [78].

AC2/C2

AC2/C2 as transactivator

AC2/C2 encodes a protein referred to as transcriptional activator protein (TrAP), which acts as a central factor in the viral life cycle. TrAP regulates the promoter activity of the viral genes AV1, BV1, and BC1 [79,80,81]. It also interacts with several proteins of geminiviruses, including C3, C4, V2, and βC1 [82]. TrAP has an N-terminal region containing an NLS, a central DNA-binding domain with a zinc finger, and a C-terminal transactivation domain. The transactivation activity of AC2 has been mapped to 15 amino acids at its C-terminal end [83]. AC2 and C2 are examples of position homologs in geminiviruses, because C2, unlike AC2, lacks a transactivation domain and transactivation activity [84]. Computational analysis of 124 bipartite and 463 monopartite begomoviral AC2/C2 proteins has suggested that they have a C-terminal α-helix, like many transcriptional activator proteins (acidic activation domain) [85]. Deletion of TGMV AC2 resulted in reduced coat protein expression, indicating that AC2 is involved in the expression of the coat protein (AV1) [80, 86]. In this case, TrAP activates the TGMV CP promoter by interacting with the transcription factor PEAPOD [87]. In the case of mungbean yellow mosaic India virus (MYMIV), AC2 and AC1 synergistically activate the promoter in DNA-A, driving the expression of CP [88]. C2 of bhendi yellow vein mosaic virus (BYVMV) has an NLS at its N-terminal end comprising amino acids 17-31. The NLS of C2 interacts with karyopherin α, which is involved in the shuttling of molecules between the nucleus and the cytoplasm [89].

AC2/C2 as a silencing suppressor

AC2/C2 from a begomovirus was the first geminiviral protein reported to have silencing suppressor activity [90]. Transgenic plants overexpressing AC2/C2 show reduced global cytosine methylation of the host genome [91]. BSCTV C2 reduces the methylation level of the promoters of the plant genes studied by reducing the amount of siRNA corresponding to the locus of the methylated promoter [92]. The AC2 proteins of different geminiviruses exhibit varying strengths of silencing suppression [93] and employ different mechanisms to suppress the host silencing machinery (Table 2).

Table 2 Diverse silencing suppression mechanisms employed by AC2/C2 and their significance

AC2/C2 as a pathogenicity determinant

AC2/C2 also plays a role in symptom development, suppression of HR, and inhibition of hormone-mediated defense. The 16-amino-acid hypervariable region in the C-terminus of C2 of tomato yellow leaf curl Sardinia virus (TYLCSV) induces HR [102]. Furthermore, the BYVMV C2 is essential for pathogenicity, whereas a virus with a termination codon in the C2 ORF caused less symptom development and grew to a lower titer. Symptoms were no longer observed when the virus was inoculated with a betasatellite [103]. The HR induced by proteins of geminiviruses is suppressed by the protein encoded by AC2. The HR induced by ToLCNDV NSP is inhibited by AC2 of the same virus [104]. Similarly, the HR induced by V2 of papaya leaf curl virus (PaLCuV) and cotton leaf curl Kokhran virus (CLCuKoV) is countered and suppressed by C2 of PaLCuV and cotton leaf curl Multan virus (CLCuMuV) [105].

Expression of ACMV AC2 in N. tabacum resulted in the upregulation of repressors of the jasmonic acid (JA) signaling pathway [106]. TYLCSV C2 affects JA signaling by interfering with the ubiquitin pathway. TYLCSV C2 interacts with COP9 signalosome 5 (CSN5) and alters the derubylation activity of the CSN complex, which affects downstream signaling pathways such as those of auxin, gibberellic acid (GA), ethylene (ET), salicylic acid (SA), and JA [107]. Transcriptome analysis and challenge inoculation studies in transgenic A. thaliana plants expressing TYLCSV C2 have also suggested that TYLCSV C2 mediates suppression of JA-mediated defense [108]. TYLCV C2, like TYLCSV C2, compromises JA-mediated defense by interacting with and altering the plant ubiquitin machinery, resulting in reduced degradation of jasmonate ZIM-domain 1 (JAZ1), a repressor of the JA signaling pathway, and facilitating vector infestation of the infected plants [109].

AC3/C3

AC3/C3 encodes a replication enhancer protein (REn). Deletion of AC3 results in a reduction in viral DNA replication and symptom development [86, 110]. This observation is supported by recent evidence of TYLCV C3 recruiting DNA polymerases α and δ in N. benthamiana [111]. Some genera of geminiviruses lack C3, and alternative mechanisms might be in place for such viruses. Furthermore, the AC3 from one virus can complement the function of AC3 in a related geminivirus [112]. REn interacts with geminivirus Rep and other replication-associated host factors to enhance viral DNA replication. In fact, REn enhances the ATPase activity of Rep in vitro. In addition, REn can form higher-order homo-oligomers through the hydrophobic domain in the central region of the protein [113, 114]. The hydrophobic region of REn is also essential for the interaction with PCNA, a DNA clamp essential for DNA replication. REn also interacts with RBR via polar residues in its N-terminal and C-terminal domains [114,115,116] and with the transcription factor NAC1 in tomato, leading to increased expression of NAC1. REn and NAC1 localize in the nucleus, and amino acids 1-70 of REn, comprising a putative α-helix, are essential for this interaction. Overexpression of NAC1 enhances viral replication [117]. AC3 also enhances gene silencing in some geminiviruses [118].

AC4/C4

This ORF also encodes a multifunctional protein, and proteins from different viruses exhibit differences in their subcellular localization and functions, suggesting an incomplete functional overlap [84].

Role in symptom development and as an oncogenic protein

Deletion of tomato leaf curl virus (ToLCV) C4 leads to reduced symptom development in different hosts, but it does not affect the viral titer [119]. Transgenic N. benthamiana plants expressing BCTV C4 show tissue distortion and development of enations containing a large clustered mass of unorganized cellular material [120]. This can be suppressed by overexpression of pep receptor 2 (PEPR2), a receptor kinase associated with the danger peptide signaling pathway [121]. ACMV C4 also induces abnormal development in transgenic A. thaliana plants, which could partly be attributed to decreased miRNA accumulation [122]. There are many possible mechanisms for the induction of unorganized cell growth, which are discussed below. BSCTV C4 is S-acylated in planta, and S-acylated C4 interacts with CLAVATA1 (CLV1), a receptor kinase that plays an important role in meristem maintenance. As a consequence, the expression level of WUSCHEL (WUS) is altered, resulting in abnormal development of the plant [123]. Cotyledons and hypocotyls from transgenic A. thaliana expressing BSCTV C4 under the control of an inducible promoter show extensive cell division with no clear demarcation of vascular bundles [124, 125]. These observations indicate that C4 acts as a viral “oncogene” inhibiting the DNA damage checkpoint, resulting in cell cycle progression while also stimulating DNA replication by preventing programmed cell death [126]. BSCTV C4 expression in A. thaliana results in elevated levels of cell-cycle-associated proteins such as the cyclins cyc1, cyc2, and cyc2b, the cyclin-dependent kinases (CDKs) cdc2a, cdc2b, and cdc25, and the CDK-activating kinases (CAKs) cak1, cak2, and cak3. Some cell cycle inhibitors were also found to be suppressed in transgenic plants expressing C4, and upon BSCTV infection as well [127]. Furthermore, C4 stabilizes CDKs.
BSCTV C4 induces the expression of a RING finger E3 ligase, which is known to interact with the cell cycle inhibitors ICK and KRP and promote their degradation [128]. Recently, it has been demonstrated that TLCYnV C4 impairs phosphorylation-dependent degradation of CycD1;1 by Shaggy-like kinase (SKη) by relocalizing SKη from the nucleus to the plasma membrane. Reduced phosphorylation-dependent degradation of CycD1;1 leads to the induction of cell division [129].

Role as silencing suppressor

ACMV AC4 synergistically suppresses host PTGS along with EACMCV AC2 and enhances EACMCV DNA accumulation by approximately eightfold [130]. ACMV AC4 interacts with single-stranded siRNA and miRNA, but not with any double-stranded RNA forms, indicating that ACMV AC4 blocks PTGS at the mature stage of small RNA biogenesis [122]. Specific localization of AC4 seems to be essential for its silencing suppressor activity. EACMCV AC4 is localized in the plasma membrane, perinucleus, and cytoplasm. It has a consensus N-myristoylation site, and mutations such as G2A and C3A abolish its plasma membrane localization as well as its silencing suppressor activity [131]. Furthermore, C4 interacts with BAM1 (barely any meristem 1), which is a receptor-like kinase (RLK) as well as a positive regulator of cell-to-cell movement of silencing, and it blocks systemic silencing in the host [132, 133]. MYMV AC4 binds to 21- to 25-nucleotide siRNAs and targets them to the plasma membrane via S-palmitoylation [134]. Together, these findings indicate that AC4/C4 exhibits silencing suppressor activity by blocking the cell-to-cell movement of siRNAs. ToLCV C4 interacts with SKη from tomato in the nucleus, and this interaction has been mapped to 12 amino acids in the C-terminal portion of C4. Deletion of these 12 amino acids abolished this interaction as well as the silencing suppressor activity of C4 [135]. C4 proteins from different viruses show different affinities for SKη, and interestingly, the symptom-induction property of C4 appears to correlate with its affinity for SKη [136]. Phosphorylation of C4 by SKη and its subsequent myristoylation are also essential for the nucleocytoplasmic shuttling and pathogenicity of C4 in the case of TLCYnV [137]. Like other viral suppressors of RNA silencing (VSRs) encoded by geminiviruses, CLCuMuV C4 interacts with and inhibits the activity of the essential methyl cycle enzyme S-adenosyl methionine synthetase (SAMS), resulting in suppression of both TGS and PTGS [138].
Using another strategy, TLCYnV C4 suppresses TGS by disrupting self-interaction of N. benthamiana domains rearranged methylase 2 (NbDRM2), a major methyltransferase, which catalyzes the addition of methyl groups on cytosine of viral DNA. As a result, NbDRM2 no longer binds to DNA, resulting in suppression of TGS. Plants infected with TLCYnV C4 harboring an S43A substitution mutation (which has impaired ability to disrupt NbDRM2 interaction) showed higher recovery [139].

Role in systemic movement

There is evidence that C4 complements the function of viral movement. For example, in the case of TYLCV, the systemic movement of TYLCV in tomato was completely blocked as a result of mutation in C4. However, in N. benthamiana, a mutation in TYLCV C4 did not impair the systemic movement of the virus but did result in a reduction in the severity of the symptoms. Similar observations were also made in the case of BSCTV, where a virus with two termination codons in the C4 ORF could replicate in protoplasts and leaf discs but could not establish symptoms in A. thaliana and N. benthamiana plants. The newly emerged leaves from the mutant-virus-infected plants showed no viral DNA accumulation, while the exogenous application of wild-type BSCTV C4 could complement the movement function of mutated BSCTV-C4 [140]. Interestingly, TYLCV C4 can provide the function of movement in the bipartite begomovirus ToLCNDV [141].

Other functions

In addition to the functions of AC4/C4 discussed above, recent reports have demonstrated its ability to suppress HR [142] and SA-mediated defense [143] as well as confer drought stress tolerance independent of abscisic acid in plants [144]. The mechanism of HR suppression involves preventing self-association of the hypersensitive-induced reaction 1 (HIR1) protein and elevating the level of a leucine-rich-repeat (LRR) protein, which promotes degradation of HIR1 [142]. Suppression of SA-mediated plant defense occurs by a mechanism that is conserved among different phytopathogens. C4 translocates from the membrane to the chloroplast, where it suppresses calcium-sensing receptor (CAS)-mediated activation of SA signaling. This is supported by the observations that knockout of CAS or depletion of SA enhances virus accumulation and can complement C4-null mutations in the virus [143]. The precise mechanism by which C4 confers drought stress tolerance is not known. AC4 of ToLCNDV is also an avirulence gene in the case of the resistant tomato cultivar H-88-78-1, which harbors the resistance gene SlSw5a. SlSw5a interacts with ToLCNDV AC4 to elicit HR and the production of reactive oxygen species to limit the spread of the virus. The amino acid motif “RTSK” present in the C-terminal portion of AC4 is essential for this interaction [145]. The interaction of AC4/C4 with kinases and its significance in silencing suppression are discussed above. A recent finding suggests that TYLCV C4 can also interact with other kinases, including NSP-interacting kinase 1 (NIK1), which plays various roles in the pathogenesis of bipartite begomoviruses. The precise mechanism of pathogenesis is described in a later section. The interaction of TYLCV C4 with NIK1 raises the question of whether C4 performs the function of the DNA-B-encoded nuclear shuttle protein in monopartite begomoviruses, and answering this will require elaborate experimental studies [146].

Despite its multifunctional nature, AC4/C4 is the least conserved ORF of geminiviruses. Because it does not perform any obligatory function in basic processes such as replication, transcription, or encapsidation, it appears to be dispensable, and its diversity may provide an adaptive advantage to the virus. Nevertheless, AC4/C4 supports virus growth by suppressing RNA silencing, and in viruses without AC4/C4, other proteins such as TrAP, AV2, or Rep might take over this role.

AC5/C5

The AC5/C5 ORF is located downstream of AC3/C3 and overlaps with the AC1 ORF [147,148,149]. The function of the AC5/C5 protein has not been studied as thoroughly as those of the other geminivirus proteins. AC5 was found to be essential for DNA replication in the case of MYMIV [150]. Deletion of tomato leaf deformation virus (ToLDeV) C5 has been shown to reduce symptom severity [149]. Furthermore, MYMIV AC5 also acts as a silencing suppressor and a pathogenicity determinant. Mutations in MYMIV AC5 have been shown to result in a decrease in infectivity and viral DNA accumulation. AC5 can induce a hypersensitive response when expressed from a potato virus X (PVX) vector, but this phenotype has not been observed in MYMIV-infected plants. AC5 also suppresses PTGS, and its N-terminal region is essential for PTGS suppressor activity. Interestingly, AC5 can also suppress TGS by suppressing the expression of domains rearranged methyltransferase 2 (DRM2), a methyltransferase responsible for de novo CHH methylation and its maintenance. Thirty-three amino acids at the C-terminus of the protein have been shown to be essential for TGS suppressor activity as well as symptom development [176].

DNA-B

After replication, viral DNA must be transported to neighboring cells for successful infection. To achieve this, it must exit the nucleus through a nuclear pore to reach the cytoplasm, and from there it has to move to the plasmodesmata to reach uninfected cells. In the case of bipartite viruses, these functions are mediated by proteins encoded by DNA-B. DNA-B contains two ORFs: one on the virion strand, designated as BV1, and one on the complementary strand, designated as BC1.

BV1

The BV1 ORF of DNA-B of bipartite viruses encodes a nuclear shuttle protein (NSP), which is responsible for nucleocytoplasmic shuttling of the viral DNA (vDNA). The NSP has a strong affinity for nucleic acids [151,152,153,154]. The mechanism of nuclear export involves the interaction and cooperation of several host factors. For example, NSP interacts with histone 3 (H3) to form an H3-NSP-viral DNA complex that allows the viral DNA to be exported from the nucleus [155]. The NSP possesses a nuclear export signal (NES), but the exportin has not been identified. The region including amino acids 177-198 in the NSP of SqLCV is leucine-rich. It has been shown that the TFIIIA protein of Xenopus could complement the NES of NSP [156]. The NSP, along with viral DNA, once it reaches the nuclear envelope, is released into the cytoplasm with the help of NIG (NSP-interacting GTPase), a cytosolic GTPase that accumulates around the nuclear envelope in the cytosol. It acts as a cofactor in mediating the intracellular movement of the viral genome. NIG interacts with NSP and shuttles from the nucleus to the cytoplasm [157, 158]. In the cytoplasm, yet another cofactor, NISP (NSP-interacting syntaxin domain-containing protein), a plant-specific syntaxin-6 protein, interacts with the NIG-NSP-vDNA complex and facilitates the intracellular movement of the complex from the cytosol to endosomes [159].

In addition to its role in viral export, BV1 also acts as a pathogenicity determinant. Cabbage leaf curl virus (CaLCuV) NSP suppresses host immunity by inducing expression of the asymmetric leaves 2 (AS2) protein and translocating it out of the nucleus to the cytoplasm. AS2 activates the decapping enzyme DCP2 in the cytoplasm, compromising host defense against the virus [160]. Furthermore, CaLCuV NSP mimics the transcription factor MYC2 and suppresses terpene biosynthesis, making the host more attractive to vectors, thereby increasing the chances that the virus will be transmitted [161]. A 38-amino-acid region close to the NES in CaLCuV NSP interacts with the A. thaliana nuclear shuttle protein interactor (AtNSI), which plays a role in histone acetylation. The NSP-AtNSI interaction results in acetylation of coat protein bound to viral DNA. Then, the NSP-AtNSI complex replaces CP in the CP-viral DNA complex and exports the viral DNA from the nucleus. Overexpression of AtNSI results in a higher viral titer, while a mutation in NSP that abolishes the NSP-AtNSI interaction results in reduced symptom severity and delayed systemic spread of the virus [151, 162, 163]. NSP also interacts with a group of kinases referred to as NIKs. NIK1-3 are leucine-rich repeat receptor-like kinases (LRR-RLKs) that localize to membranes. Interaction between NSP and NIK leads to suppression of the kinase activity of NIKs. The loss of function of NIKs supports the replication of CaLCuV. NIK has an 80-amino-acid stretch that contains the putative serine-threonine kinase active site and activation loop, which are the targets of NSP. Loss of NIK1-3 enhances the susceptibility of the host [164]. It has been observed that NIK1 activation leads to phosphorylation and translocation of ribosomal protein L10a (RPL10a) from the cytoplasm to the nucleus [165]. In the nucleus, RPL10a represses global translation by interacting with L10-interacting MYB domain-containing protein (LIMYB).
The RPL10a-LIMYB interaction leads to suppression of transcription of ribosomal protein genes, resulting in reduced translation of viral proteins [166]. Yet another host factor that interacts with CaLCuV NSP is NsAK (NSP-associated kinase), which is a proline-rich extensin-like receptor kinase (PERK). NsAK phosphorylates NSP, and the loss of NsAK results in a decrease in viral titer [167]. ToLCNDV NSP induces a hypersensitive response in N. tabacum and tomato through the N-terminal region of the protein, whereas in N. benthamiana, the same protein induces leaf curling symptoms similar to those associated with viral infection [168].

BC1

BC1 codes for the movement protein (MP), which is responsible for the cell-to-cell movement of viral DNA. MP increases the size exclusion limit of plasmodesmata of mesophyll cells, thereby facilitating the movement of the virus [152]. MP localizes as small punctate bodies in the cell periphery and around the nucleus. In the sink leaves of the host, it forms a disc-like structure in the periphery. When BC1 is inoculated with its cognate DNA-A and DNA-B, it forms needle-like structures in sink leaves without altering its subcellular localization [169,170,171,172]. MP interacts with synaptotagmin (SYTA), a regulator of endosome recycling, to promote transport of the viral genome to plasmodesmata for cell-to-cell transport through the SYTA-mediated endosomal recycling pathway. The role of SYTA in endosome recycling is evident from the depletion of plasma-membrane-derived endosomes in dominant negative mutant lines [173]. As discussed in the earlier section pertaining to NSP, NISP recruits NSP-NIG-vDNA to endosomes, and from the endosomes, the MP facilitates the transport of the vDNA complex to uninfected neighboring cells through plasmodesmata [174].

BC1 proteins from different viruses differ in their affinity for different forms of DNA. BC1 from SqLCV and bean dwarf mosaic virus (BDMV) has weak affinity for single-stranded DNA, whereas BC1 from MYMIV has strong affinity for single-stranded DNA [153, 154, 175].

Transgenic plants expressing tomato mottle virus (ToMoV) MP and BDMV MP exhibit an abnormal phenotype, with symptoms resembling those induced by viral infection [176, 177]. In one study, plants expressing a mutated form of ToMoV BC1 did not produce virus-like symptoms and showed resistance to ToMoV and CaLCuV infection, possibly because of a transdominant negative effect [177].

Satellite molecules

Monopartite begomoviruses are often associated with satellite DNA molecules referred to as alphasatellites, betasatellites, or, more recently, deltasatellites. Satellite molecules are dependent on helper viruses for their replication and movement and are nearly half the size of the helper virus genome [178, 179].

Alphasatellites

Alphasatellites are approximately 1375 nt in size, with an A-rich region ranging from 150 to 200 nt, a hairpin loop containing the conserved nonanucleotide for replication, and an ORF encoding a Rep protein, referred to as alpha-rep, of approximately 37 kDa in size. Since alphasatellites encode their own Rep, they are not strictly considered satellites. Alphasatellites have not been shown to play a role in symptom development or pathogenicity [179], but they negatively affect the transmission of some helper viruses by whiteflies and affect the DNA level of some betasatellites [180, 181]. Some Rep proteins encoded by alphasatellites, such as those of Gossypium darwinii symptomless alphasatellite, Gossypium mustelinum symptomless alphasatellite, and cotton leaf curl Multan alphasatellite, have also been reported to exhibit silencing suppressor activity [182, 183].

Betasatellites

Unlike alphasatellites, betasatellites are essential for their helper viruses, as they function as pathogenicity determinants. They are known to induce symptoms such as leaf curling, enations, and yellowing [184, 185]. Betasatellites are approximately 1350 nt in size with no sequence similarity to the helper virus other than the conserved nonanucleotide. Betasatellites have a satellite conserved region (SCR), which is highly conserved among betasatellites, an A-rich region, and a single ORF on the complementary strand, encoding a protein referred to as βC1. βC1 is a multitasking protein that plays an essential role during pathogenesis by suppressing host TGS and PTGS, suppressing host defense, and promoting symptom development (Fig. 2) [179, 184, 186]. Covering the work on βC1s merits a separate review, but considering the space limitations, this review will only touch upon its key roles with examples for illustration.

βC1 as a pathogenicity determinant and silencing suppressor

βC1 is a pathogenicity determinant that plays a vital role in viral pathogenesis. It has evolved multiple strategies to mitigate the host defense. It interacts with several host factors to evade host defense mechanisms and support helper viruses in establishing disease [184]. βC1 induces characteristic symptoms in the leaves of the infected host, mainly through its interaction with host factors. βC1 from TYLCCNV interacts with AS1, replacing AS2, and interferes with normal leaf development. The AS1-AS2 interaction is essential for downregulation of miR165/166 as well as upregulation of the transcription factor HD-ZIP III, which are essential for normal leaf development [187]. βC1 also induces vein clearing or vein yellowing. βC1 from radish leaf curl betasatellite (RaLCB) has been shown to be localized in chloroplasts, where it alters their ultrastructure as well as the expression of chloroplast-encoded proteins, resulting in reduced photosynthesis. Furthermore, RaLCB βC1 interacts with oxygen-evolving enhancer protein 2 (encoded by PsbP) and interferes with its binding to RaLCB DNA, which is essential for plant immunity to the virus, as silencing PsbP transiently increases the viral load [188, 189]. A recent study has suggested that βC1 from tomato leaf curl Patna betasatellite (ToLCPaB) regulates the titer of the helper virus as well as the betasatellite and the transcript accumulation of Rep and βC1 through its ATPase activity. Interestingly, ATPase activity is conserved in βC1 from diverse betasatellites. Unlike the Rep protein, βC1 lacks the canonical Walker A and Walker B motifs that are essential for ATPase activity, and it instead has a non-canonical ATPase domain that overlaps with the DNA-binding domain [190]. βC1 also interacts with the host ubiquitin system, resulting in symptom development. βC1 interacts with the ubiquitin-conjugating enzyme SlUBC3 in tomato, resulting in reduced polyubiquitination of many target proteins.
The myristoylation-like motif GMDVNE at the C-terminus of βC1 is essential for its interaction with SlUBC3, and mutation in this site results in reduced symptom severity [191]. In another example, tomato yellow leaf curl China betasatellite (TYLCCNB) βC1 interacts with a RING-finger protein from tobacco (NtRFP1), which is an E3 ubiquitin ligase that is involved in polyubiquitination of βC1 to promote degradation of βC1 via the 26S proteasomal pathway [192]. It is not clear whether the NtRFP1-βC1 interaction is a survival mechanism of the plant or the virus, but experimental evidence shows that the plant gains immunity against the virus by promoting the proteasomal degradation pathway through NtRFP1 [192]. A recent report also suggested that βC1 undergoes SUMOylation through its SUMO-interacting motif to escape from degradation in the host [193].

βC1 nullifies the antiviral defense of plants by suppressing both TGS and PTGS. βC1 of tomato yellow leaf curl China betasatellite (TYLCCNB) can complement BCTV C2 and reverse TGS by directly interacting with S-adenosyl homocysteine hydrolase (SAHH), an essential component of the methyl cycle that replenishes the methyltransferase cofactor S-adenosyl methionine (SAM). The βC1-SAHH interaction leads to a decrease in SAM [194]. Tyrosine residues at positions 5 and 110 are essential for the TGS-reversal activity [195]. Transgenic plants expressing cotton leaf curl Multan betasatellite (CLCuMuB) βC1 were shown to have elevated levels of both AGO1 and DCL1 gene expression, and the CLCuMuB βC1 protein was found to physically interact with AGO1 [196]. In another mechanism, βC1 suppresses host silencing by upregulating a host-encoded silencing suppressor. One such silencing suppressor is the calmodulin-like protein rgs-CaM, which is an endogenous suppressor of the gene silencing pathway that suppresses the expression of RDR6. Overexpression of Nbrgs-CaM has been shown to produce a phenotype similar to that caused by βC1 in plants [197].

The TYLCCNB βC1 protein interacts with sucrose non-fermenting-1-related kinase from Solanum lycopersicum (SlSnRK1). SlSnRK1 inactivates βC1-mediated pathogenicity by phosphorylating βC1 at serine 33 and threonine 78. Expression of phosphomimic βC1 mutants and overexpression of SlSnRK1 result in a delay in viral symptoms as well as a decrease in the viral titer [198]. SlSnRK1 also phosphorylates the tyrosine residues at positions 5 and 110, resulting in milder disease symptoms, a weakened reversal of TGS, and a lack of interaction with AS1 [195].

TYLCCNV, together with TYLCCNB βC1, induces upregulation of NbCaM, which in turn interacts with SGS3, resulting in degradation of SGS3, mediated by the phosphatidylinositol 3-kinase complex. This suggests a role for βC1-dependent autophagy in geminivirus infection [199]. ATG8, an autophagy-related protein, interacts with CLCuMuB βC1 in a manner involving a valine residue at position 32. A V32A mutation in βC1 was found to enhance symptom severity and accumulation of viral DNA. Interestingly, silencing of other autophagy-related proteins, ATG5 and ATG7, made plants susceptible to different viruses [200]. A recent study suggested a novel role of TYLCCNB βC1 in which it interferes with the mitogen-activated protein kinase (MAPK or MPK) pathway to mitigate the host defense. It selectively inhibits the activity of MAPK pathway members such as MPK4 and MAPK kinase 2 (MKK2) to suppress the host defense [201].

βC1 in systemic movement

In infections with monopartite begomoviruses, βC1 can also complement functions associated with DNA-B of bipartite viruses. ToLCNDV DNA-A alone can move locally but is impaired in systemic movement. However, ToLCNDV DNA-A inoculated with the CLCuMuB betasatellite can move both locally and systemically. Interestingly, ToLCNDV DNA-A inoculated with a CLCuMuB betasatellite carrying a disrupted βC1 gene failed to exhibit systemic movement. Similar observations were made in the case of ageratum yellow vein virus [141, 202]. βC1 possesses an NES or NLS and interacts with the CP of BYVMV and the host nuclear import protein karyopherin α. The failure of βC1 to localize in the nucleus results in a lack of symptom development [203, 204].

Host-vector-virus tripartite interaction

The βC1 protein promotes host-vector-virus tripartite interaction [184] by suppressing JA signaling and promoting emission of the volatile compound linalool [205]. The βC1 protein interferes with AS1/AS2 complex formation [187] and with dimerization of the transcription factor MYC2 [161]. Both of these interactions suppress JA-mediated synthesis of terpenes and other insect-repelling metabolites, resulting in the attraction of more insects to infected plants. Apart from increasing vector activity on infected plants, βC1 can also deter infestation of infected plants by non-vector insects. Such observations have been reported in the case of CLCuMuV and its associated betasatellite, in which βC1 binds to the transcription factor WRKY20 in the phloem to modulate chemical immunity in the host, supporting propagation of its vector (i.e., whitefly) and deterring non-vectors such as cotton bollworms and aphids [206].

Satellite- and DNA-B-encoded proteins share functional homology in several aspects. For instance, both βC1 and DNA-B-encoded proteins assist in the movement of the viral genome, and βC1 and BV1 mimic other proteins to suppress JA signaling to attract whiteflies for transmission. This suggests that although they diverged in the process of evolution, these proteins still retain properties that are essential for pathogenesis.

βV1 and other small ORFs

βV1 is a recently discovered ORF encoded on the viral-sense strand of the betasatellite genome. This ORF is conserved in position and sequence in nearly 40% of betasatellite sequences. Intriguingly, it positively affects viral infection by triggering HR in leaves of N. benthamiana [207]. Exploration of genome sequences for small ORFs encoding microproteins has led to the discovery of possible functional roles for small ORFs. An important study by Gong et al. revealed the presence of six small ORFs in the TYLCV genome. One of these, ORF6, also designated as V3, is conserved among begomoviruses. Interestingly, a TYLCV variant with a mutated V3 exhibited an impaired ability to induce symptoms in the host, and V3 was also found to act as a suppressor of both TGS and PTGS [208].

Conclusions and future directions

Over the past few years, studies have established the multifarious nature of geminivirus proteins and the complexity of their interaction with host factors. These studies have identified interactions that are essential for infection and provided insight into how geminiviruses redirect plant processes and counteract host defense responses [209, 210]. However, structural studies of these viral proteins could shed more light on the mechanisms involved in their interaction with host factors. Such studies would also help to identify the protein domains required for the various functions performed by geminivirus proteins. The recent identification of positional homologs with incomplete functional overlap, such as AC2/C2 and AC4/C4, suggests that novel functions of these proteins remain to be discovered, and one therefore needs to be cautious when predicting the function of an ORF based on its position [84]. Furthermore, the discovery of novel ORFs encoding small proteins such as βV1 and V3 suggests additional complexity in host-virus interactions. Application of new and emerging techniques in the fields of cell biology, molecular biology, and biochemistry, especially systems biology and CRISPR-Cas9-related techniques, will provide a new dimension and improve our understanding of one of the largest families of plant viruses.