1 A Brief Introduction to RNA Processing Events, Interconnections to Transcription and Export

Eukaryotic mRNA precursors (pre-mRNAs) must be processed to become fully functional mRNAs. Most pre-mRNAs undergo three processing steps: the 5′ end is capped by addition of 7-methylguanosine, introns are removed and exons ligated by splicing, and the 3′ end is created by an endonucleolytic cleavage followed by addition of a 100–300 nt long poly(A) tail. It is becoming increasingly clear that these processes are cotranscriptional rather than posttranscriptional events, with the C-terminal domain of the largest subunit of RNA polymerase II (CTD), consisting almost entirely of heptapeptide repeats (consensus YSPTSPS), playing an important role in coupling RNA processing and transcription. The CTD forms a scaffold or platform to recruit processing factors to the pre-mRNA (reviewed in Hirose and Manley 2000; Proudfoot et al. 2002; Maniatis and Reed 2002; Bentley 2005), and phosphorylated CTD plays an integral role as a participant in capping (Shatkin and Manley 2000), splicing (Hirose and Manley 2000; Bentley 2005), chromatin remodeling (Rosonina et al. 2014) as well as 3′ processing (Bentley 2002; Hirose and Manley 1998; Hsin et al. 2014b; Hsin et al. 2011) and expression of upstream antisense RNAs (uaRNAs) (Hsin et al. 2014a; Descostes et al. 2014). RNA processing events in turn are highly interlinked and play important roles in influencing transcriptional elongation and termination (reviewed in Bentley 2005; Rosonina et al. 2006; Pandit et al. 2008). The formation of a transport-competent mRNP complex is also closely coordinated with and coupled to transcription and all processing events, and quality control mechanisms exist to ensure that only correctly processed mRNAs are exported (reviewed in Rodriguez et al. 2004; Luna et al. 2008).

2 RNA Processing Factors as Sumoylation Substrates

The small ubiquitin-related modifier (SUMO) has gained prominence as a posttranslational modifier that regulates a large number of biological processes, including transcription, DNA repair, genome stability, chromatin organization, PML body function and nucleocytoplasmic transport, to name a few (Johnson 2004). The pathway by which SUMO is conjugated to substrate proteins and the enzymes of the pathway have already been introduced in the earlier chapters of this book. SUMO has garnered a great deal of interest primarily due to its ability to influence substrate function in various ways, by altering protein stability, subcellular localization and interactions with other proteins (reviewed in Johnson 2004; Hay 2005; Geiss-Friedlander and Melchior 2007). While the addition of SUMO to a substrate would by itself substantially change its interaction surfaces, the presence of SUMO-Interacting Motifs (SIMs) in several proteins, which allow them to interact noncovalently with SUMO or sumoylated proteins, lends another dimension to SUMO-regulated interactions (Minty et al. 2000; Song et al. 2004; Hecker et al. 2006). SUMO substrates have been shown to cluster in macromolecular complexes in proteomic analyses (Wohlschlegel et al. 2004), and the presence of more than one SUMO substrate in a functional complex has often been shown to be involved in the assembly of such complexes (reviewed by Matunis et al. 2006).

The number of studies detailing sumoylation of RNA processing factors is by no means comparable to that for, say, transcription or DNA repair, but the number keeps growing. Studies describing sumoylation of RNA binding proteins and factors involved in 3′ pre-mRNA processing, transcription termination, RNA editing and mRNA export (Vassileva and Matunis 2004; Desterro et al. 2005; Vethantham et al. 2007, 2008; Xu et al. 2007; Lamoliatte et al. 2014; Richard et al. 2013) have helped to expand the role of this modifier to the field of RNA processing and metabolism (reviewed in Rouviere et al. 2013). By developing large-scale mass spectrometry (MS)-based proteomics and affinity purification strategies, for example using tagged SUMO peptides, several groups have identified a number of proteins involved in RNA processing events such as capping, splicing, polyadenylation and mRNA export in yeast (Panse et al. 2004; Wohlschlegel et al. 2004; Wykoff and O’Shea 2005; Hannich et al. 2005; Denison et al. 2005), mammals (Zhao et al. 2004; Vertegaal et al. 2004, 2006; Li et al. 2004; Manza et al. 2004; Rosas-Acosta et al. 2005; Gocke et al. 2005; Guo et al. 2005; Golebiowski et al. 2009; Becker et al. 2013; Lamoliatte et al. 2014; Liu et al. 2015; Bruderer et al. 2011; Hendriks et al. 2014; Matic et al. 2010; Tammsalu et al. 2014; Schimmel et al. 2014; Blomster et al. 2009; Tatham et al. 2011), flies (Nie et al. 2009), worms (Kaminsky et al. 2009) and plants (Miller et al. 2010). In one of these studies, a significant proportion (17%) of the SUMO-modified proteins identified were found to be involved in RNA-related processes (Denison et al. 2005). A non-exhaustive list of RNA processing-related proteins identified in yeast and human proteomic analyses is presented in Table 2.1.

Table 2.1 Non-exhaustive list of RNA binding proteins and mRNA processing factors identified in yeast and mammalian proteomic analyses

Enzymes of the SUMO pathway have been found to colocalize in nuclear bodies and substructures together with components of the RNA processing machinery. Members of the PIAS (protein inhibitor of activated STAT) family of SUMO E3 ligases were found to be localized to nuclear speckles (Tan et al. 2002), which are subnuclear structures that are enriched for pre-mRNA splicing factors (Lamond and Spector 2003; Hall et al. 2006). One study has shown that SUMO-1 and the E2 conjugating enzyme Ubc9 are localized to Cajal bodies, which are the sites of maturation of small nuclear ribonucleoproteins (snRNPs) required for pre-mRNA processing (Navascues et al. 2008). However, another study showed that Ubc9 localized to nuclear speckles in mouse oocytes along with the splicing factor SRSF2 (previously SC35), one of the main components of nuclear speckles, and that overexpression of Ubc9 led to an increase in the size of nuclear speckles (Ihara et al. 2008). Interestingly, a recent study showed that Ubc9 depletion in U2OS cells affects the cytoplasmic distribution of specific intronless mRNAs and leads to the accumulation of SRSF2 in cytoplasmic foci (Zhang et al. 2014). It will be important to determine in the future the significance of these altered localizations, and whether they result in any changes in the processing of mRNA precursors.

In the remainder of this chapter, we discuss SUMO targets related to mRNA metabolism along with a brief description of the RNA processing events themselves. Even though several of these putative SUMO targets were identified mainly by proteomic analyses, the clustering of these targets in key complexes involved in processes such as capping, splicing and polyadenylation makes them worth discussing. Together these studies reveal that the involvement of SUMO in mRNA processing events and export is wider than previously thought, and support the possibility that SUMO plays a significant role in these processes.

3 5′ Capping

5′ capping is the first of the three processing reactions and takes place on nascent RNA polymerase II (RNAPII) transcripts when they are less than 50 nt long. The cap plays important roles in mRNA stability, maturation and translation. 5′ N7-methylguanosine caps are attached in a three-step reaction: an RNA triphosphatase (RT) acts in the first step and a guanylyltransferase (GT) in the second to add a guanosine nucleoside at the 5′ end. A methyltransferase (MT) functions in the third step to add the methyl group to the guanosine at the N7 position (reviewed in Shatkin and Manley 2000). In mammals, the RT and GT activities reside in a single multifunctional capping enzyme (CE); in yeast, however, they are present in two separate polypeptides, Cet1 and Ceg1, respectively (reviewed in Shatkin and Manley 2000). 5′ capping is closely coupled to transcription elongation through the CTD, which recruits the CE to the pre-mRNA. GT binds specifically to the form of the CTD that is phosphorylated at Ser5 residues by a cyclin-dependent kinase associated with the general transcription factor TFIIH (McCracken et al. 1997; Schroeder et al. 2000). The transcription elongation factor Spt5 binds to the CE, and both phosphorylated CTD and Spt5 stimulate CE or Ceg1/Cet1 to carry out the capping reaction (Wen and Shatkin 1999). At least in yeast, phosphorylated CTD also recruits MT (Abd1) to the cap and stimulates its activity. The capping enzymes also reciprocally function to enhance RNAPII transcription by stimulating promoter clearance (Mandal et al. 2004; Schroeder et al. 2004).

In yeast, all three capping enzymes were identified as putative sumoylation substrates (Panse et al. 2004). Cet1 has in fact been identified in multiple SUMO proteomic screens (Panse et al. 2004; Hannich et al. 2005; Wohlschlegel et al. 2004; Denison et al. 2005), strongly suggesting that it may be a specific SUMO target. That these factors are involved in multiple dynamic interactions with the transcriptional machinery raises the possibility that sumoylation may influence these interactions. In this regard, it is noteworthy that Spt5 was also found in a proteomic screen of SUMO-conjugated proteins (Zhou et al. 2005). It will be of interest in the future to determine precisely how SUMO functions in capping, and whether mammalian capping enzymes are also sumoylated.

4 Splicing

The removal of introns and the ligation of exons takes place in an extremely precise manner via two transesterification reactions. The splicing reaction takes place in a megadalton ribonucleoprotein complex called the spliceosome, which is composed of five snRNP subcomplexes (U1, U2, U4, U5 and U6), each consisting of an snRNA and associated proteins. A host of other proteins such as RNA helicases and SR proteins assist in dynamic assembly of the complex and in enhancing intron recognition (reviewed in Brow 2002; Jurica and Moore 2003). Catalysis of the splicing reaction proceeds through the ordered assembly of the various snRNPs on the pre-mRNA and the formation of several intermediate spliceosomal complexes, culminating in the catalytic C complex. Intron recognition, and both steps of catalysis, progress through a complex network of RNA-RNA, RNA-protein and protein-protein interactions between components of the snRNPs and intronic RNA sequences (reviewed in Brow 2002; Smith et al. 2008). Although most of the spliceosomal conformational changes are effected by ATP-dependent DExD/H box RNA helicases and unwindases (Cordin et al. 2006), transient protein modifications such as phosphorylation (Shi et al. 2006) and ubiquitination (Bellare et al. 2008) of snRNP-associated proteins have also been shown to play a role in changing protein conformation to affect formation of splicing complexes. Ubiquitin-mediated interactions were shown to be important for the assembly of a multi-snRNP complex that joins the spliceosome as a single entity, the U4/U5/U6 tri-snRNP (Bellare et al. 2008).

The evidence linking sumoylation to splicing is, as with capping, almost entirely based upon proteomic reports. However, as opposed to capping, studies with mammalian systems revealed most of the splicing-related targets. One splicing-related protein that has been validated as a SUMO target is SART1 (Vertegaal et al. 2004, 2006). SART1 is localized in nuclear speckles, has been shown to be a component of the U4/U5/U6 tri-snRNP, and is important for tethering of the tri-snRNP to the pre-spliceosome (Makarova et al. 2001). Several subunits of protein complexes associated with U2 snRNP, namely SF3A (SAP62, SAP114) and SF3B (SAP49, SF3b125, SAP130, SAP145, SAP155), were also found to be sumoylated (Manza et al. 2004; Guo et al. 2005; Rosas-Acosta et al. 2005; Golebiowski et al. 2009; Becker et al. 2013; Hendriks et al. 2014). These factors are essential for the assembly of the U2 snRNP complex and for proper tethering of the U2 snRNP to its intronic recognition sequence (Das et al. 1999; Will and Luhrmann 2001). In this regard, it is interesting that the SUMO E3 ligase PIAS1, which as mentioned above is found in nuclear speckles (Tan et al. 2002), was also found to co-purify with mammalian spliceosomes (Rappsilber et al. 2002). These results raise the possibility that sumoylation may influence the interactions of these factors and thus the assembly and/or function of spliceosomal complexes.

Another interesting class of splicing-related factors found in SUMO proteomic screens includes proteins that participate in other processing events and have roles in transcription. The polypyrimidine tract binding factor (PTB) associated splicing factor PSF forms a heterodimer with p54nrb and this multifunctional complex has been implicated in splicing, transcription initiation, cleavage and polyadenylation as well as transcription termination (Mathur et al. 2001; Emili et al. 2002; Rosonina et al. 2005; Liang and Lutz 2006; Kaneko et al. 2007). PSF and p54nrb were both identified as putative SUMO targets (see Table 2.1). PSF was validated as a SUMO substrate and sumoylation was found to promote its transcriptional repression properties (Zhong et al. 2006). PTB, which was also identified as a putative sumoylation target (Manza et al. 2004; Rosas-Acosta et al. 2005), is again a multifunctional RNA binding protein originally identified as a splicing repressor but also has roles in cleavage-polyadenylation, mRNA stability and translation initiation (reviewed in Sawicka et al. 2008).

More recently, purification of chromatin-associated proteins sumoylated by SUMO-1 confirmed sumoylation of several splicing factors, such as hnRNP A1, SF3A2 and SNRNP200, during S phase of the cell cycle (Liu et al. 2015). In the same study, Scaffold Associated Factor-B (SAFB), known to interact with the CTD of RNAPII, was shown to be a SUMO-1 substrate that binds promoters of highly expressed genes. Depletion of SUMO-1 or SAFB resulted in a decrease in the splicing rate of mRNAs encoding ribosomal proteins, suggesting a role for sumoylated SAFB in coupling transcription and RNA processing (Liu et al. 2015).

5 3′ End Processing

The poly(A) tail, found at the 3′ end of nearly all eukaryotic mRNAs, is important for transcript stability, transport into the cytoplasm and translation initiation. The 3′ ends of pre-mRNAs are formed in a two-step process, with an endonucleolytic cleavage generating a 3′ OH end, followed by synthesis of a poly(A) tail (reviewed by Colgan and Manley 1997; Proudfoot and O’Sullivan 2002). This apparently simple reaction requires a surprisingly complex set of factors (Shi and Manley 2015). The multisubunit cleavage/polyadenylation specificity factor (CPSF) and cleavage stimulatory factor (CstF) complexes define the poly(A) site by binding cooperatively to the conserved AAUAAA and GU-rich sequence elements upstream and downstream, respectively, of the cleavage site (Murthy and Manley 1992; Takagaki and Manley 2000; Kaufmann et al. 2004). Cleavage factors I (CFI) and II (CFII) help in complex assembly and in the first step (Takagaki et al. 1989; de Vries et al. 2000; Brown and Gilmartin 2003). The single-subunit enzyme poly(A) polymerase (PAP) catalyzes poly(A) addition and is also in most cases required in some way for the cleavage reaction (Raabe et al. 1991). Nuclear poly(A) binding protein helps in increasing the processivity of PAP and in elongating the poly(A) tail (reviewed by Kuhn and Wahle 2004).

PAP appears to play a significant role in the regulation of 3′ processing, and is subject to extensive modification. For example, multiple isoforms can be produced by alternative splicing (Zhao and Manley 1996) and the enzyme is post-translationally modified by phosphorylation and acetylation (e.g., Colgan et al. 1996; Shimazu et al. 2007). The cyclin-dependent kinase cdc2/cyclinB hyperphosphorylates PAP during mitosis and meiotic progression, thus downregulating PAP activity, which is important for normal cell growth (Colgan et al. 1996, 1998; Zhao and Manley 1998).

The RNAPII CTD participates in the 3′ end processing reaction and plays a critical stimulatory role (McCracken et al. 1997; Hirose and Manley 1998). 3′ processing factors are recruited from the promoter onwards throughout the length of the gene, dependent on the phosphorylation status of the CTD, and a number of 3′ processing factors make direct contacts with the CTD (reviewed by Bentley 2005; Proudfoot 2004). The formation of 3′ ends is also closely coupled to transcription termination (reviewed by Buratowski 2005; Rosonina et al. 2006; Richard and Manley 2009).

Proteomic reports have identified several polyadenylation factors as putative SUMO substrates. A number of yeast polyadenylation factors were identified in independent proteomic screens. These include Ysh1 and Ydh1, the yeast homologs of the CPSF3 (aka CPSF-73) and CPSF2 (CPSF-100) subunits (Wykoff and O’Shea 2005; Panse et al. 2004; Wohlschlegel et al. 2004; Hannich et al. 2005). The poly(A) binding protein Pbp1, and the regulatory yeast factors Fir1 and Ref2, which interact with Pbp1 to regulate poly(A) tail length (Mangus et al. 2004), were also identified as SUMO targets (Panse et al. 2004; Hannich et al. 2005). In mammals, symplekin, a scaffolding protein that bridges CPSF-CstF complexes, was identified using an in vitro expression cloning approach with a human cDNA library and validated as a SUMO substrate in vitro (Gocke et al. 2005). Symplekin was later found in several large-scale proteomic analyses to be sumoylated by SUMO-2 (see Table 2.1).

The first evidence that 3′ processing activity could be affected by sumoylation was obtained in yeast in 1996 (as the SUMO pathway was just being discovered). Specifically, the SUMO E1 enzyme Uba2 was found to interact with Pap1 (del Olmo et al. 1997). Depletion of Uba2 from extracts was found to increase polyadenylation activity, suggesting that sumoylation is inhibitory to Pap1 activity. However, SUMO modification of Pap1 was not shown in this study (del Olmo et al. 1997), and a later study by the same group showed evidence of PAP being ubiquitinated but not sumoylated (Mizrahi and Moore 2000).

Subsequent, more extensive studies of sumoylation of mammalian 3′ processing factors have provided evidence that SUMO is capable of modulating pre-mRNA 3′ processing and regulating the function of specific polyadenylation factors (Vethantham et al. 2007, 2008). Mammalian PAP sumoylation was discovered when western blots of mouse tissues and cell lines revealed a remarkable accumulation of higher molecular weight forms of PAP, which were found to reflect modification by the SUMO-2/3 isoforms (Vethantham et al. 2008). PAP proved to be an unusual substrate in displaying high levels of modification in specific tissues and cell lines and in directly interacting with Ubc9, even though PAP lacks any consensus sumoylation sites. The sites of PAP sumoylation mapped to a known regulatory region of the protein, and SUMO was indeed found to be crucial for PAP function. Sumoylation was required for correct nuclear localization, as mutating the sites of sumoylation, which overlapped with a nuclear localization signal, or overexpressing the SUMO protease SENP1, led to mislocalization of PAP to the cytoplasm. Depletion of Ubc9 or overexpression of a SUMO protease resulted in decreased PAP levels, indicating that sumoylation promotes PAP stability. Finally, in vitro sumoylated PAP displayed lower poly(A) synthesis activity in polyadenylation assays. This study showed a profound effect of sumoylation on PAP function, and implicated SUMO as a major regulator of PAP activity. However, the physiological significance of this regulation, including the role of tissue-specific sumoylation, remains unknown.

A separate study examined sumoylation of the 3′ processing factors CPSF-73, the endonuclease, and symplekin, as well as the effect of sumoylation on 3′ processing activity (Vethantham et al. 2007). CPSF-73 is the most highly conserved of the 3′ processing factors, consistent with its role as the endonuclease that catalyzes the cleavage reaction. This was suggested first by its identification as a member of the metallo-β-lactamase family of Zn-dependent hydrolytic enzymes (Callebaut et al. 2002; Ryan et al. 2004). More conclusively, structural and biochemical studies with purified CPSF-73 provided unequivocal evidence that it indeed possesses endonucleolytic activity (Mandel et al. 2006). Symplekin was uncovered as a protein that bound strongly to both CstF and CPSF, and was proposed to function as a scaffolding factor (Takagaki and Manley 2000). Later studies implicated symplekin in the related processes of histone pre-mRNA 3′ end processing (Kolev and Steitz 2005) and cytoplasmic polyadenylation (Barnard et al. 2004).

Both CPSF-73 and symplekin, like PAP, are specifically modified by the SUMO-2/3 isoforms. As with PAP, the sites of sumoylation mapped to potential regulatory regions. An siRNA knockdown/rescue experiment with symplekin revealed that a sumoylation-deficient mutant cannot rescue the cell death phenotype of the knockdown cells, indicating that sumoylation is required for normal function of symplekin. This study also examined the effect of sumoylation on 3′ processing activity in nuclear extracts. Desumoylation of nuclear extracts by a SUMO protease or depletion of Ubc9 had an inhibitory effect on 3′ processing activity, and the formation of specific 3′ processing complexes was blocked by SUMO protease treatment. This correlated with the specific interaction of the SUMO protease with CPSF-73 and symplekin, suggesting that the desumoylation of CPSF-73 and/or symplekin may be involved in this inhibition.

Other components of the polyadenylation machinery are also modified by SUMO (see Table 2.1). The CFI complex is an essential 3′ processing factor that binds pre-mRNAs upstream of the cleavage site and also functions in regulation of alternative polyadenylation (Kim et al. 2010; Gruber et al. 2012). CPSF7 (CFI-59), a component of CFI, has been identified as a chromatin-associated protein sumoylated by SUMO-1 (Liu et al. 2015) as well as SUMO-2 (Tammsalu et al. 2014). Among the other proteins directly involved in pre-mRNA 3′ processing, large-scale affinity purifications identified CPSF-100, as well as CLP1 and PCF11, components of the CFII complex, as SUMO substrates that contain polySUMO chains (Golebiowski et al. 2009; Tammsalu et al. 2014; Hendriks et al. 2014; Schimmel et al. 2014; de Vries et al. 2000).

Several other proteins found in the massive, ~80 polypeptide 3′ processing “holo-complex” (Shi et al. 2009) have been identified as sumoylated in large-scale proteomic analyses (WDR33, RBBP6, SKIV2L2 (hMTR4), RBM25) (see Table 2.1). However, the regulation, functional roles and overall significance of these modifications will need further investigation. The CPSF subunit WDR33 has recently been shown to bind the poly(A) signal AAUAAA directly (Chan et al. 2014; Schonemann et al. 2014). It will be interesting to test whether WDR33 sumoylation plays any role in this function.

The Polymerase II-Associated Factor complex (PAFc) is a conserved complex that plays multiple roles during transcription, including helping to couple transcription and 3′ processing. The tumor suppressor parafibromin (CDC73) is a PAFc subunit and plays a role in mRNA 3′ processing (Rozenblatt-Rosen et al. 2009). Interestingly, CDC73 sumoylation is upregulated after proteasome inhibition and affects CDC73 cellular localization. A more recent study using a highly sensitive strategy that detects SUMO remnant chains following tryptic digestion identified five sumoylation sites on CDC73, four of them also being ubiquitylation targets (Lamoliatte et al. 2014). Two other independent large-scale analyses mapping SUMO-2 and SUMO-3 sites confirmed CDC73 sumoylation at multiple lysines, for a total of seven sites identified so far (Tammsalu et al. 2014; Hendriks et al. 2014).

These studies on the role of SUMO in RNA 3′ processing are remarkable for the multiple, distinct effects that SUMO can have on individual factors and on the activity of the complex. The presence of multiple SUMO targets in the same complex and the known ability of SUMO to affect interactions and thus complex assembly raise the possibility that SUMO-mediated noncovalent interactions are necessary for efficient assembly of the polyadenylation complex. That SUMO could promote 3′ processing complex assembly and activity in nuclear extracts and yet inhibit enzymatic activity of purified PAP suggests that PAP activity alone and within the polyadenylation complex are distinct and highlights the complex nature of 3′ processing and its regulation by SUMO.

The effect of sumoylation on 3′ processing in mammals also contrasts with early findings in yeast. This is not altogether surprising considering that regulation of 3′ processing in yeast has not always correlated with that of mammalian systems. Although the SUMO target lysines in homologs of both CPSF-73 and symplekin are conserved in yeast, this is not the case for PAP. In fact, the C-terminal regulatory region of mammalian PAP is completely absent in yeast. Additionally, phosphorylation has different effects on 3′ processing in yeast and in mammals (He and Moore 2005; Ryan 2007; Colgan et al. 1996, 1998) and this seems likely to be the case for sumoylation as well.

6 Transcription Termination

Transcription termination is another important aspect of transcription that is coupled to 3′ processing (reviewed in Richard and Manley 2009). The DNA/RNA helicase Senataxin (SETX) has been shown to play an important role in this regulation by resolving DNA/RNA hybrids, known as R loops, which are formed behind elongating RNAPII and downstream of the poly(A) signal and 3′ pause sites of a subset of genes (Skourti-Stathaki et al. 2011, 2014). It is believed that after cleavage at the poly(A) site, unwinding of R loops by SETX gives the 5′-3′ exonuclease Xrn2 access to the downstream 3′ RNA, leading to its degradation and subsequent RNAPII release from the DNA template (West et al. 2008). SETX was initially shown to be sumoylated by SUMO-1 and SUMO-2 in a yeast two-hybrid (Y2H) assay (Hecker et al. 2006). Subsequently, proteomic analyses of SUMO substrates revealed that SETX is highly sumoylated after heat shock, forming polySUMO chains, and also identified Xrn2 as another sumoylated protein (see Table 2.1) (Golebiowski et al. 2009; Bruderer et al. 2011). Additionally, two studies using the N-terminus of SETX as bait in a Y2H screen found Ubc9 and the E3 SUMO-protein ligase PIAS1 as SETX-interacting proteins (Richard et al. 2013; Bennett et al. 2013). Both screens also found an interaction with Rrp45 (EXOSC9), a component of the exosome that functions in RNA processing and degradation (Januszyk and Lima 2014). Importantly, interaction with Rrp45 was shown to depend on SETX sumoylation, and SETX and Rrp45 co-localize in R-loop-dependent nuclear foci after induction of replication stress (Richard et al. 2013).

SETX function in R-loop resolution extends beyond its role in termination, and is likely relevant to the DNA damage response and to certain neurodegenerative diseases. Specifically, SETX is mutated in two distinct neurological disorders, a form of Amyotrophic Lateral Sclerosis known as ALS4 (Chen et al. 2004) and a form of ataxia named AOA2 (Ataxia with Oculomotor Apraxia type 2) (Moreira et al. 2004). Strikingly, three AOA2 mutations located in the SETX N-terminus abolished SETX sumoylation and Rrp45 interaction, while nearby ALS4 mutations did not (Richard et al. 2013). It has been proposed that replication stress leads to an increase in SETX sumoylation that recruits the exosome through its interaction with Rrp45 to sites of R-loop formation, preventing DNA damage accumulation and degrading unwanted RNAs (Richard and Manley 2014). Indeed, persistence of R loops leads to double strand breaks and genome instability (Santos-Pereira and Aguilera 2015). Consistent with this, a proteomics analysis identifying SUMO-2 targets in response to replication stress showed that indeed SETX sumoylation increases after replication stress (Bursomanno et al. 2015). Additionally, a large-scale quantitative proteomics screen examining sumoylation dynamics during cell cycle progression found SETX highly sumoylated in early S phase, S/G2 and G2/M (Schimmel et al. 2014). Since SETX nuclear foci form during S/G2 phase (Yuce and West 2013), it is very likely that SETX sumoylation helps to regulate its accumulation at stress foci. It is worth noting that a large throughput analysis showing that about 10% of human proteins might be sumoylated identified seven SUMO sites in SETX (Hendriks et al. 2014). Intriguingly, most of those lysines are in proximity of identified AOA2 mutations, consistent with the possibility that disruption of SETX sumoylation is directly linked to the disease. 
Together, these data indicate that SETX sumoylation plays a significant role in the DNA damage response during replication stress. However, important details of the underlying mechanism remain to be determined, and whether this modification plays a role during normal transcription termination remains unknown.

7 Sumoylation of hnRNPs

Heterogeneous nuclear ribonucleoproteins (hnRNPs) are a structurally diverse group of RNA binding proteins that associate rapidly with nascent RNAs and contain auxiliary domains that bind other proteins (Krecic and Swanson 1999). While hnRNPs participate in a variety of processes such as mRNA biogenesis, telomere maintenance and initiation of translation, they are best known for their roles in regulation of RNA processing events, especially splicing, stabilization of mRNA and mRNA export. Some hnRNPs, such as hnRNP A1, are nucleocytoplasmic shuttling proteins, while others, such as hnRNP C, remain in the nucleus (reviewed by Dreyfuss et al. 1993, 2002; Martinez-Contreras et al. 2007).

HnRNPs are extensively sumoylated by both SUMO-1 and SUMO-2/3 (see Table 2.1) (Blomster et al. 2009; Hendriks et al. 2014). Vassileva and Matunis (2004) first demonstrated that hnRNPs C and M are targets for modification by SUMO. Moreover, the SUMO E3 ligase Nup358/RanBP2 was found to enhance sumoylation of both classes of hnRNPs, indicating that the sumoylation of these hnRNPs very likely occurs at the nuclear pore complex (NPC). The SUMO acceptor lysine was further identified in hnRNP C, and it was found that sumoylation can inhibit the nucleic acid binding capacity of hnRNP C, as in vitro SUMO-modified hnRNP C displayed a significantly lower affinity for ssDNA. Since both SUMO modification and demodification enzymes are localized at the NPC, the authors proposed a model whereby sumoylation regulates the organization of the mRNP complexes at the NPC and helps to facilitate nucleocytoplasmic transport. The hnRNP C and M proteins also have roles in regulating pre-mRNA splicing (Kafasla et al. 2002; Venables et al. 2008), which were not addressed in this study.

A number of other groups independently identified hnRNPs M, L and I as being sumoylated in proteomic analyses (Rosas-Acosta et al. 2005; Guo et al. 2005; Gocke et al. 2005). A study by Li et al. (2004) identified six hnRNPs, including A1, H1, U, F and K, in proteomic analysis of sumoylated proteins in human cell lines, and confirmed that hnRNPs A1, F and K were indeed sumoylated in vivo. Putative sumoylation consensus sequences were located in the RNA binding domains of these three proteins, raising the possibility that, as with hnRNP C, sumoylation may modulate the RNA binding function of these proteins. A long list of recent large-scale affinity-purifications of sumoylated proteins and high-resolution MS-based mapping of SUMO sites confirmed the sumoylation of most hnRNPs (see Table 2.1) as well as the polySUMOylation of a large number after heat shock (Bruderer et al. 2011).
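
The search for putative consensus sites mentioned above can be illustrated with a minimal sketch. It assumes the commonly cited sumoylation motif ψ-K-x-E/D, where ψ is a large hydrophobic residue (taken here as V, I, L, M or F); this motif definition is not spelled out in the chapter, and the function name and the toy sequence are hypothetical and do not correspond to any real hnRNP. As the case of PAP shows, the absence of a consensus match does not exclude sumoylation, so a scan of this kind only nominates candidate lysines for experimental testing.

```python
import re

# Assumption: canonical motif psi-K-x-[E/D]; the residue set used for psi
# (here V, I, L, M, F) varies between prediction tools.
# The lookahead lets overlapping motifs be reported as well.
CONSENSUS = re.compile(r"(?=([VILMF]K.[ED]))")


def find_consensus_sites(sequence):
    """Return (lysine_position, motif) pairs; positions are 1-based."""
    sequence = sequence.upper()
    # The lysine is the second residue of the 4-residue motif, so its
    # 1-based position is the 0-based motif start plus 2.
    return [(m.start() + 2, m.group(1)) for m in CONSENSUS.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MSTVKLEAAGIKQDPRKAE"  # hypothetical sequence, illustration only
    for position, motif in find_consensus_sites(toy_sequence):
        print(f"candidate lysine K{position} in motif {motif}")
```

In the toy sequence the lysines at positions 5 and 12 sit in matching motifs, whereas the lysine at position 17 is preceded by arginine and is not reported.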

8 Extending the Role of SUMO to mRNA Export

The export of mRNA across the NPC is closely linked to mRNA synthesis and maturation and requires that the mRNA is capped, spliced and polyadenylated. Transport of mRNA also generally requires the highly conserved export factor, Mex67 in yeast and NXF1/TAP in metazoans, which has been found to be sumoylated in several proteomic analyses (see Table 2.1). The hnRNP-like protein Yra1, or Aly/REF in metazoans, enhances the affinity of mRNA for the export factor, and shuttling hnRNP proteins like Npl3 act as additional adaptors by binding to Mex67 (reviewed in Rodriguez et al. 2004; Huang and Steitz 2005). UAP56/Sub2, which also functions in splicing, interacts closely with Aly/Yra1 and helps to couple splicing with export (Strasser and Hurt 2001). Both Sub2 and Yra1 are components of the TREX complex, which is recruited to the mRNA during transcriptional elongation through the THO complex (Tho2, Hpr1, Mft1, Thp2) (Chavez et al. 2000). Absence of the THO complex leads to retention of mRNPs in the nucleus (Dominguez-Sanchez et al. 2011; Strasser et al. 2002). It has been hypothesized that interaction of Yra1 with Mex67 displaces Sub2 at the NPC, thus facilitating export (Strasser and Hurt 2001; Reed and Hurt 2002). Npl3 is also recruited to the pre-mRNA during early elongation via interaction with RNAPII, providing another link between transcription and mRNA export (Lei et al. 2001). In addition, Npl3 has been shown to link 3′ processing with export (Gilbert and Guthrie 2004). The nuclear exosome, which physically interacts with the TREX complex, functions in mRNA surveillance to degrade unadenylated or unprocessed mRNA before export (reviewed in Rodriguez et al. 2004). Mlp1/Mlp2 in yeast functions in mRNA surveillance at the NPC prior to export to retain unspliced RNAs (Green et al. 2003; Galy et al. 2004).

It has recently been shown that sumoylation of the C-terminus of the THO complex component Hpr1 controls the association of the THO complex with mRNPs in a manner dependent on the SUMO protease Ulp1 (Bretes et al. 2014). While blocking Hpr1 sumoylation does not appear to affect mRNA export, it leads to improper mRNP assembly of a subset of stress-induced transcripts that are normally degraded by the exosome.

As described above, Vassileva and Matunis (2004) suggested a role for hnRNP sumoylation in influencing mRNA export in mammalian cells. Another study in Arabidopsis thaliana established an intriguing link between sumoylation and mRNA export involving Nua, the plant homolog of Mlp1/Mlp2, which in yeast serves as the anchor of hnRNPs (Green et al. 2003) and the SUMO protease Ulp1 (Zhang et al. 2002) at the NPC. The nua mutant shared striking similarities with another mutant, esd4; the ESD4 gene encodes the A. thaliana homolog of mammalian SENP2 and yeast Ulp1. Both the nua and esd4 single mutants and the nua/esd4 double mutant displayed altered expression of flowering regulators, an accumulation of SUMO conjugates and retention of poly(A) RNA in the nucleus, indicating that these proteins function in the same pathway (Xu et al. 2007).

Proteomic reports have identified key mRNA export and surveillance factors, including multiple subunits of the TREX complex, as putative sumoylation targets in yeast and mammals. These include Yra1 (Wohlschlegel et al. 2004), Npl3 (Denison et al. 2005), Rrp6 and Sub2, which were identified in multiple proteomic screens (see Table 2.1). The studies described above, together with the proteomic data, provide a basis for exploring the role of SUMO in mRNA export. In addition, an active sumoylation machinery is known to exist at the NPC, and suggestive evidence that this machinery may be involved in mRNA export came from the finding that ULP1 is a high copy suppressor of a yra1 temperature sensitive strain (Kashyap et al. 2005). Sub2, Yra1 and Npl3 connect export to transcription and processing events (reviewed in Luna et al. 2008), and the possibility that sumoylation may be involved in these interactions remains a very tempting target for future studies.

9 SUMO and RNA Editing

ADAR1 is an RNA editing enzyme that binds to double-stranded RNA and converts adenosine to inosine, which can change the amino acid coding of a transcript and thus the sequence and function of the encoded protein. Of the three ADAR family members (ADAR1, 2 and 3), ADAR1 was found to be modified by SUMO-1 (Desterro et al. 2005). While SUMO did not influence the localization of ADAR1 in the nucleolus, it seemed to repress the RNA editing activity of the enzyme, as a sumoylation-deficient mutant was considerably more active in vivo and in vitro. In addition, sumoylation of ADAR1 in vitro resulted in inhibition of nonspecific RNA editing activity. The SUMO acceptor lysine was found to be located in a putative dimerization domain of the protein. The formation of ADAR heterodimers and homodimers has been shown to regulate the activity and specificity of this enzyme. The authors hypothesized that by inhibiting dimerization, SUMO can regulate the activity and function of ADAR1, and hence RNA editing. More recent MS-based proteomic data revealed that ADAR1 is also modified by SUMO-2 (see Table 2.1) and is in fact a polySUMO-modified protein (Bruderer et al. 2011).

10 Conclusions

The events governing the processing of mRNA precursors are closely coupled to transcription, export and other nuclear events. SUMO has long been known to be an important regulator of nuclear functions, including transcription, DNA repair and genome stability. The evidence discussed above, from many proteomic analyses and in some cases functional studies, points to an important role for SUMO in essentially all nuclear RNA processing and handling events. This is highlighted by the presence of multiple putative SUMO targets in functional capping, splicing, polyadenylation, termination and mRNA export complexes. So far the study of sumoylation of RNA processing/binding proteins has been largely performed in vitro. Considering the intricate connections of RNA processing to transcription and other events, the study of sumoylation of RNA processing factors will remain incomplete until reliable in vivo processing assays are developed. The goal of understanding the roles of SUMO in mRNA metabolism holds a great deal of promise and much excitement, not only for elucidating mechanisms of basic cellular processes, but also for providing novel insights into human disease.