1.1 The SUMO Proteins

Over two decades ago, a small cellular protein of 12 kDa, with 18% sequence identity to the well-known ubiquitin protein, was co-discovered and termed Small Ubiquitin-like MOdifier or SUMO. SUMO was independently identified by four groups in 1996: Freemont’s group found it as a small ubiquitin-like protein associated with PML in an interacting complex and called it PIC1 (Boddy et al. 1996), Chen’s group identified it in a yeast two-hybrid screen of proteins associated with cellular DNA repair proteins (Shen et al. 1996), Yeh’s group identified it as a small modifier associated with Fas which they called sentrin (Okura et al. 1996), and Blobel’s group discovered that RanGAP was modified by a small ubiquitin-like protein which they designated GMP1 (Matunis et al. 1996). These modifiers were all the same protein that is now commonly referred to as SUMO.

SUMO is conserved from yeast to mammalian cells though the number of SUMO genes varies greatly (Chen et al. 1998). The budding yeast, Saccharomyces cerevisiae, possesses only one SUMO gene, Smt3, whose protein product shares 48% identity and 75% similarity with the mammalian SUMO1 (Huang et al. 2004). Likewise, Drosophila melanogaster (Lehembre et al. 2000) and Caenorhabditis elegans (Jones et al. 2002) each have a single SUMO gene. In contrast, plants express 8 SUMOs (Kurepa et al. 2003) and vertebrates have 4 SUMOs. There are four different genes in the human genome coding for the different SUMO modifiers, SUMO1, 2, 3, and 4. SUMO2 and 3 share about 92% identity but they are only related to SUMO1 at 48% identity (Kamitani et al. 1998a). While SUMO1, 2, and 3 are expressed in all tissues tested, SUMO4 transcription is restricted primarily to the kidneys, lymph nodes and spleen (Bohren et al. 2004). SUMO4 has been less studied than the others, but seems to play a role in diabetes (see Chap. 18) and stress response (Wei et al. 2008). SUMO1 is a 12 kDa protein of 101 amino acids that is related in structure and in sequence to the 9 kDa ubiquitin protein, as both modifiers share ~18% primary structure identity with each other and have 48% similarity in their three-dimensional structure (Bayer et al. 1998). Ubiquitin is only a 76 amino acid polypeptide, and the difference between those two modifiers mainly resides in the extended N-terminal structure of SUMO as this extension is absent in ubiquitin.

At the tertiary level, the basic structures have been solved for SUMO1 (Bayer et al. 1998), SUMO2 (Huang et al. 2004), and SUMO3 (Ding et al. 2005). All three SUMOs share a central compact, globular domain with the characteristic ββαββαβ ubiquitin fold. The SUMOs also each have both N- and C-terminal extensions, with the N-terminal extension being much longer than for ubiquitin. Within this extension in SUMO2/3 is a lysine at position 11 that can itself be conjugated with SUMO to yield SUMO2/3 chains (Tatham et al. 2001). SUMO1 lacks a suitable lysine for conjugation and does not appear to form chains in vivo, though in vitro chain formation has been observed (Yang et al. 2006a). The biological role and function(s) of the N-terminal extension are not well understood, but the C-terminal extension is important for direct contact with the SUMO activating enzyme, SAE1/2 (Lois and Lima 2005).

One of the ongoing questions about the SUMOs is the functional difference between the SUMO1 and SUMO2/3 families. Certain biological variations have already been identified, including different responses to environmental conditions (Saitoh and Hinchey 2000; Manza et al. 2004; Deyrieux et al. 2007), different susceptibilities to various SUMO proteases (Gong and Yeh 2006; Mikolajczyk et al. 2007), and differences in subcellular localization and abundance (Saitoh and Hinchey 2000; Manza et al. 2004; Ayaydin and Dasso 2004). The substrate pool for these two SUMO groups is also different with some substrates capable of being modified by either SUMO1 or SUMO2/3, and other substrates showing a clear preference for one or the other SUMO type (Saitoh and Hinchey 2000; Rosas-Acosta et al. 2005; Vertegaal et al. 2006; Citro and Chiocca 2013). While SUMO preference differences exist for individual substrates, in general for both the SUMO1 and SUMO2/3 modified proteins, the substrates are predominantly nuclear and are often involved in regulation of nucleic acid structure and function. Just how biologically important this demarcation in the substrate preference is remains unclear as SUMO1 knockout mouse studies have suggested that SUMO2/3 can compensate for absent SUMO1 (Evdokimov et al. 2008; Zhang et al. 2008), suggesting considerable redundancy between the SUMO paralogs. However, more recently it was shown that SUMO2 is essential during mouse embryonic development, while SUMO3 was dispensable (Wang et al. 2014), indicating that there are suitable functional differences even between these nearly identical paralogs. Interestingly, it was previously shown that SUMO3 can be phosphorylated at serine 2, while SUMO2 cannot be phosphorylated since it has an alanine at this position (Matic et al. 2008). 
This observation suggests one basis for functional, regulatory, or substrate preference differences between the highly identical SUMO2 and SUMO3 proteins could be related to differences in their own post-translational modification. Much additional work is needed to clarify the common and distinct roles of the various SUMO proteins.

1.2 The Enzymology of Sumoylation

Sumoylation is the enzymatic activity that results in the covalent attachment of SUMO to a large number of proteins, including cellular and viral proteins. This multi-step enzymatic process (Fig. 1.1) includes a heterodimeric activating enzyme, SAE1/2, a monomeric conjugating enzyme, Ubc9, and multiple ligases and isopeptidases (Wilson 2004). SUMOs are translated as precursor forms which are initially processed by specific isopeptidases (SENPs) to remove C-terminal residues and generate a mature SUMO, terminating with a C-terminal diglycine (Johnson et al. 1997). Interestingly, SUMO4 has a proline residue at position 90 that prevents this processing by the SENPs (Owerbach et al. 2005) and instead it is processed only under stress conditions by a stress-induced hydrolase (Wei et al. 2008). The mature forms of SUMOs then interact with the SUMO E1 activating enzyme, SAE1/2. SAE1 is a 346 amino acid polypeptide while SAE2 is 640 amino acids and contains the catalytic cysteine at residue 173; the SUMOs interact exclusively with the SAE2 subunit. The SAE2 subunit also contains a nuclear localization signal that may contribute to the enrichment of sumoylation components in the nucleus. Together, the SAE1 and 2 proteins form a U-shaped heterodimer complex with a large groove that has the ATP-binding motif at the base of the groove (Lois and Lima 2005). Binding of SUMOs to SAE2 positions the SUMO diglycine motif for adenylation, then the activated SUMO can be covalently attached to the catalytic cysteine via a thioester linkage.

Fig. 1.1
Representation of the enzymatic cascade leading to the covalent attachment of SUMO to a substrate protein. The SUMO enzymes are the SENP isopeptidase, the SAE1/SAE2 activating enzyme, the Ubc9 conjugating enzyme, and the SUMO ligases. Attachment of SUMO to SAE2 and Ubc9 is via a thioester linkage to a cysteine residue in the enzymes. Attachment of SUMO to the substrate is via a lysine residue, forming a stable isopeptide bond.

Subsequent to formation of the SAE1/2-SUMO complex, the activating enzyme transfers SUMO to the SUMO E2 conjugating enzyme, Ubc9. Unlike the ubiquitin pathway that contains many E2 enzymes, Ubc9 is the sole conjugating enzyme for SUMO and functions with all 4 SUMOs. Once again, there is a conserved domain motif [αβββββ(ββ)ααα] common to all E2 enzymes known as the ubc superfold (Tong et al. 1997). Within this domain is the catalytic groove that contains the active site cysteine, amino acid 93. Binding of SAE1/2 to Ubc9 allows transfer of the SUMO C-terminus to cysteine 93, again through formation of a thioester linkage, and the structural contexts of the SAE2-Ubc9 interaction are highly conserved across species (Wang et al. 2010). Lastly, Ubc9 transfers SUMO to the substrate protein, where SUMO is covalently linked to a lysine residue through an isopeptide bond between the epsilon amino group of the lysine and the carboxyl group of the C-terminal glycine on SUMO.

The lysine residue utilized for sumoylation commonly falls in the ΨKxE/D motif, where Ψ is a hydrophobic residue (typically Val, Ile, Leu, Met, or Phe), K is the target lysine, x is any amino acid, and the fourth position is an acidic residue (Hay 2005). However, even early studies of sumoylated proteins found that not all were modified at lysines in sequence contexts that match the consensus motif, indicating that alternative sequence features could also specify a particular lysine for SUMO modification (Kamitani et al. 1998b; Rangasamy et al. 2000; Hoege et al. 2002). Subsequently, numerous proteomics approaches have identified hundreds of sumoylated proteins and characterized the SUMO addition sites in many of these substrates, revealing a site selection complexity much greater than the original consensus motif. Zhou et al. used a proteomics approach and found that five of the ten sumoylation sites determined for yeast proteins were in non-canonical sequences (Zhou et al. 2004). Similarly, Chung et al. examined SUMO2 conjugation sites for in vitro sumoylated proteins and found that half the identified sumoylation sites (three of six) were in sequences which did not conform to the ΨKxE/D motif (Chung et al. 2004). These and similar studies confirmed that while the ΨKxE/D motif is often associated with SUMO addition, only about half the identified SUMO substrates have the original consensus motif (Matic et al. 2010). In some cases sumoylation appears fairly promiscuous with many lysines in the substrate capable of serving as SUMO acceptors (Eladad et al. 2005; Chymkowitch et al. 2015; Gonzalez-Prieto et al. 2015), especially after mutation of the predominant SUMO target(s). In these cases, the substrate typically has a SUMO-interacting motif (SIM; see below) that recruits the sumoylation machinery (Chang et al. 2011; Meulmeester et al. 2008).
However, more commonly these other SUMO acceptor lysines fall within alternative SUMO conjugation motifs, including the inverted (E/DxKΨ) motif (Matic et al. 2010), the hydrophobic (ΨΨΨKxE) motif (Matic et al. 2010), the phosphorylation-dependent (PDSM; ΨKxExxSPP) motif (Hietakangas et al. 2006), the negatively charged amino acid-dependent (ΨKxExxEEEE) motif (Yang et al. 2006b), the phosphorylated (ΨKxSPP) motif (Picard et al. 2012), and the extended phosphorylation (ΨKxSPPSPxxxSPP) motif (Picard et al. 2012). Collectively, this array of motifs helps explain the large number of lysines capable of being sumoylated and may contribute to paralog-specific modification differences for individual substrates.
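
As a rough illustration of how the ΨKxE/D consensus and the inverted E/DxKΨ motif described above can be located in a primary sequence, the following minimal Python sketch scans a one-letter amino acid string with regular expressions. The function name, the constants, and the toy peptide are hypothetical and chosen only for illustration; Ψ is restricted to the five residues named in the text. A match marks a candidate acceptor lysine only, since, as noted above, many real SUMO sites do not conform to these patterns.

```python
import re

# Residues the text lists as typical for the hydrophobic Ψ position.
PSI = "VILMF"
# The 20 standard amino acids, used for the unconstrained "x" position.
ANY = "ACDEFGHIKLMNPQRSTVWY"

# Zero-width lookaheads so that overlapping motifs are all reported.
CONSENSUS = re.compile(rf"(?=([{PSI}]K[{ANY}][ED]))")  # ΨKxE/D
INVERTED = re.compile(rf"(?=([ED][{ANY}]K[{PSI}]))")   # E/DxKΨ


def find_sumo_motifs(sequence):
    """Return sorted (lysine_position, motif_type, matched_text) tuples.

    Positions are 1-based and refer to the candidate acceptor lysine.
    """
    seq = sequence.upper()
    hits = []
    for m in CONSENSUS.finditer(seq):
        # The lysine is the second residue of a consensus match.
        hits.append((m.start() + 2, "consensus", m.group(1)))
    for m in INVERTED.finditer(seq):
        # The lysine is the third residue of an inverted match.
        hits.append((m.start() + 3, "inverted", m.group(1)))
    return sorted(hits)


# Hypothetical toy peptide, not a real protein sequence.
toy = "MAVKDEAAEAKLGG"
for position, kind, text in find_sumo_motifs(toy):
    print(position, kind, text)
```

The longer phosphorylation-dependent and negatively charged motifs could be added as further patterns in the same way, although the sketch makes no attempt to model phosphorylation state.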

Unlike ubiquitinylation, which absolutely requires an E3 ubiquitin ligase for transfer of ubiquitin to the substrate, sumoylation occurs readily in vitro without a ligase requirement (Melchior 2000). Nonetheless, several SUMO ligases have now been identified, including SP-RING type ligases such as the PIAS family (Johnson and Gupta 2001) and MMS21 (Potts and Yu 2005). Members of this family share sequence homology with the RING domain of ubiquitin RING ligases. The SP-RING domain directly interacts with Ubc9, inducing a conformational change that enhances transfer of SUMO from Ubc9 to the substrate (Rytinki et al. 2009). Additional identified SUMO ligases include RanBP2 (Pichler et al. 2002), Pc2 (Kagey et al. 2003), and TOPORS (Weger et al. 2005), as well as a few other proteins that appear to facilitate sumoylation but whose mechanisms are poorly defined. Given that there are roughly 600 ubiquitin ligase genes in the human genome (Deshaies and Joazeiro 2009), it is quite likely that many more SUMO ligases remain to be identified. Generally, all these SUMO ligases enhance sumoylation both in vitro and in vivo, and influence substrate selection (Gareau and Lima 2010). For instance, PIAS acts as a SUMO ligase, preferentially targeting the tumor suppressor p53, c-Jun, STAT1, and the nuclear androgen receptor AR (Schmidt and Muller 2002; Ungureanu et al. 2003; Sachdev et al. 2001). RanBP2 stimulates sumoylation of the promyelocytic leukemia protein (PML), the nuclear body SP100 protein, and the histone deacetylase HDAC4 (Pichler et al. 2002), while Pc2 is the unique E3 ligase for the transcriptional factor co-repressor CtBP (Kagey et al. 2003). In addition to enhancing the overall sumoylation reaction and substrate selection, these ligases likely also influence preferential utilization of the SUMO paralogs.

The SENPs, the SUMO isopeptidases, play a dual role; they are involved in the maturation of SUMO and in the de-conjugation of SUMO from its target proteins (Hang and Dasso 2002; Gong et al. 2000). There are 6 SENPs that function with SUMO, 1–3 and 5–7 (there is no SENP4, and SENP8 is a NEDD8 protease). In mammalian cells these enzymes are differentially located, with SENP1 located at the PML bodies, SENP6 in the cytoplasm, SENP3 in the nucleolus, and SENP2 at the nuclear pore complexes (Gong and Yeh 2006). Therefore, it appears that de-sumoylation of conjugates is possible at different subcellular locations, and access of individual substrates to specific SENPs may provide an additional level of regulation. Additionally, specific functional differences have been observed among the 6 SENPs regarding their maturation and deconjugation activities. While SENP1 and SENP2 can generally process all of the SUMO1–3 precursors (Nayak and Muller 2014), SENP5 preferentially processes the SUMO3 precursor (Di Bacco et al. 2006). With regard to deconjugation, SENP1 functions primarily with SUMO1 conjugates (Sharma et al. 2013), while the other SENPs strongly prefer SUMO2/3 substrates. Additionally, SENP6 and SENP7 are most adept at disassembly of SUMO2/3 chains (Lima and Reverter 2008; Drag et al. 2008). Deletion of the SENP genes, like deletion of Ubc9 in yeast, stops cell cycle progression and further highlights that reversible sumoylation is an essential and critical function in the cell life cycle (Li and Hochstrasser 1999). Overall, the diversity and specificity of SENPs undoubtedly help regulate the dynamic and reversible sumoylation process.

1.3 Sumoylation Functions

Functionally, SUMO is a more diverse modifier than ubiquitin. Unlike ubiquitinylation, which has a major role of targeting proteins for proteasome degradation, addition of the SUMO moiety does not directly target proteins to the proteasome. Instead, there are examples of substrates where sumoylation blocks proteasomal degradation by competing with ubiquitinylation for a common lysine residue (Desterro et al. 1998; Klenk et al. 2006; Escobar-Ramirez et al. 2015). Since over 25% of the SUMO sites in human proteins are known ubiquitination sites (Hendriks et al. 2014), regulation of degradation through such competition may be more common than anticipated. Intriguingly, lysine residues are also targets for modification by acetylation and methylation, so sumoylation may also be competing with those events to regulate protein activity as has been shown for IκBα (Desterro et al. 1998), delta-lactoferrin (Escobar-Ramirez et al. 2015), and STAT5 (Van Nguyen et al. 2012).

Further cross-talk between the SUMO and ubiquitin systems is mediated by SUMO-Targeted ubiquitin ligases (STUbLs) (Xie et al. 2007; Prudden et al. 2007; Sun et al. 2007; Uzunova et al. 2007). This novel class of ubiquitin ligases functions by specifically interacting with SUMO moieties on sumoylated proteins, thereby causing ubiquitination and subsequent degradation (Perry et al. 2008). This interaction depends on SUMO-interacting motifs (SIMs) present on the STUbLs. The canonical SIM is a hydrophobic motif with the consensus V/I-x-V/I-V/I (Song et al. 2004; Hecker et al. 2006), and the interaction between the SIM and the SUMO is through a β strand of the SIM and the β2 strand of SUMO (Sekiyama et al. 2008; Namanja et al. 2012). Both of the human STUbLs, RNF4 and RNF111, contain at least 3 SIM motifs, so they preferentially target proteins with poly-SUMO signals, either multiple SUMO moieties or SUMO chains (Tatham et al. 2008; Erker et al. 2013). Lastly, there is at least one example of a viral protein whose stability is indirectly tied to sumoylation levels (Wu et al. 2009). Through an undefined mechanism, the stability of the human papillomavirus E2 protein is greatly enhanced when overall sumoylation levels increase, suggesting that further examples of cross-talk between the ubiquitin and SUMO pathways await discovery, and that these two systems may have an even richer interplay than currently imagined (see Chap. 6).
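
The canonical SIM consensus V/I-x-V/I-V/I given above lends itself to a simple sequence scan. The sketch below, in Python, lists candidate SIMs in a one-letter amino acid string and checks whether a sequence carries several of them, as the text describes for proteins that recognize poly-SUMO signals. The function names, the default threshold of three, and the example strings are hypothetical; a pattern match is only a candidate and says nothing about actual SUMO binding.

```python
import re

# The 20 standard amino acids, used for the unconstrained "x" position.
ANY = "ACDEFGHIKLMNPQRSTVWY"
# Canonical SIM consensus from the text: V/I-x-V/I-V/I.
# A zero-width lookahead is used so that overlapping candidates are reported.
SIM = re.compile(rf"(?=([VI][{ANY}][VI][VI]))")


def find_sims(sequence):
    """Return (start_position, matched_text) tuples with 1-based positions."""
    return [(m.start() + 1, m.group(1)) for m in SIM.finditer(sequence.upper())]


def has_multiple_sims(sequence, minimum=3):
    """Return True if at least `minimum` candidate SIMs are found."""
    return len(find_sims(sequence)) >= minimum


# Hypothetical toy sequence, not taken from RNF4 or RNF111.
print(find_sims("GGVEVIGGIDIVAA"))
```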

In contrast to its modest role in protein stability, it is now clear that SUMO has a major role in transcriptional regulation (see Chaps. 2 and 3), both through direct modification of individual transcription factors and co-factors (Verger et al. 2003; Garcia-Dominguez and Reyes 2009), and through chromatin remodeling (Cubenas-Potts and Matunis 2013). For most transcription factors, sumoylation reduces their transactivation capacity, though enhanced transcriptional activity has also been demonstrated for a few substrates, including heat shock factors (Goodson et al. 2001; Hong et al. 2001), Oct4 (Wei et al. 2007), and Smad4 on some promoters (Long et al. 2004). The negative transcriptional effects can be due to changes in transcription factor stability and/or subcellular localization, particularly the recruitment of sumoylated transcription factors into PML nuclear bodies as has been observed for HIPK2 (Kim et al. 1999), Sp3 (Ross et al. 2002), NACC1 (Tatemichi et al. 2015), and other proteins (Sahin et al. 2014). Alternatively, SUMO modification can have more global effects on transcription by affecting chromatin remodeling. Examples are plentiful of sumoylation facilitating the recruitment and/or modification of various remodeling enzymes including histone deacetylases (HDACs) (Wagner et al. 2015; de la Vega et al. 2012; Citro and Chiocca 2013; Girdwood et al. 2003; Yang et al. 2003), histone demethylases (Huang et al. 2016; Bueno and Richard 2013), and methyltransferases (Spektor et al. 2011; Lee and Muller 2009; Riising et al. 2008), as well as directly modifying histones (Shiio and Eisenman 2003; Nathan et al. 2006; Zheng et al. 2015; Dhall et al. 2014). Clearly, all of these mechanisms could be reversed by desumoylation with SENPs, leading to dynamic and controllable effects on transcription of individual or groups of genes.
Thus, sumoylation effects on transcriptional activity would reflect the overall dynamics of sumoylation/desumoylation that may vary with cell cycle, cell growth conditions, and disease state.

In addition to regulating transcriptional activity, sumoylation also has an important regulatory role for other nuclear functions, including RNA processing (see Chap. 2), genome maintenance and repair (see Chaps. 4 and 5), and nucleocytoplasmic transport (see Chap. 7). More recently, non-nuclear functions of sumoylation have been identified (Wasik and Filipek 2014), and Chaps. 8 and 9 will explore the role of SUMOs in regulating ion channel activity and metabolic pathways, respectively. Because of this pleiotropic ability to modify numerous proteins and affect transcriptional activity or cellular environment on a global scale, sumoylation is now recognized as a regulatory process involved in mitosis (Chap. 10), meiosis (Chap. 11), differentiation and development (Chap. 12), and senescence (Chap. 13). While much of the focus in the sumoylation field is on vertebrates, sumoylation is equally important for plants (Chap. 14) and invertebrates (Chap. 15). Much progress has been made in recent years in understanding the roles of sumoylation in these diverse areas of cell biology, particularly through global proteomics efforts (Tammsalu et al. 2015; Eifler and Vertegaal 2015; Hendriks et al. 2015a; Xiao et al. 2015; Yang and Paschen 2015), but much work remains, and for most of these processes there still are many more questions than answers.

One recently emerging theme that likely contributes to the ability of sumoylation to control the cellular processes mentioned above is the coordinate modification of functionally related groups of proteins in response to specific stimuli (Jentsch and Psakhye 2013; Raman et al. 2013). Overall increases in cellular sumoylation levels have long been seen in response to various kinds of stress (the SUMO stress response, SSR) (Zhou et al. 2004; Manza et al. 2004; Tempe et al. 2008), and more recent studies are now revealing that much of this increased sumoylation is associated with networked proteins (Lewicki et al. 2015; Castro et al. 2012; Hendriks et al. 2015b; Xiao et al. 2015). For example, DNA damage has been shown to elicit the sumoylation of numerous proteins in the homologous recombination system (Psakhye and Jentsch 2012). Many of the proteins in these networks contain SIM motifs, so increased sumoylation would likely contribute to enhanced interactions and stability of these multi-protein complexes. Thus, by subtle interplay of sumoylation and desumoylation these protein complexes and functional pathways could be fine-tuned to produce rapid and appropriate levels of response to changing cellular conditions. Interestingly, at least in vertebrates, it is SUMO2/3 that are mostly involved in SSR, and the intracellular pools of free SUMO2/3 are rapidly lost after exposure to stress-inducing agents as SUMO2 and SUMO3 become largely conjugated to their substrates.

Lastly, given the breadth of SUMO-modified targets and the critical pathways involved, it is not surprising that dysregulation of the SUMO system can contribute to disease states. Increasing evidence links over- or under-expression of various sumoylation components to diseases as diverse as neurodegeneration (Chap. 16), cancer (Chap. 17), diabetes (Chap. 18), craniofacial disorders (Chap. 19), and vascular disease (Chap. 20). It is also now apparent that utilization and/or modulation of the host sumoylation system is an important aspect of many infectious diseases, both viral (Chap. 21) and bacterial (Chap. 22). This emerging recognition of a role for sumoylation in disease and infection is exciting as it may ultimately offer new insights for diagnosis, therapeutics, and prevention. The next several years should bring exciting new insight into the role of sumoylation, not only in fundamental cellular processes, but also in applications to understanding and managing disease states.

1.4 Conclusion

In the 20 plus years since its discovery, SUMO has gone from an obscure and functionally unknown protein to one that is recognized as a key regulator of multiple nuclear and cytoplasmic events. The principal components of this modification system have been identified, their basic structures elucidated, and the general features of their enzymology understood. Thanks to the combination of individual targeted protein studies and more global proteomics approaches, hundreds of sumoylation targets are now known, providing a rich resource for subsequent functional studies. The sumoylation system has been shown to be an important player in many biological processes, such as cellular differentiation, transcriptional regulation, and cell growth (Deyrieux et al. 2007; Gill 2005; Ihara et al. 2007). Perturbing this biological system changes the cellular response to diverse signaling pathways (Sharrocks 2006) and likely leads to disease. In the chapters that follow, the role of sumoylation in a variety of cellular processes will be explored. The focus will range from effects on molecular targets through cell processes to the organismal level. While many exciting questions remain unanswered, by spanning from molecules to multicellular systems, the full impact and profound significance of the sumoylation system should become apparent. We hope that both newcomers to this field, as well as veterans, will find this comprehensive compilation of state-of-the-art reviews on current sumoylation topics useful and insightful.