O-GlcNAc is a Universal Post-translational Modification in All Metazoans

O-GlcNAc-ylation is a post-translational modification (PTM) in which an uncharged acetylated glucosamine is attached by glycosidic linkage to the hydroxyl group of serine and threonine residues.4 Though not as well-known as other PTMs, such as phosphorylation or ubiquitination, O-GlcNAc-ylation diverts approximately 5% of all glucose from cellular metabolism to the nutrient-dependent hexosamine pathway.4,6 The GlcNAc moiety of the nucleotide sugar substrate, uridine diphosphate N-acetylglucosamine (UDP-GlcNAc), is attached to the loops and intrinsically disordered regions of approximately 4000 target proteins4 by O-GlcNAc transferase (OGT) and removed from them by O-GlcNAcase (OGA) in eukaryotic cells.6,58 Splice variants of the OGT and OGA genes have been shown to sequester the enzymes to the cytoplasm, nucleus, and mitochondria.23,25,42 OGT is a highly conserved glycosyltransferase, retaining up to 80% identity across a variety of eukaryotic organisms. However, there is little sequence or structural similarity between OGT and other glycosyltransferases,36 suggesting an early evolutionary divergence and a fundamental cellular function. Despite this dissimilarity, OGT does contain tetratricopeptide repeats (TPRs), a common protein motif that is indicative of a regulatory complex36 and mediates protein–protein interactions.3 In general, OGT has a strong preference for intrinsically disordered regions, possibly because there are many binding pockets along the enzyme’s superhelical architecture, which is composed of 11.5 TPRs. A subset of these pockets may be responsible for recognizing a single region on a substrate, which may not necessarily be structured in a distinct linear order. These intrinsically disordered regions are proposed to contort into a shape in which a sufficient number of their regions are recognized by the binding pockets on OGT.31

Extensive studies have been conducted to characterize a consensus sequence for O-GlcNAc-ylation. Approximately 1750 sites of O-GlcNAc-ylation have been experimentally determined in the murine synaptosome, in which a 19- and a 25-amino-acid sequence showed preference for O-GlcNAc-ylation with 439 and 130 occurrences, respectively. Additionally, in a moderate number of O-GlcNAc-ylation sites, there is a proline two or three residues toward the amino terminus or, less frequently, a valine one or three residues toward the amino terminus. Previously, PVXS/T had been suggested as a possible O-GlcNAc-ylation motif; however, this motif occurred at less than 20% of all identified sites.14,62 These results might suggest the absence of a universal consensus motif.4,68 Another factor that makes it challenging to identify such a motif is the transience of the O-GlcNAc modification. Previous studies revealed that the turnover rate of the O-GlcNAc modification is much faster than that of the modified proteins, but still not as fast as that of phosphorylation in signaling pathways.9,36,56,68
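
To make the motif statistic above concrete, the following is a minimal Python sketch of how one might scan a protein sequence for the proposed PVXS/T motif and ask what fraction of known sites it covers. The function names, the toy sequence, and the site list are all hypothetical and serve only as illustration; this is not the procedure used in the cited studies.

```python
import re

# Assumption: "X" in PVXS/T means any single residue, written here as [A-Z].
# The modified serine/threonine is the last residue of the match.
PVXST = re.compile(r"PV[A-Z][ST]")

def find_pvxst_sites(sequence):
    """Return 1-based positions of S/T residues that end a PVXS/T match."""
    # m.start() is the 0-based index of the proline; the S/T sits 3 residues
    # later, so its 1-based position is m.start() + 4.
    return [m.start() + 4 for m in PVXST.finditer(sequence.upper())]

def fraction_matching(sequence, known_sites):
    """Fraction of known modification sites (1-based) that fit the motif."""
    if not known_sites:
        return 0.0
    motif_sites = set(find_pvxst_sites(sequence))
    hits = sum(1 for site in known_sites if site in motif_sites)
    return hits / len(known_sites)

if __name__ == "__main__":
    toy_sequence = "MAPVKSGGTPVATLLS"   # made-up sequence, not a real protein
    toy_known_sites = [6, 9, 16]        # made-up "experimentally known" sites
    print(find_pvxst_sites(toy_sequence))
    print(fraction_matching(toy_sequence, toy_known_sites))
```

A low fraction from such a scan across a real data set would be in line with the observation that PVXS/T accounts for only a minority of identified sites.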

Many cellular processes are known to be targeted by both OGT and OGA, including higher order chromatin structure, histones, RNA polymerase II, transcription factors, lysosomal degradation, T cell activation, insulin signaling, glucose metabolism, and cell cycle progression.4,68 One of the most widely studied processes affected by O-GlcNAc-ylation is the cellular stress response. In model organisms, such as Drosophila, C. elegans, and D. rerio, intracellular O-GlcNAc levels have been shown to increase in response to oxidative stress, osmotic stress, ethanol, UV light, and heat shock6,34,52,53,69 as well as improve cell survival during the normal life cycle69 and during development.52 Furthermore, dysregulated O-GlcNAc-ylation has also been linked to diabetes, cardiovascular disease, neurodegenerative disorders, and cancer.5,8,11,38,39,48 For instance, upregulated O-GlcNAc-ylation resulting from chronically high glucose has been identified as a key player in insulin resistance, a main indicator of diabetes.24 It most likely interferes downstream of O-GlcNAcase46 in the insulin signaling cascade, resulting in increased O-GlcNAc-ylation of insulin receptor substrate 1 and beta-catenin,65 as well as in glucose metabolism, at the level of glycogen synthase.50

Due to the close relationship with glucose metabolism, it is likely that cellular nutrition status influences the level of O-GlcNAc-ylation of target proteins, such as histones and non-histone proteins like Polycomb and Trithorax, which modify chromatin structure.7,18,21 In fact, OGT is a Polycomb group protein, originally characterized as sxc in D. melanogaster, whose chromatin targets coincide with Polycomb binding sites.57 Counterintuitively, glucose deprivation has also been shown to cause an increase in O-GlcNAc modification in HepG2 human liver cells,61 indicating that the relationship between O-GlcNAc and glucose is not linear but rather homeostatic, with O-GlcNAc-ylation compensating for disruptions in glucose abundance. Further complicating this relationship, a class of proteins that forms the majority of the nuclear pore complex (NPC) appears to be O-GlcNAc-ylated to some degree regardless of glucose abundance.41,70 These proteins would therefore not be good nutrition sensors without extremely sensitive tools, and their O-GlcNAc-ylation most likely serves some function beyond glucose metabolism.

The NPC is a Gatekeeper Between the Nuclear and Cytoplasmic Compartments

The NPC is a 120–125 MDa1,2 protein complex composed of approximately 30 unique nucleoporins (Nups) in multiple copies, embedded in the nuclear envelope of eukaryotic cells. The NPC allows for efficient trafficking, up to 1000 transport events per second,54 of soluble proteins and RNAs between the cytoplasmic and nuclear compartments.1 With eight-fold radial symmetry,1 the NPC possesses an hourglass-shaped central channel that is ~50 nm in diameter at its narrowest point1 and ~200 nm in total length.2 The central channel of the NPC is filled with flexible, intrinsically disordered Nups containing many phenylalanine-glycine (FG) repeats.13 Critically, these FG Nups form a selectivity barrier that allows the passive diffusion of signal-independent cargo (<40 kDa)1 and the facilitated transport of signal-dependent cargo with the assistance of transport receptors. Both transport modes could account for the translocation of various proteins, mRNPs, snRNPs, pre-ribosomal subunits, and tRNAs between the nucleus and cytoplasm in eukaryotic cells.1 Unfortunately, it is still unclear exactly how the FG Nups establish the NPC’s selectivity barrier. It appears that hydrophobic interactions between the FG Nups are required to make it energetically unfavorable for large passively diffusing molecules to pass through the NPC.55 In contrast, if large cargos contain a nuclear localization sequence or nuclear export signal, they are recognized by transport receptors, which can interact directly with FG Nups to complete nucleocytoplasmic transport of these cargos.1

Glycosylation in the NPC

Early immunostaining and electron microscopy studies revealed that the FG Nups in the central scaffold of the NPC are heavily O-GlcNAc-ylated.15,27,41 Soon after, protein purification and mass spectrometry studies further showed that all FG Nups,41 and also some structural Nups, were O-GlcNAc modified in the NPC.41,66 It is believed that, fundamentally, O-GlcNAc-ylation is necessary in the NPC to prevent degradation of modified Nups.70 Indeed, inhibition or knockout of OGT caused the loss of the commonly O-GlcNAc-ylated Nup93 and several FG Nups, including Nup62, Nup153, Nup214, and Nup358, from the NPC by increasing their ubiquitination and degradation.70 The suggested mechanism is that O-GlcNAc could prevent degradation by disrupting inappropriate aggregation of the Nups.70 Lending credibility to this conclusion, in vitro experiments showed that a hydrogel composed of O-GlcNAc-ylated Nup98 was devoid of amyloid-like beta structures, which were characteristic of Nup-hydrogels without O-GlcNAc.37 Consequently, loss or inhibition of OGT and the subsequent loss of Nups led to an inability of NPCs to exclude 70-kDa dextran molecules, as well as a decrease in the initial rate of facilitated transport.70 Similar results were obtained in studies of X. laevis nuclei depleted of O-GlcNAc-ylated FG Nups.29 Another early study by Finlay et al. focused on facilitated transport through NPCs that were devoid of O-GlcNAc-ylated Nups. They showed that cargos assisted by transport receptors could traverse these modified NPCs, but took much longer than in untreated NPCs.16 It is clear that O-GlcNAc is necessary for proper NPC permeability; however, it is still difficult to discern whether these results stem from the loss of O-GlcNAc or from the loss of Nups caused by inhibiting or knocking out OGT. The former would lead to a more condensed and rigid FG-Nup barrier, while the latter would result in a less crowded and leaky NPC channel.

Other studies have been conducted to probe the relationship between O-GlcNAc-ylation and NPC permeability.17,27,54,55,60 Labokha et al. confirmed that O-GlcNAc-ylated Nup98 hydrogels efficiently admitted freely diffusing molecules smaller than 40-kDa, while excluding those larger than 40-kDa. In contrast, non-O-GlcNAc-ylated Nup98-hydrogels excluded all passively diffusing molecules regardless of size.37 Also, O-GlcNAc-depleted Nup98-hydrogels almost completely inhibited the penetration of a major transport receptor, Importin β1.37

A “Balance” Model for the Roles of O-GlcNAc in the NPC

A critical question raised by the above in vitro facilitated transport studies is whether O-GlcNAc is directly involved in the specific hydrophobic interactions between FG Nups and transport receptors. Interestingly, other studies showed that modification of FG Nups with a disaccharide, O-linked N-acetyllactosamine (O-LacNAc), instead of O-GlcNAc, had no effect on the permeability or selectivity of NPCs.26,47 This suggests that the function of O-GlcNAc in the NPC may be less a specific interaction and more the conferral of a qualitative characteristic on the FG Nups, namely increased fluidity, which could prevent them from congealing through hydrophobic forces. Very recently, single-molecule tracking and super-resolution three-dimensional mapping of passive diffusion and facilitated transport through the NPC have revealed distinct spatial transport routes for these two transport modes and a heterogeneous spatial distribution of FG domains inside the NPC.43,45 Therefore, we propose a “balance” model for the roles of O-GlcNAc in the native NPC’s permeability and selectivity as follows (Fig. 1). Since the FG Nups are anchored in the wall of the NPC with rotational symmetry and their FG domains extend inward, passive diffusion of small molecules occurs down the central axial channel, where the FG domains are most flexible and sparsely populated. On the other hand, facilitated transport mainly happens in the peripheral regions around the central axial channel, where the FG Nups are denser and more structured due to hydrophobic forces. Both of these transport routes are realized by an FG-Nup barrier in a configuration that is neither too dense nor too sparse, which could be a result of cooperation between FG-FG hydrophobic interactions and O-GlcNAc-ylation of FG Nups (Fig. 1a).

When there is too little O-GlcNAc-ylation of FG Nups in the NPC, the FG-Nup selectivity barrier might be altered or might even collapse into a more rigid amyloid-like beta structure, which would reduce its steric hindrance of the passive diffusion of signal-independent small molecules and shield the hydrophobic interactions required for facilitated transport of signal-dependent cargos through the NPC (Fig. 1b).

Figure 1

Model of O-GlcNAc’s role in the NPC. (a) O-GlcNAc-ylation (orange dots) of the NPC prevents excessive hydrophobic interactions between the FG Nups (black curves). This weakens the hydrophobic interactions enough for the FG domains to expand, which also allows the passive diffusion (yellow arrow) of signal-independent small molecules (<40 kDa) and exposes FG repeats for interactions with karyopherins during facilitated transport (light blue arrow). (b) If FG Nups in the NPC are insufficiently O-GlcNAc-ylated, they bind to each other tightly through hydrophobic interactions and collapse into a tight secondary structure. The passive diffusion channel is widened, admitting larger molecules (>40 kDa), while facilitated transport is prevented due to a lack of exposed FG repeats.

This model could be used to further understand how the cell upregulates mass flow between the nucleus and cytoplasm, for example, in the transcription factor re-localization during the oxidative stress response.33 Recently, it has been shown that several critical FG Nups involved in facilitated transport, including Nup62, Nup214 and Nup153, were heavily O-GlcNAc-ylated in response to oxidative stress.12,35 Crampton et al. also concluded that there was net inhibition of Crm1-mediated nuclear export following oxidative stress, suggesting some key interactions in the NPC were altered. Specifically, the Crm1/Nup62 and Crm1/Nup153 interactions were both strengthened while the interactions between FG Nups were weakened due to oxidative stress followed by O-GlcNAc-ylation.12 As shown in our model, a balance between O-GlcNAc-ylation of FG Nups and FG-FG hydrophobic interactions is critical for cells to handle oxidative stress.

Approaches for the Detection of O-GlcNAc in the NPC

As is the case with many fields of biological research, limited technological advancement in the tools used to study O-GlcNAc is one of the main impediments to progress.24 Early contributions to the field of O-GlcNAc biology, especially as it pertains to the NPC, were made primarily using affinity labeling techniques coupled with mass spectrometry for identification or electron microscopy for spatial localization.19,22,51,59 Wheat germ agglutinin (WGA) was often used as the affinity reagent22,51 due to its higher affinity for O-GlcNAc residues than for sialic acid residues.22 However, an array of monoclonal antibodies (mAbs), with more specificity against single O-GlcNAc-ylated proteins, has since replaced WGA to avoid WGA’s non-selective affinity and binding for multiple O-GlcNAc residues.28,49,52,53,59 Electron microscopy studies have shown that these mAb-reactive Nups primarily reside on the nuclear and cytoplasmic faces of the NPC.59

Another popular method of O-GlcNAc identification is a combination of metabolic and biochemical approaches. Since there is flexibility in the type of substrates that OGT can use, O-linked sugars bound to a chemically reactive azide, alkyne, or diazirine group can be incorporated by OGT post-translationally into proteins. The reactive groups can then be chemically modified in several ways such as attaching a fluorophore for visualization,10,32,64 an affinity agent such as biotin for purification,32 or photocrosslinking the diazirine with nearby molecules to determine novel interactions by mass spectrometry.41 Several of these “click” chemistry approaches, termed as such because of their ease of use and speed, have offered femtomolar levels of sensitivity10 and high throughput volumes.32

Also by utilizing click chemistry labeling, super-resolution fluorescence microscopy, direct Stochastic Optical Reconstruction Microscopy (dSTORM)63 in particular, has been adapted to study the O-GlcNAc distribution on the plasma membrane with high localization precision. It was found that O-GlcNAc is distributed homogeneously across the plasma membrane, in contrast to the view that glycosylated nanodomains exist on the membrane.40 Given this localization precision, it is promising that super-resolution microscopy could be adapted to study O-GlcNAc labeling at the level of the NPC and determine its in vivo localization.

There are several bioinformatics tools available to predict O-GlcNAc-ylation sites.44 An early approach called YinOYang extracted amino acid sequence features from 40 experimentally determined O-GlcNAc sites and found that the serine and threonine residues modified with O-GlcNAc are often in close proximity to proline and valine, distant from leucine and glutamine, and upstream of other serines.20 As more experimentally determined O-GlcNAc sites were found, O-GlcNAcScan was developed, which used a training data set of about 400 positive and 30,000 negative sites of O-GlcNAc-ylation to determine potential amino acid sequences whose serine/threonine residues are O-GlcNAc-ylated.67 While these two approaches have accurately identified O-GlcNAc-ylation sites, they are limited because no consensus site has been determined and the negative training data set comprises every other serine and threonine residue that has not been experimentally shown to be O-GlcNAc-ylated. There will inevitably be sites in this data set that have yet to be identified as positive.30 Recently, a newer approach called O-GlcNAcPRED, which uses a more efficient algorithm for amino acid sequence feature extraction and an additional parameter that considers the physicochemical properties of that amino acid sequence, has been shown to outperform both previous predictors.30 It appears that there might actually be no consensus sequence for O-GlcNAc-ylation, but rather a combination of amino acid sequence and biochemical context that marks a site for O-GlcNAc-ylation.
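
The caveat about negative training data can be illustrated with a short Python sketch of the kind of sequence-window extraction that typically precedes site prediction. It is a generic illustration under our own assumptions, not the feature extraction used by YinOYang, O-GlcNAcScan, or O-GlcNAcPRED; the function name, window size, padding character, and toy input are hypothetical. Sites without experimental evidence are deliberately labeled "unlabeled" rather than "negative", since some of them may simply not have been detected yet.

```python
def extract_windows(sequence, known_sites, flank=3, pad="-"):
    """Collect a sequence window around every serine/threonine.

    sequence    : protein sequence in one-letter code
    known_sites : 1-based positions with experimental O-GlcNAc evidence
    flank       : residues kept on each side (hypothetical default)
    pad         : filler for windows that run past either terminus

    Returns a list of (position, window, label) tuples.
    """
    sequence = sequence.upper()
    padded = pad * flank + sequence + pad * flank
    positives = set(known_sites)
    rows = []
    for i, residue in enumerate(sequence):
        if residue in "ST":
            # Index i in the sequence is index i + flank in the padded string,
            # so this slice is centered on the candidate residue.
            window = padded[i : i + 2 * flank + 1]
            label = "positive" if (i + 1) in positives else "unlabeled"
            rows.append((i + 1, window, label))
    return rows

if __name__ == "__main__":
    # Made-up sequence and site list, for illustration only.
    for row in extract_windows("MAPVKSGGTPVATLLS", known_sites=[6]):
        print(row)
```

Windows of this kind could then be encoded as sequence or physicochemical features for whichever classifier is chosen; treating the "unlabeled" rows as true negatives is exactly the simplification the text warns about.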

Conclusion

The O-GlcNAc modification is an essential and evolutionarily conserved post-translational modification that has housekeeping duties and functions as a component of the cellular stress response. At the level of the NPC, it may maintain the fluidity of FG Nups which promotes steric interference of passively diffusing molecules while exposing FG Nups for interactions with transport receptors in facilitated transport. With further development of new tools to study O-GlcNAc-ylation, it will be possible to distinguish the full complement of responsibilities O-GlcNAc has in the NPCs and beyond.