Introduction

Protein glycosylation is a protein modification in which one or more sugar residues are attached to amino acids, typically Asn, Thr, Ser, or Trp (Spiro 2002). Such modifications are present in all domains of life. Protein glycosylation is important for protein folding, stability, activity, binding, and secretion, and the mutation of the enzymes involved in the process may result in severely deleterious phenotypes or diseases in humans (Helenius and Aebi 2004; Haeuptle and Hennet 2009).

Protein glycosylation is catalyzed by glycosyltransferases (GTs), which transfer the sugar from a donor to an acceptor. In protein glycosylation, the acceptor is either the protein or a sugar already attached to the protein. The sugar donors are usually activated nucleotide sugars or lipid phosphate-linked sugars (Lairson et al. 2008). In forming the glycosidic bond between sugar and acceptor, there are two types of glycosylation mechanisms, either inverting or retaining, which refer to the inversion or retention of the stereochemistry of the anomeric carbon of the transferred sugar (Lairson et al. 2008). Inverting glycosyltransferases perform a simple SN2 type reaction: the enzyme catalyzes the nucleophilic attack of oxygen, nitrogen, carbon, or sulfur on the anomeric carbon with simultaneous cleavage of the bond to the phosphate-containing leaving group (Lairson et al. 2008; Chang et al. 2011). Retaining glycosyltransferases use an SNi-like reaction having an oxocarbenium-phosphate ion pair intermediate and an interaction between the leaving group and the nucleophile on the same face of the sugar (Hurtado-Guerrero and Davies 2012; Breton et al. 2012; Lairson et al. 2008; Yu et al. 2015; Lee et al. 2011). This mechanism is predicted for all retaining GTs (Hurtado-Guerrero and Davies 2012; Breton et al. 2012; Lairson et al. 2008).

Based on their structural fold, glycosyltransferases fall into one of three superfamilies: GT-A, GT-B, or GT-C (Lairson et al. 2008). GT-A enzymes form a domain composed of a seven-stranded β-sheet surrounded by α-helices and a small, two-stranded β-sheet that is sometimes described as two “closely abutted” Rossmann-like domains (Lairson et al. 2008; Chang et al. 2011). GT-B enzymes form two Rossmann-like domains, each with a 6- or 7-stranded β-sheet surrounded by α-helices; these domains are separated by a flexible linker with the active site at the cleft between the domains (Chang et al. 2011; Lairson et al. 2008). GT-As are typically metal-dependent, having a conserved DxD motif involved in divalent cation binding, while GT-Bs are typically metal-independent (Lairson et al. 2008; Chang et al. 2011). The metal or a positively charged residue stabilizes the negatively charged leaving group, which universally contains a phosphate (Lairson et al. 2008). The GT-C enzymes are all membrane proteins predicted to contain 8 to 13 transmembrane helices (Lairson et al. 2008). The known GT-C enzymes all use lipid phosphate-linked sugar donors, which makes sense considering their hydrophobic properties. All known GT-Cs use an inverting mechanism, while GT-As and GT-Bs can be either inverting or retaining (Lairson et al. 2008; Chang et al. 2011). Because the structural study of membrane proteins is difficult, GT-Cs remained less well characterized until recently.

Relative to the other two superfamilies, GT-C has far fewer members. For example, humans have over 200 glycosyltransferases, of which GT-C enzymes account for only about 10%. GT-C enzymes are mainly involved in N-glycosylation, glypiation (addition of a glycosylphosphatidylinositol, or GPI, anchor), O-mannosylation, and C-mannosylation of proteins. Some GT-C enzymes have been found to be involved in lipid modifications. GT-C members mainly belong to 14 families within the Carbohydrate Active enZyme (CAZy) database. Until now, structural work has been reported for only three CAZy families: GT39, GT66, and GT83, which catalyze protein O-mannosylation, protein N-glycosylation, and lipid A glycosylation, respectively (Table 6.1). Among them, the enzymes that catalyze the transfer of an oligosaccharide to Asn in N-glycosylation are the best studied. The structures of oligosaccharyltransferase (OST) from several species (bacterial PglB, archaeal AglB, the yeast OST complex, and human OST) have been determined (Lizak et al. 2011; Matsumoto et al. 2013; Wild et al. 2018; Bai et al. 2018; Ramirez et al. 2019). The structure of the yeast PMT complex that catalyzes the transfer of mannose to Ser/Thr (Bai et al. 2019) and the structure of the bacterial ArnT that catalyzes the attachment of the cationic sugar 4-amino-4-deoxy-L-arabinose to lipid A have also been published (Petrou et al. 2016). Most recently, the structure of a mycobacterial lipid glycosyltransferase, arabinofuranosyltransferase D (AftD), has been described (Tan et al. 2019). Interestingly, although these enzymes have totally different functions, their core structures share the same fold. This may indicate that all GT-C enzymes share a common fold, as GT-A and GT-B members do, although this still needs to be confirmed.

Table 6.1 Major known GT-C fold glycosyltransferases

This chapter focuses on advances over the past two years in the structural understanding of the GT-C superfamily enzymes that catalyze protein glycosylation. We will introduce the structures of bacterial PglB, archaeal AglB, yeast and mammalian OSTs, and yeast PMT, and discuss their contribution to our understanding of the mechanisms and selectivity of protein glycosylation.

The Prokaryotic Protein N-glycosyltransferases: PglB and AglB

PglB and AglB are homologs of Stt3, the catalytic subunit of the eukaryotic oligosaccharyltransferase. They are present in eubacteria and archaea, respectively, and they glycosylate Asn (Matsumoto et al. 2013; Lizak et al. 2011). While eukaryotic OST acts on the luminal face of the ER membrane, these prokaryotic homologs act on the periplasmic face of the plasma membrane (Helenius and Aebi 2004; Matsumoto et al. 2013; Lizak et al. 2011). Most structural studies have focused on PglB from Campylobacter lari (Lizak et al. 2011; Napiorkowska et al. 2018, 2017), so this protein will be the focus of this section, unless otherwise noted. Like other members of the GT-C superfamily, PglB is an integral membrane protein (Lizak et al. 2011; Liu and Mushegian 2003). Crystal structures of this enzyme revealed that it has thirteen transmembrane helices (TMHs) with elongated periplasmic loops between TMH1 and 2 (EL1) and TMH9 and 10 (EL5) as well as a globular C-terminal periplasmic domain (Fig. 6.1a) (Lizak et al. 2011). EL1 is primarily involved in binding the periplasmic domain, while EL5 is flexible and contributes to binding the periplasmic domain and the donor and acceptor substrates. The flexibility of EL5 is likely important for substrate binding and product release. Consistent with this, the formation of a disulfide bond between EL5 and the transmembrane domain decreases activity for the natural lipid-linked oligosaccharide (LLO) substrate but not for a smaller synthetic LLO having only a single sugar.

Fig. 6.1

Structure of the prokaryotic protein N-glycosyltransferase PglB. a PglB bound to an acceptor peptide and an inhibitory LLO (PDB ID 5OGL), colored by region: blue for the TM region, yellow for EL1, orange for EL5, and green for the periplasmic domain. Substrate carbons are gray and the Mn atom is pink. The areas in the dashed black and red rectangles are enlarged in panels d and e, respectively. b, c Electrostatic surface of the substrate binding site of PglB in top b and side c views. The acceptor Asn and the LLO bind on opposite sides of the enzyme; red represents negative charge, blue represents positive charge, and white represents neutral. d Residues involved in recognition of the acceptor peptide NxT consensus site (PDB ID 5OGL). e Residues involved in binding the LLO (PDB ID 5OGL)

PglB and other OSTs transfer an oligosaccharide from an isoprenoid lipid carrier to the nitrogen of an Asn side chain (Helenius and Aebi 2004). The acceptor peptide and donor LLO substrates bind between the periplasmic and transmembrane domains, and their binding pockets are connected by a small channel that the acceptor Asn side chain pokes through (Fig. 6.1b, c) (Lizak et al. 2011; Napiorkowska et al. 2018, 2017). The catalytic residues reside on the LLO side of the enzyme while the residues that recognize the S/T of the NxS/T glycosylation consensus site (and, in bacteria, the D/E of the longer D/ExNxS/T consensus site) are located on the opposite side of the channel.
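The two consensus sites named above can be illustrated with a short sequence scan. The following Python sketch is only an illustration, not a glycosylation predictor: the function name find_acceptor_asn and the example sequence are hypothetical, and the exclusion of Pro at the x positions is a commonly used convention that is assumed here rather than taken from the text.

```python
import re

# N-x-S/T sequon. Excluding Pro at the x position is an assumption
# (a widely used convention), not something stated in the chapter.
EUKARYOTIC_SEQUON = re.compile(r"(?=(N[^P][ST]))")

# Longer bacterial consensus D/E-x-N-x-S/T mentioned in the text,
# again with Pro excluded at both x positions (assumption).
BACTERIAL_SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_acceptor_asn(sequence, bacterial=False):
    """Return 1-based positions of candidate acceptor Asn residues.

    The lookahead patterns allow overlapping sequons to be reported.
    A match only means the motif is present; it says nothing about
    whether the site is actually glycosylated.
    """
    if bacterial:
        pattern, asn_offset = BACTERIAL_SEQUON, 2  # Asn is the 3rd residue
    else:
        pattern, asn_offset = EUKARYOTIC_SEQUON, 0  # Asn is the 1st residue
    seq = sequence.upper()
    return [m.start() + asn_offset + 1 for m in pattern.finditer(seq)]


# Hypothetical toy sequence containing DQNAT, NGS, and NPT.
toy = "AADQNATAANGSAANPT"
print(find_acceptor_asn(toy))                  # N-x-S/T candidates
print(find_acceptor_asn(toy, bacterial=True))  # D/E-x-N-x-S/T candidates
```

In the toy sequence, NAT and NGS both satisfy the shorter motif, whereas only the Asn preceded by D-x satisfies the longer bacterial motif, and NPT is skipped under the assumed Pro exclusion.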

Crystal structures of PglB have been determined in complex with peptide only, peptide and inhibitory LLO, and LLO and inhibitory peptide substrates (Lizak et al. 2011; Napiorkowska et al. 2018, 2017). These structures revealed that the conserved WWD motif recognizes the hydroxyl of the acceptor S/T with three hydrogen bonds and that the Thr methyl has a further van der Waals interaction with Ile (Fig. 6.1d). In the active site, a divalent cation (here Mn2+) is coordinated by Asp and Glu residues contributed by EL1, the short periplasmic loop EL2, and EL5 (Fig. 6.1a, d). At least one of these residues also interacts with the acceptor Asn (Fig. 6.1d). The LLO pyrophosphate is bound by the transmembrane domain with two contributions from EL5, and the N-acetylglucosamine (GlcNAc) at the base of the transferred oligosaccharide is bound by a conserved residue in the periplasmic domain and the metal-coordinating Asp residue from EL1 (Fig. 6.1e). In the catalytically competent binding conformation, the pyrophosphate may coordinate the divalent cation.

These structures provide clues about the catalytic mechanism. PglB and other OST catalytic subunits transfer the oligosaccharide with an inversion of the stereochemistry of the anomeric carbon bonded to the lipid carrier. Therefore, the Asn side chain nitrogen likely attacks the anomeric carbon by an SN2-type mechanism (Lizak et al. 2011). However, the amide nitrogen is typically a poor nucleophile, because the nitrogen’s lone pair is conjugated with the carbonyl and thus has pi bond character. One suggested mechanism involves twisting the amide to break this double-bond character and free the lone pair (Lizak et al. 2011). Another proposed mechanism focuses on the transformation of the anomeric carbon into a reactive electrophile via the electron-withdrawing action of the divalent cation, which is near (4.1 Å) the glycosidic oxygen in the structure of PglB bound to reactive LLO and inhibitory peptide (Napiorkowska et al. 2018). So far, no experimental evidence strongly favors either of these mechanisms.

Eukaryotic Protein N-glycosyltransferase: The OST Complexes

Unlike the prokaryotic OST, which consists of a single catalytic subunit, the yeast OST is composed of eight integral membrane proteins (Wild et al. 2018; Bai et al. 2018). The yeast OST has two isoforms: one contains Ost3 and associates with the Sec61 translocon, and the other contains Ost6 and associates with the Ssh1 translocon (Yan and Lennarz 2005). The structure of the Saccharomyces cerevisiae OST complex containing Ost3 has been determined by cryo-EM both solubilized in detergent (Bai et al. 2018) and reconstituted in nanodiscs (Wild et al. 2018) (Fig. 6.2a, b). The two structures are quite similar, with a root mean squared deviation (RMSD) of only 1.8 Å (Bai and Li 2019). In agreement with previous biochemical data, the structures showed that OST is divided into three subcomplexes that pack loosely in their transmembrane regions, and several ordered lipids were observed that appear to hold the OST complex together (Fig. 6.2a, b) (Bai et al. 2018; Wild et al. 2018; Mueller et al. 2015; Karaoglu et al. 1997). The three subcomplexes comprise Ost1 and Ost5; the catalytic subunit Stt3, Ost3, and Ost4; and Ost2, Wbp1, and Swp1 (Karaoglu et al. 1997; Mueller et al. 2015; Bai et al. 2018; Wild et al. 2018).
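For readers unfamiliar with the RMSD values quoted in this chapter, the quantity can be sketched in a few lines of Python. This is a minimal illustration that assumes the two structures have already been superimposed and that their atoms (for example, Cα atoms) are paired one-to-one; the optimal superposition step performed by structural biology software is deliberately omitted. The function name rmsd and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between paired 3D coordinates.

    Assumes the two coordinate lists are already superimposed and
    that coords_a[i] corresponds to coords_b[i]. Units follow the
    input (Angstroms for typical structure files).
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom displaced by exactly 1 unit along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))  # 1.0
```

Because every paired distance contributes as a square, a few poorly matching regions can dominate the value, which is one reason RMSDs are usually reported for a defined set of aligned atoms.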

Fig. 6.2

Cryo-EM structure of the S. cerevisiae OST complex. The eight-protein complex is colored by subunit (PDB ID 6C26). Lipid and glycan carbons are gray and red, respectively. a, b Front and back views, with the luminal domains at the top. c Overlay of Stt3 (blue) and PglB (colored as in Fig. 6.1). d Residues involved in recognition of the acceptor peptide NxT and the donor DLO (PDB ID 5OGL and 6C26). Residues in blue are from Stt3, and those in black from PglB

Although we know the atomic structures of the OST complex, the functions of many of the proteins are not well understood. The only protein with a well-established function is Stt3, the catalytic subunit. A comparison between PglB and Stt3 shows that the overall fold and catalytic residues are well conserved (Fig. 6.2c, d) (Wild et al. 2018; Bai et al. 2018), so their catalytic mechanisms are most likely the same. The luminal domains of Stt3, PglB, and AglB are more divergent than the transmembrane domains, with different insertions into a conserved core fold (Matsumoto et al. 2013).

We have clues about the roles of the other subunits, however, based on the structures of the yeast OST and of the mammalian OST in complex with the translocon and ribosome (Pfeffer et al. 2014; Braunger et al. 2018), as well as data from functional studies. Docking of the yeast OST structures into the cryo-ET map of the mammalian ribosome–translocon–OST supercomplex has indicated that Ost3 (or Ost6 in the other yeast isoform) directly interacts with the translocon, which agrees with previous data showing that Ost3 and Ost6 bind the Sec61 and Ssh1 translocons, respectively (Wild et al. 2018; Bai et al. 2018; Yan and Lennarz 2005). In addition, a phospholipid bound in a putative LLO-binding hydrophobic groove between Stt3 and Ost3 may indicate a role for Ost3 in LLO binding (Bai et al. 2018). The flexibility of Stt3 TMH9 and Ost3 TMH1, which were the only transmembrane helices not resolved in the complex, may also indicate a role in LLO binding (Bai et al. 2018; Wild et al. 2018). Finally, the luminal domain of Ost3, which was also not resolved, is known to be an oxidoreductase homologous to thioredoxin, and it is likely well positioned to help feed the nascent peptide from the Sec61 channel to the Stt3 active site (Bai et al. 2018; Schulz et al. 2009).

Other proteins in the complex appear to have primarily structural or scaffolding roles. Ost2 attaches the rest of its subcomplex to the transmembrane domain of Stt3 (Bai et al. 2018; Wild et al. 2018). Ost4 binds and likely stabilizes the transmembrane domain of Stt3, while Ost5 binds and likely stabilizes Ost1 (Wild et al. 2018; Bai et al. 2018). Other important elements for maintaining the integrity of the complex include the elongated C-terminal tail of Stt3 and the ordered glycan bonded to N539 of Stt3, both of which contact Wbp1 and Swp1 (Bai et al. 2018; Wild et al. 2018).

Finally, comparison to structures of known function allows speculation about the functions of the luminal domains of Ost1, Wbp1, and Swp1. On the basis of structural similarity to a noncatalytic domain of an aminopeptidase, the Ost1 luminal domains were proposed to capture the glycosylated peptide product to prevent reentry into the Stt3 active site (Bai et al. 2018). In addition, structural similarity suggested that the luminal domains of Wbp1 and Swp1 have a role in binding the LLO glycan (Bai et al. 2018). Further biochemical and structural studies are needed to determine the functions of these subunits and to elucidate the mechanisms of glycan binding and specificity.

Unlike yeast, mammals have two forms of the catalytic Stt3 subunit (STT3A and STT3B), leading to two OST complexes, OST-A and OST-B (Ruiz-Canada et al. 2009). OST-A binds the translocon and glycosylates peptides as they pass into the ER; OST-B acts as a proof-reader for NxS/T sites missed by OST-A (Ruiz-Canada et al. 2009; Shrimal et al. 2017; Cherepanova et al. 2014). A near-atomic-resolution structure of the mammalian OST-A bound to the Sec61 translocon and the ribosome showed that OST docks to the translocon 6.5 nm from the Sec61 lateral gate that releases transmembrane helices into the ER membrane (Braunger et al. 2018). This explained why NxS/T sites are often missed when they are near transmembrane helices (Braunger et al. 2018; Nilsson and von Heijne 1993). In mammalian OST-A and OST-B, Ost3/6 are replaced by DC2 and KCP2 or by TUSC3 and MAGT1, respectively (Shrimal et al. 2017; Cherepanova et al. 2014). DC2 was found to mediate the binding of STT3A to the translocon, and the helices of STT3A that interfaced with DC2 are different in STT3B (Braunger et al. 2018). Moreover, ribophorin-I (homologous to yeast Ost1) binds the ribosome via a C-terminal cytosolic domain. The most recent cryo-EM structures of human OST-A and OST-B reveal that they are highly similar (RMSD of less than 1 Å). The ribosome-interacting C-terminal domain of ribophorin-I is an ordered four-helix bundle in isoform A, but the corresponding domain in isoform B is disordered, explaining why only OST-A binds to the ribosome for co-translational protein N-glycosylation (Ramirez et al. 2019).

Eukaryotic Protein O-mannosyltransferases: Pmt1 and Pmt2

Pmt proteins are GT-C superfamily glycosyltransferases that attach mannose to Ser or Thr (Loibl and Strahl 2013). Like OST, Pmt proteins act at the luminal face of the ER membrane and transfer mannose from a lipid carrier (Loibl and Strahl 2013; Helenius and Aebi 2004). Pmt proteins can be separated into three classes: PMT1, PMT2, and PMT4 (Loibl and Strahl 2013; Neubert and Strahl 2016). Animals encode only a single member each from the PMT2 and PMT4 classes and no PMT1 (Loibl and Strahl 2013). In yeast, PMT1 enzymes form heterodimers with PMT2 enzymes, and PMT4 enzymes homodimerize, but in animals PMT2 (POMT2) dimerizes with PMT4 (POMT1) (Loibl and Strahl 2013; Neubert and Strahl 2016). Recently, the atomic resolution structure of the Pmt1–Pmt2 heterodimer from Saccharomyces cerevisiae was solved by cryo-EM (Fig. 6.3a) (Bai et al. 2019).

Fig. 6.3

Cryo-EM structure and the catalytic site of the S. cerevisiae Pmt1–Pmt2 complex. Subunits are colored by region. Pmt1 and Pmt2 are sky blue and forest green, respectively. Carbons of donor DLO and acceptor peptide are orange and red, respectively. a Model of the Pmt1–Pmt2 heterodimer (PDB ID 6P25). b Overlay of Pmt1 and Pmt2. The area in the dashed black rectangle is enlarged in panel c. c Residues involved in binding acceptor peptide and dol-P product. d Overlay of the Pmt1 and PglB TM regions

The structures of Pmt1 and Pmt2 are highly similar, with their transmembrane domains overlaying with an RMSD of 1.8 Å and their luminal MIR domains (named for their presence in mannosyltransferases, inositol trisphosphate receptors, and ryanodine receptors) overlaying with an RMSD of 1.6 Å. The transmembrane domains each comprise 11 transmembrane helices with elongated luminal loops between TMH1 and 2 (LL1) and between TMH7 and 8 (LL4) (Fig. 6.3a). LL4 contains the MIR domain in each subunit, and these domains form a β-trefoil. The MIR domain of Pmt1 interacts with LL4 of Pmt2, but the MIR domain of Pmt2 was detached and disordered in the cryo-EM map. The structure of this domain was solved by X-ray crystallography and docked into the cryo-EM map (Bai et al. 2019). Interactions between the subunits occur primarily in the cytosolic and luminal regions, leaving a large gap between Pmt1 and Pmt2 in the transmembrane region, which may allow facile diffusion of the lipid-linked mannose into the active sites. The attachment of the Pmt1 MIR domain to Pmt2 may block entry on this side. The functional importance of the MIR domains remains to be established, but they may bind to chaperones (Loibl and Strahl 2013).

The structure of the heterodimer was solved with and without a tetrapeptide acceptor substrate (PYTV, where T is the glycosylated Thr) bound to Pmt2 (Bai et al. 2019). In both structures, an elongated electron density was associated with Pmt1 and modeled as dolichol phosphate (dol-P), the lipid carrier. Overlaying the structures of Pmt1 and Pmt2 allowed the ternary complex of either to be modeled and revealed the residues involved in binding the dol-P phosphate as well as the acceptor Thr (Fig. 6.3b, c). Importantly, an invariant DE motif was observed to be involved in acceptor binding, with the Asp positioned to activate the Thr nucleophile and Glu stabilizing the position of the motif by formation of a salt bridge with a conserved Arg.

The Pmt1 and Pmt2 structures also overlaid reasonably well with PglB and Stt3 (Fig. 6.3d), particularly in the transmembrane domains (RMSD of 3.2 Å between Pmt1 and PglB), even though PglB and Stt3 have two more transmembrane helices (Bai et al. 2019). In addition, the membrane proximal helices in EL1 of PglB are similar to those in LL1 of Pmt1 and Pmt2, and the membrane proximal helix in LL4 of Pmt1 and Pmt2 is similar to the helix in EL5 of PglB when LLO was bound. The MIR domains of Pmt1 and Pmt2 and the C-terminal periplasmic domain of PglB are not similar. Comparison to PglB showed that the metal-binding residues of PglB are conserved in Pmt1/2, though no metal was observed bound to either subunit of the heterodimer. Moreover, the substrates of Pmt1/2 are bound similarly to those of PglB. Because Pmt1, Pmt2, and PglB are all inverting glycosyltransferases in the GT-C superfamily, the structural and mechanistic similarities are not unexpected. However, the nature of the Asn nucleophile of PglB may require specialized activation, as mentioned above (Lizak et al. 2011).

Summary

While structural information about the GT-C superfamily has long been lacking in comparison to the GT-A and GT-B superfamilies of glycosyltransferases, improvements in the crystallography of membrane proteins and recent improvements in the resolution routinely accessible by cryo-EM have begun to close that gap. PglB/AglB/Stt3 and Pmt1/2 represent two GT-C families (https://www.cazy.org). Determining the structures of GT-Cs from other families will allow us to identify elements that are conserved across all GT-Cs and elements specific to certain families. Understanding these similarities and differences may aid in the development of drugs specific to particular GT-Cs present in human pathogens or implicated in human diseases. The structures discussed here have greatly improved our understanding of the mechanism of protein glycosylation, though questions remain. In particular, the functions of numerous noncatalytic subunits and domains present in the OST complex have yet to be firmly established.