
1 Introduction

Protein glycosylation is the most complex and diverse form of post-translational modification, leading to the formation of N-, O-, S- and C-glycosides, phosphoglycans and glypiated proteins (proteins that are covalently bonded to a glycosylphosphatidylinositol via their C-terminus) [1]. The most common glycan-protein linkages found in nature are formed using either the side chain amide nitrogen of asparagine (Asn) residues or the side chain hydroxyl group of serine (Ser) and threonine (Thr) residues to afford N-linked or O-linked glycoproteins, respectively [2]. Examples of C-mannosylation via the indole C-2 carbon atom of tryptophan (Trp) and S-glycosylation using the thiol of cysteine (Cys) are rare but have also been described in the literature (Fig. 1) [3].

Fig. 1 Selected examples of N-, O-, C- and S-glycan-peptide linkages found in nature

Glycosylation plays an important role in various biological processes including cell development  [4], inflammation  [5], cell–cell signalling, adhesion and immune responses  [6]. The presence of sugar moieties affects glycoprotein tertiary structures  [7, 8], facilitates folding  [9] and improves proteolytic stability  [9]. Altered glycosylation patterns affect the circulatory lifetime of glycoproteins  [10] and are associated with numerous diseases including some cases of congenital disorders  [11, 12], leukocyte adhesion deficiency II  [1], the aetiology of diabetes  [13] and neurodegeneration  [14, 15], cancer  [16, 17] and Alzheimer’s disease  [18, 19].

Glycosylation of peptides and proteins with the intention of improving the pharmacokinetic profile of protein-based drugs has resulted in rapid expansion of the therapeutic peptide and protein market [20,21,22,23,24,25,26]. Increased proteolytic stability has been achieved by glycosylation of glucagon-like peptide-1 (GLP-1) [22], insulin [27], exendin-4 [28] and interferon-β [29]. The composition of protein-bound glycans can modulate the efficacy of protein therapeutics; for example, it has been demonstrated on numerous occasions that multiply sialylated versions of erythropoietin (EPO) [30] possess longer plasma half-lives than their asialylated counterparts [10].

Robust and general strategies to prepare glycopeptides and glycoproteins in pure forms are highly sought after and have been investigated by many research groups  [31,32,33,34,35,36,37,38,39,40,41,42]. N-Linked glycoproteins are prevalent and their glycans encompass diverse structures. They have gained significant scientific attention, not least due to their potential as therapeutic agents and thus are the main focus of this mini-review which describes the use of enzymatic glycosylation to access N-linked glycopeptides  [24, 43].

2 Synthesis of Glycopeptides in Nature

The biosynthetic pathways of glycopeptide and glycoprotein synthesis in mammals are complex [2]. Briefly, the initial stage involves scavenging simple monosaccharide units, mostly glucose (Glc) but also galactose (Gal), mannose (Man) and glucosamine (GlcN), from the bloodstream by cells throughout the body using protein transporters (sodium-dependent co-transporters, SGLT, and sodium-independent facilitative transporters, GLUT) located in the plasma membrane of various tissues [2]. This is followed by intracellular de novo synthesis of additional sugar-based building blocks including fucose (Fuc), N-acetylneuraminic acid (NeuAc, sialic acid) and N-acetylgalactosamine (GalNAc) by chemical processes including epimerisation, condensation and acetylation [2]. Subsequent phosphorylation of the monosaccharide units and pairing with the corresponding nucleotides takes place in the cytosol to afford the energy-rich nucleotide sugars required for further glycan synthesis [2]. Organelle-specific transporter proteins traffic nucleotide sugars from the cytosol into the endoplasmic reticulum (ER) and Golgi lumens, where assembly of glycans of particular structures takes place. This complex process is mediated via the action of numerous transmembrane glycosyltransferases and glycosidases, the precise mechanisms of which are not yet fully understood [2, 24].

N-Linked glycosylation starts in the ER, where a dolichol phosphate oligosaccharide [Glc3Man9(GlcNAc)2] is formed and then transferred en bloc onto an Asn residue on a nascently translated protein via the action of oligosaccharyltransferase (OST) [2]. The glycosylated asparagine residue is invariably located within the conserved Asn-X-Thr/Ser peptide sequence (where X is any amino acid residue except proline) of the unfolded protein. The oligosaccharide is then processed by the enzymes glucosidase I and II, affording a truncated GlcMan9(GlcNAc)2 glycoprotein that interacts with calnexin and calreticulin and participates in the primary ‘quality control’ system distinguishing native from non-native protein conformations [44]. Subsequent removal of the Glc unit by glucosidase II releases the Man9(GlcNAc)2-tagged glycoprotein from the chaperone; if correctly folded, the glycoprotein can leave the ER. If the protein exhibits a non-native conformation, association with calnexin and calreticulin is renewed (via a GlcMan9(GlcNAc)2 unit containing a reattached Glc residue) and the ‘quality control’ process is repeated until the proper protein conformation is achieved. Permanently misfolded glycoproteins are eliminated from the ER for degradation [44].
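The Asn-X-Thr/Ser rule described above lends itself to a simple sequence scan. The following Python sketch is illustrative only: the function name `find_sequons` and its `offset` parameter are hypothetical, the input is assumed to be a one-letter amino acid string, and a match indicates only a potential N-glycosylation site, not actual site occupancy.

```python
import re

# Asn-X-Ser/Thr sequon where X is any residue except Pro.
# A lookahead is used so that overlapping sequons (e.g. N-N-S-T) are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str, offset: int = 0) -> list[tuple[int, str]]:
    """Return (residue number of Asn, tripeptide) for each potential sequon.

    `offset` is added to the 1-based position, so a fragment can be
    reported in the numbering of its parent protein (hypothetical helper).
    """
    seq = sequence.upper()
    return [
        (m.start() + 1 + offset, seq[m.start():m.start() + 3])
        for m in SEQUON.finditer(seq)
    ]


if __name__ == "__main__":
    # Toy sequence: N-P-S is rejected because X is proline.
    print(find_sequons("MNGTAANPSQNNSTK"))
    # Short fragment numbered from residue 79 of a parent protein.
    print(find_sequons("ALLVNSS", offset=78))
```

For a fragment such as ALLVNSS numbered from residue 79, the scan reports the Asn at position 83 of the parent sequence. A production tool would also need to handle non-standard residue codes and sequence gaps, which this sketch ignores.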

Removal of a terminal α(1-2)-linked mannose unit from either of the two arms of Man9(GlcNAc)2 subsequently takes place and is mediated by mannosidase I or II, affording a Man8(GlcNAc)2-bound protein that is then transported to the cis-Golgi apparatus for further processing. Within the Golgi lumen, the common intermediate Man5(GlcNAc)2 is formed by the action of mannosidases IA and IB, which remove α(1-2)-linked mannoses. Man5(GlcNAc)2 is then used to assemble the complex and hybrid subclasses of N-glycosides [2]. Partially processed glycans that were not trimmed to Man5(GlcNAc)2, or those that escape the remodelling process from Man5(GlcNAc)2 to complex and hybrid N-glycoproteins, fall into the high-mannose subclass of N-linked glycoproteins of the type Man(5–9)GlcNAc2 (Fig. 2) [45].

Fig. 2 High-mannose, complex and hybrid subclasses of N-glycosides containing the core Man3(GlcNAc)2 pentasaccharide [46]
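As a rough illustration of the high-mannose series Man(5–9)GlcNAc2 described above, the Python sketch below computes monoisotopic masses of the free (reducing-end) glycans from their composition. The residue masses are standard values for anhydro-hexose, anhydro-N-acetylhexosamine and water and are not taken from this review; the function name `high_mannose_mass` is hypothetical.

```python
# Monoisotopic masses in Da (standard values, assumed here for illustration).
HEX = 162.05282      # anhydro-hexose residue (Man, Glc, Gal), C6H10O5
HEXNAC = 203.07937   # anhydro-N-acetylhexosamine residue (GlcNAc), C8H13NO5
WATER = 18.01056     # H2O, added once for the free reducing-end glycan


def high_mannose_mass(n_man: int) -> float:
    """Monoisotopic mass of the free glycan Man(n)GlcNAc2 for n = 5..9."""
    if not 5 <= n_man <= 9:
        raise ValueError("high-mannose N-glycans carry 5 to 9 mannose units")
    return n_man * HEX + 2 * HEXNAC + WATER


if __name__ == "__main__":
    for n in range(5, 10):
        print(f"Man{n}GlcNAc2: {high_mannose_mass(n):.4f} Da")
```

Each trimming step by a mannosidase lowers the mass by one hexose residue, which is why members of the series are readily distinguished by mass spectrometry. When the glycan is attached to a peptide, the water term would be dropped and the peptide mass added instead.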

O-Linked glycosylation occurs in the Golgi apparatus and starts with the attachment of a monosaccharide unit (most often GalNAc) to a Ser or Thr residue present within the sequence of an already folded protein [2]. The subsequent formation of more complex oligosaccharide structures from the Ser/Thr(GalNAc) (Tn-antigen) core is then achieved via the sequential action of a variety of glycosyltransferases [2].

Protein glycosylation, unlike pure protein and oligonucleotide synthesis, is not template mediated and so depends on the activity and concentration of sugar substrates, the structural and conformational properties of glycosylation sites and the differential activities of numerous enzymes  [41]. This means that glycoproteins are invariably produced as heterogeneous mixtures, termed glycoforms, where proteins possessing the same peptide chain vary in glycan structure  [42, 47] and which may also vary in site occupancy. Access to well-defined homogeneous glycoproteins for subsequent structural and functional studies is, therefore, a challenging task.
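The scale of glycoform heterogeneity can be made concrete with a small counting exercise. The Python sketch below enumerates every combination of glycan and site occupancy for a set of glycosylation sites. It is deliberately simplified: it assumes that each site can carry any glycan from the same pool independently of the others, which real biosynthesis does not guarantee, and the glycan labels and the function name `enumerate_glycoforms` are hypothetical.

```python
from itertools import product


def enumerate_glycoforms(sites, glycans, allow_unoccupied=True):
    """List all site-to-glycan assignments for one peptide chain.

    `None` marks an unoccupied site when `allow_unoccupied` is True.
    Assumes independent sites sharing one glycan pool (a simplification).
    """
    options = list(glycans) + ([None] if allow_unoccupied else [])
    return [dict(zip(sites, combo)) for combo in product(options, repeat=len(sites))]


if __name__ == "__main__":
    sites = ["Asn24", "Asn38", "Asn83"]              # three N-glycosylation sites
    glycans = ["high-mannose", "hybrid", "complex"]   # hypothetical pool of three glycans
    forms = enumerate_glycoforms(sites, glycans)
    print(len(forms))   # (3 glycans + unoccupied) ** 3 sites
    print(forms[0])
```

Even with only three sites and three candidate glycans the count reaches (3 + 1)^3 = 64 possible glycoforms, which illustrates why isolating a single homogeneous glycoform from a biological mixture is difficult.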

Literature methods to prepare homogeneous glycopeptides and glycoproteins include the use of recombinant technology, fully synthetic techniques using chemical methods, enzymatic approaches and combinations of these. Each of these methodologies has its own advantages and limitations, and in-depth discussion and progress to date have been summarised in a number of recent reviews [32, 33, 35, 36, 41, 42]. Our own endeavours towards the synthesis of glycosylated peptides and proteins with natural and non-natural glycan-peptide linkages began in 2008 and were mainly focused on using fully synthetic techniques [48,49,50,51,52,53,54,55,56]. Our interest is now directed at combining synthetic techniques with chemoenzymatic methods to achieve the convergent synthesis of complex glycopeptides [57, 58], and this is the primary focus of the current report. A comprehensive review on the topic, comprising elegant examples of chemoenzymatic approaches by other research groups, has recently been published by Wang and Amin [33].

3 Recombinant Approach to Access Glycopeptides and Glycoproteins

Recombinant methods using fungal, plant or insect-based systems to produce N-glycoproteins seem promising but suffer from several limitations: heterogeneous products are invariably obtained, glycosylation patterns differ between species, and batch-to-batch variability is observed [24]. Expression systems based on bacteria are restricted to the synthesis of non-glycosylated proteins (insulin, for example) owing to the absence of glycosylation machinery [24]. To date, mammalian cell lines, typically derived from Chinese hamster ovary (CHO) cells, have been extensively used for the production of therapeutic glycoproteins due to their potential to produce certain human-like glycosylation patterns [24]. Other mammalian systems used for the synthesis of N-glycoproteins include baby hamster kidney (BHK-21) and murine myeloma (NS0 and Sp2/0) cells. Their use, however, is limited; BHK cell lines, like CHO cells, are not able to produce the α(2,6)-linked terminal sialic acids present in human glycans, and immunogenicity concerns are associated with the use of murine myeloma cells [24]. Nevertheless, recombinant methods using mammalian cell lines are routinely used to produce marketed therapeutic monoclonal antibodies [59]. Adalimumab (Humira®), the world's best-selling drug in 2015 [60], used for the treatment of rheumatoid arthritis [61], is expressed in CHO cell lines; Golimumab (Simponi®), used as an immunosuppressant [62], is expressed in Sp2/0 cells [59].

Human cell lines derived from embryonic kidney (HEK293), embryonic retinoblasts (PER.C6) or hybrid HKB11 cell lines composed of embryonic kidney cells (293S) and modified Burkitt’s lymphoma cells (2B8) are attractive but expensive alternatives to CHO cells [63]. Significant scientific focus has therefore been directed into engineering effective and more straightforward yeast-based systems [64, 65]. It was found that ‘humanized’ Pichia pastoris-derived cell lines can be used to express homogeneous human N-linked glycoproteins bearing truncated complex [(GlcNAc)2Man3(GlcNAc)2] units [65] and full-length complex sialylated glycans [64]. These studies have opened up further possibilities to access therapeutic glycoproteins via recombinant techniques using alternative yeast-derived expression systems [64, 65]. However, complete control of glycosylation, which is required to obtain strictly homogeneous glycoproteins via recombinant methods, is still challenging [66].

4 The Use of Chemical Synthesis to Access Homogeneous Glycopeptides and Glycoproteins

The use of synthetic techniques to counter challenges faced in the preparation of homogeneous glycopeptides and glycoproteins is under investigation by several research laboratories worldwide [37,38,39,40, 56, 67,68,69,70,71]. Solid-phase peptide synthesis (SPPS) is the method of choice for the preparation of glycopeptides [32]. Despite the extensive development that SPPS has undergone [72] since its discovery [73], it is still limited to the preparation of glycopeptides of up to 30 to 50 residues that carry relatively small oligosaccharide units. However, the combination of SPPS and ligation techniques, particularly native chemical ligation (NCL) [74] or expressed protein ligation (EPL) [75], has enabled synthetic access to more complex structures including large glycoproteins [35, 36]. The synthesis of 40- and 80-amino acid MUC1 glycoproteins bearing eight GalNAc units at the corresponding Thr11 and Thr19 of the tandem repeats was accomplished using a combination of 9-fluorenylmethoxycarbonyl (Fmoc) SPPS and a serine/threonine ligation technique [76,77,78]. The first total synthesis of glycocin F, an antimicrobial 43 amino acid glycopeptide with two βGlcNAc moieties at Ser18 and Cys43, was successfully undertaken by Brimble et al. [56]. The synthetic protocol involved initial Fmoc SPPS of three glycocin F fragments incorporating O- and S-linked GlcNAc units using either Fmoc-Ser[GlcNAc(OAc)3] or Fmoc-Cys[GlcNAc(OAc)3] building blocks, respectively, accessed via total synthesis. Native chemical ligation [74] was then used to join these fragments, followed by oxidative folding to afford the desired C-amidated glycocin F [56]. Many other examples of the synthesis of larger N-glycoproteins using ligation techniques have been reported in the literature [35, 36]. Notable examples include the synthesis of 166 amino acid interferon-β-1a [79], the 72 amino acid glycosylated analogue of interleukin-8 which was used in folding studies [80], the hydrophobic glycoprotein saposin C (80 amino acids) [81], and the 124 amino acid bovine ribonuclease (RNase) C accessed via semisynthetic methods (EPL and NCL) [82] or total synthesis and NCL [83].

In general, two approaches may be used to generate a linkage between an oligosaccharide and a peptide chain. The so-called ‘linear approach’ involves initial preparation of a glycosylated amino acid building block, which is then incorporated into a growing resin-bound peptide chain that is then typically extended using SPPS [32, 38, 84]. A major limitation of this technique is the significant effort required to prepare suitably protected carbohydrate-bearing building blocks in sufficient quantities for the subsequent coupling steps. Furthermore, the attachment of complex protected oligosaccharides to amino acid residues generates significant steric hindrance, which may diminish the effectiveness of the peptide coupling steps during SPPS, leading to by-product formation. Therefore, only short- to medium-sized glycopeptides, carrying relatively small (and typically O-linked) oligosaccharide units, can be accessed via the linear approach [33, 38, 42]. Nonetheless, this linear approach has been employed for the synthesis of antitumour vaccine candidates based on mucin glycopeptide antigens [68, 85, 86], and for the synthesis of mannosylated peptides as components for synthetic vaccines [48]. Other literature examples that have adopted the linear strategy include the synthesis of fluorescent glycopeptides as biological probes [87, 88], and the synthesis of analogues of antifreeze glycoproteins [53] in addition to others reviewed elsewhere [84]. Kajihara et al. have employed the linear strategy to synthesise N-linked glycopeptides including a 79–85 fragment of EPO (ALLVNSS) bearing a complex biantennary sialyloligosaccharide [89], an EPO (85–95) fragment with two different glycans (asialo- and sialyloligosaccharides) attached [90], ligation partners for subsequent NCL to construct the full-length EPO mutants bearing one, two or three biantennary sialyloligosaccharides [91, 92] and a 38 amino acid cytotoxic T-lymphocyte-associated protein-4 (CTLA-4) fragment 113–150 with two complex-type undecadisialyloligosaccharides [93]. The glycopeptide thioesters bearing an N-linked biantennary complex-type nonasaccharide unit required for subsequent ligation (EPL and NCL) to afford RNase C were synthesised using the linear strategy by the group of Unverzagt [82, 83].

An alternative and convergent strategy involves the initial synthesis of a peptide chain followed by the direct attachment of a carbohydrate unit either on-resin or in-solution  [32, 38, 40]. This technique is widely applicable for the preparation of N-linked glycopeptides where glycosylamines are attached to a pre-assembled peptide via the side chain carboxylic acid of embedded aspartic acid (Asp) residues using the Lansbury aspartylation  [94, 95]. A notable example that employed this convergent strategy to access an N-linked glycoprotein was reported by Danishefsky et al.  [69, 96, 97]. It involved the total synthesis of the 166 amino acid fully glycosylated homogeneous erythropoietin, containing N-linked branched dodecasaccharides at Asn24, Asn38 and Asn83, and an O-linked glycan at Ser126  [69, 96, 97]. A combination of Fmoc SPPS, NCL [74], O-mercaptoaryl ester rearrangement  [96] and metal-free desulfurisation  [98] was used to deliver the synthetic target. The highly complex N-linked oligosaccharides were prepared by total synthesis and introduced using a convergent selective amidation of the corresponding aspartic acid residue using a ‘one flask’ aspartylation approach  [94, 95, 99, 100]. Further examples of the synthesis of glycopeptides have also been reported  [32, 33, 36, 38, 42].

5 Enzymatic Approach to Access Homogeneous Glycopeptides and Glycoproteins

Enzymatic approaches to the glycosylation of peptides and proteins are increasing in popularity due to their simplicity and excellent stereo- and regiochemical control [101]. This technique may employ a variety of enzymes including glycosidases, glycosyltransferases and glycosynthases to generate desired oligosaccharide structures and sugar-peptide linkages [101]. Glycosidases are responsible for glycosidic bond hydrolysis and catalyse the breakdown of either terminal sugar units, from their non-reducing end, or internal glycosidic bonds (exo- and endo-glycosidases, respectively) [101]. Glycosyltransferases catalyse the formation of specific glycosidic linkages by transferring monosaccharide units from glycosyl donor substrates to corresponding acceptors [101]. Mutant glycosidases (commonly known as glycosynthases) [102,103,104] may be used for glycopeptide and glycan synthesis. Glycosynthases have the ability to transfer activated oligosaccharides onto acceptors and, unlike glycosidases, possess little or no hydrolytic activity [101].

Endoglycosidases acting on glycan chains of glycoproteins can be further divided into two classes: those which hydrolyse the core region of N-linked oligosaccharides embedded within the glycoprotein chain, and those which recognise O-linkages between the sugar and the protein [105]; endo-β-N-acetylglucosaminidase (Endo-β-GlcNAc-ase, ENGase, endohexosaminidase) and endo-β-N-acetylgalactosaminidase are representative examples from each class of endoglycosidase, respectively [105]. To date, sequence analysis of the most synthetically useful ENGase enzymes employed for the synthesis of glycoproteins has led to their classification in the carbohydrate-active enzymes (CAZy) database [106] as members of either family 18 or family 85 of the glycoside hydrolases (GH18 or GH85) [107]. The family GH18 enzymes are mostly derived from bacteria and fungi, while GH85 enzymes can be found in organisms ranging from bacteria to mammals [107].

In addition to their hydrolytic activity, endoglycosidases can also effectively transfer oligosaccharides onto corresponding hydroxyl-containing substrates by transglycosylation, or glycosylation. This dual capability makes endoglycosidases, especially the ENGases, valuable tools for the convergent synthesis of oligosaccharides and glycoconjugates  [105, 108].

6 Wild-Type Enzymes and Peptide and Protein Glycosylation

A number of Endo-β-GlcNAc-ases have attracted scientific attention  [107] due to their synthetic potential, and examples from family GH85 include Endo M (from Mucor hiemalis)  [109,110,111,112,113,114,115,116,117,118,119,120], Endo A (from Arthrobacter protophormiae)  [117, 121,122,123,124,125,126,127,128], Endo D (from Streptococcus pneumoniae)  [129, 130], Endo OM (from Ogataea minuta)  [107] and Endo-BH (from Bacillus halodurans)  [131]. Selected examples of ENGases from family GH18 are Endo H (from Streptomyces griseus) [132], Endo S (from Streptococcus pyogenes)  [133, 134] and Endo F1, F2 and F3 (from Flavobacterium meningosepticum)  [135,136,137]. All ENGases cleave the N,N′-diacetylchitobiose unit [GlcNAcβ(1-4)GlcNAc], a common motif present within Asn-linked high-mannose, complex and hybrid N-glycans  [105]. Enzyme hydrolysis is substrate specific; Endo H, Endo F1 and Endo A only act on high-mannose and hybrid N-glycans, while Endo F3 also recognises bi- and tri-antennary complex glycans  [43].

The synthetic ability of ENGases can, therefore, be used for the preparation of homogeneous glycoproteins in an enzyme-mediated glycoprotein remodelling approach (Scheme 1a) [33, 34, 43, 105, 138]. This approach involves initial enzyme-mediated hydrolysis of the β(1-4) bond of the [GlcNAcβ(1-4)GlcNAc] unit, affording a GlcNAc-tagged protein. Subsequent reattachment of another glycan to this GlcNAc acceptor using the transglycosylation activity of an ENGase then gives a glycoprotein with a desired glycan structure [43, 105].

Scheme 1 (a) Glycoprotein remodelling approach and (b) convergent chemoenzymatic synthesis of glycopeptides using ENGases [43]

The protein remodelling approach can be extended to the convergent, chemoenzymatic synthesis of glycopeptides. Herein, the synthesis of a peptide chain incorporating a GlcNAc moiety as a sugar acceptor is performed first, and is then followed by enzymatic attachment of N-glycans using ENGases (Scheme 1b)  [43]. However, the relatively low yields of products obtained during the process and hydrolytic activity of endoglycosidases often diminish their synthetic potential  [43, 139].

7 Strategies to Improve Enzymatic Glycosylation

Approaches to effect improvements in ENGase catalysed glycosylation processes include the use of N-glycan oxazolines as activated sugar donors in combination with mutant variants of ENGases with altered activity towards product hydrolysis  [43, 140,141,142,143,144,145,146,147,148,149,150].

The successful use of an oxazoline-activated disaccharide was first reported in 2001 by Shoda et al.  [151]. The Manβ(1-4)GlcNAc-oxazoline (1) was successfully used as a glycosyl donor in an Endo M  [109]- and Endo A  [121]-mediated glycosylation reaction using GlcNAcOpNP (2) as the acceptor, affording the corresponding trisaccharide 3 (Scheme 2)  [151]. The activity of oxazolines as donors for glycosylation processes catalysed by ENGases is related to the structural and functional similarities they share with oxazolinium ions, which are high energy intermediates in the enzymatic hydrolysis  [151].

Scheme 2 Oxazoline use for ENGase mediated glycosylation by Shoda et al. [43, 151]

The use of sugar oxazolines as activated sugar donors in ENGase catalysed glycosylation has attracted the attention of many research laboratories worldwide [130, 133, 139,140,141,142,143, 152,153,154,155,156,157,158,159,160,161,162,163,164,165]. The power of ENGase mediated glycosylation using sugar oxazolines was recently demonstrated by Fairbanks et al. [166], who reported the first synthesis of a phosphorylated glycoprotein bearing a phosphorylated pentasaccharide unit attached via a native N-linkage using the protein remodelling approach. The tetrasaccharide oxazoline, in which two terminal mannose residues contained a phosphate at the 6-position, was accessed via chemical synthesis and attached to GlcNAc-tagged RNase B using Endo A [117, 121,122,123,124,125,126,127,128] glycosylation activity [166].

The extensive research on the utility of saccharide oxazolines for ENGase facilitated glycosylation of peptides and proteins [130, 133, 139,140,141,142,143, 152,153,154,155,156,157,158,159,160,161,162,163,164,165] revealed, however, that only smaller natural sugar oxazolines (di- and tri-saccharides) proved to be effective in glycosylation reactions using wild-type enzymes  [139]. When larger, natural N-oligosaccharide oxazolines were used, significant reductions in glycosylation yields were observed due to competitive hydrolysis  [139].

One potential solution to this problem might be the use of structurally modified sugar oxazolines. In principle, these highly activated glycan donors should be readily processed by ENGases to afford glycosylation products that may not be substrates for hydrolytically active enzymes due to structural changes  [140, 142].

Fairbanks et al.  [140] showed that chemically synthesised di-, tri-, tetra- and hexasaccharide-oxazolines derived from the core sections of N-linked high-mannose glycans, containing a glucose moiety in place of a central mannose unit [Manβ(1-4)GlcNAc to Glcβ(1-4)GlcNAc] were substrates for Endo M  [109] and Endo A  [121], but not Endo H  [132], and could be used to effect irreversible glycosylation. However, this approach only produces non-natural glycan structures, which may be considered a limiting factor.

Another strategy to curtail undesired product hydrolysis, and hence improve the yield of the glycosylation step, involves using mutant enzymes [43, 139]. Site-directed mutagenesis of wild-type enzymes is used to generate the requisite mutants [43, 139]. The idea originated from the glycosynthases [102,103,104] used for oligosaccharide synthesis, and was further developed to afford mutated versions of various ENGases, including Endo A [153, 167], Endo M [161, 168], Endo D [169] and Endo S [170, 171]. These mutated enzymes allowed the synthesis of glycoproteins [160, 161, 164] containing natural N-linked oligosaccharides. An important example is the commercially available N175Q mutant of Endo M, derived from family GH85 and developed in the laboratories of Wang and Yamamoto [168]. Exchanging Asn175 for Gln in Endo M proved superior; the mutant exhibited greater glycosylation activity with significantly reduced hydrolytic activity as compared to wild-type Endo M and other mutants investigated [168]. Endo M N175Q became a valuable tool that may be used to access homogeneous N-linked glycopeptides and glycoproteins carrying natural high-mannose- and biantennary complex-type oligosaccharide structures [168].

The enzymatic remodelling of immunoglobulin G (IgG) [133, 170] using Endo S [133, 134] further expanded the synthetic potential of ENGases to access homogeneous antibodies (Abs) bearing well-defined sugar structures. Monoclonal antibodies (mAbs) are an important class of N-linked glycoprotein therapeutics, which are produced using recombinant techniques as a mixture of multiple glycoforms of variable abundance and complexity depending on the expression system and cell line used [172]. The glycosylation pattern in the Fc region of mAbs is especially important and affects antibody function on immune cells via interaction with FcγR receptors [24]. The presence of biantennary N-linked oligosaccharide units with two terminal α2,6-linked sialic acids at the two Asn297 Fc glycosylation sites enhances the activity of immunoglobulin G against cancer and infectious and inflammatory diseases [173]. Straightforward access to pure glycoforms of mAbs is, therefore, the key to modulating their clinical effects and developing improved antibody-based therapeutics. Wong et al. [173] recently reported the synthesis of homogeneous Rituximab IgG1 (used for the treatment of rheumatoid arthritis and cancer) [174] using an enzymatic remodelling approach where the initial formation of a mono-GlcNAc-tagged antibody, achieved using Endo S [133, 134] and a fucosidase from Bacteroides fragilis, was followed by ligation of the well-defined synthetic glycan oxazolines using an Endo S D233Q mutant [170]. The synthetic utility of an enzymatic approach using mutated enzymes has also been demonstrated by Davis et al. [171], who recently reported the synthesis of a homogeneous form of sialylated mAb Herceptin (Trastuzumab) [175], using the same Endo S D233Q [170] and optimised reaction conditions (enzyme loading and oxazoline concentration). In addition to well-defined natural glycans, modified sugar oxazolines with handles or tags (such as azides or alkynes) incorporated via amidation of non-reducing terminal sialic acids of a decasaccharide unit were also incorporated onto GlcNAc-tagged Herceptin using the optimised protocol [171].

To broaden the scope and potential applications of enzymatic glycosylation, studies on the use of structurally modified GlcNAc or alternative sugar acceptors for ENGase mediated glycosylation have been undertaken  [115, 120, 125, 135, 176, 177]. Fairbanks et al.  [177] have recently reported the tolerance of various ENGases to transfer N-glycan oxazolines 4 and 5 to a structurally altered Asn(GlcNAc) acceptor in which the hydroxyl group of the glycan unit was protected with a benzyl ether at C-3 (6), C-4 (7) or C-6 (8) (Fig. 3). The OH-3 fucosylated Asn(GlcNAc) acceptor (9) was also tested but none of the enzymes studied (WT Endo M, N175Q Endo M, Endo A, Endo D) were able to effect this glycosylation  [177]. The study revealed subtle structural preferences of each enzyme towards sugar acceptors, a factor which needs to be taken into consideration when choosing reaction partners  [177].

Fig. 3 Tetrasaccharide- and decasaccharide-oxazolines 4 and 5, respectively, and modified glycosyl acceptors 6–9 targeted during the study [177]

8 Access to N-Linked Oligosaccharides

Despite the availability of various synthetic techniques, the synthesis of glycopeptides is still challenging  [37, 38, 40, 67,68,69,70,71]. This is mainly due to the limited access to the oligosaccharide components that are generally obtained via multistep syntheses in specialised laboratories  [35]. To date, although remarkable progress in carbohydrate synthesis has been made, reliable and general routes to prepare complex oligosaccharides are still needed  [178]. Nevertheless, total synthesis gives access to a wide range of sugar constructs, either with natural or non-natural linkages.

A recent report by Shoda et al.  [179] revealed a new and very convenient method to access GlcNAc-terminating oligosaccharide oxazolines. These oxazolines can be prepared from the corresponding 2-acetamido-2-deoxy reducing sugars (10) in water by activation using 2-chloro-1,3-dimethylimidazolinium chloride (DMC) as the dehydrating reagent to afford corresponding activated sugar donors (11) (Scheme 3)  [179].

Scheme 3 Synthesis of the glycosyl oxazoline from unprotected GlcNAc using DMC in water [179]

Although labour-intensive, approaches to access full-length N-oligosaccharides have been developed [43]. A general retrosynthetic overview for the synthesis of tetrasaccharide N-oxazolines is shown in Scheme 4 [43]. Due to the susceptibility of glycosyl oxazolines to acid and/or hydrogenation conditions, oxazoline formation must take place before the final base-catalysed removal of protecting groups is effected; alternatively, the Shoda [179] approach described above can be used to synthesise the oxazoline in the last step. Successive glycan disconnections at C-6 and C-3 of the inner mannose unit allow for the installation of non-symmetrical glycans, which can be achieved using 4,6-benzylidene acetal protection. Subsequent formation of the challenging β-mannosidic linkage takes place by inversion of configuration at C-2 of the selectively synthesised β-glucoside, accessed using the neighbouring group participation (NGP) approach. The C-2 hydroxyl of the gluco donor is protected as a levulinate (Lev) ester, which can be removed selectively, allowing installation of a trifluoromethanesulfonate (triflate) leaving group that is subsequently displaced by acetate to afford the desired β-mannoside [43, 180].

Scheme 4 Generalised retrosynthetic route to N-glycan oxazolines [43]

Selected examples of oligosaccharide oxazolines accessed using these strategies are depicted in Fig. 4  [43].

Fig. 4 Selected oligosaccharide oxazolines synthesised [43]

The isolation of large oligosaccharides from natural sources may conveniently bypass laborious total synthesis routes. Some complex Asn-linked N-glycans, such as a sialic acid terminated complex biantennary unit [(NeuAcGalGlcNAcMan)2Man(GlcNAc)2] and a high-mannose glycan [(Man9(GlcNAc)2)] can be obtained in significant quantities from either egg yolks  [181] or soy bean flour  [182,183,184], respectively. Isolated and purified oligosaccharides can be chemically modified to afford suitably protected or activated building blocks for further incorporation into peptide or protein chains. It has also been shown that the biantennary glycan can be further modified using branch-specific exoglycosidases to access a broad variety of Asn-linked oligosaccharides, thus providing facile access to complex sugar structures  [90].

Kajihara et al. [92] reported the chemical synthesis of a mutated EPO variant, with alanine (Ala) residues replacing native glutamic acid (Glu) and glutamine (Gln) residues at positions 21 and 78, respectively, using the NCL technique [74]. The required asparaginyl sialyloligosaccharide, namely Asn[(NeuAcGalGlcNAcMan)2Man(GlcNAc)2]-(Asn83), was initially isolated from egg yolks. Phenacyl (Pac) protecting groups were then installed on the acid-labile sialic acid residues, and the Nα-amino group of Asn was masked with a tert-butyloxycarbonyl (Boc) protecting group to allow the synthesis of sialylglycopeptide α-thioesters using Boc SPPS. Subsequent ligation of the corresponding peptide fragments using NCL [74] afforded the desired EPO mutant [92]. This synthetic protocol was recently extended by the same research group to the synthesis of full-length EPO with one mutation site (Gln78 to Ala) bearing well-defined glycoforms (biantennary sialyloligosaccharide) at one (Asn83), two (Asn38 and Asn83, Asn24 and Asn83, Asn24 and Asn38) and three (Asn24, Asn38, Asn83) native EPO N-glycosylation sites [91].

9 Alternative Access to N-Linked Glycans and N-Linked Glycopeptides Using Glycosidic Bond Mimetics

The synthesis of glycoconjugate mimetics is an alternative approach to construct sugar structures or incorporate glycans onto peptides in a simplified way as compared to a total synthesis approach. The introduction of a glycosidic bond mimetic may improve the stability of the glycoconjugate towards chemical and enzymatic degradation, which is highly beneficial for pharmaceutical applications  [185]. The use of the copper(I)-catalysed Huisgen 1,3-dipolar cycloaddition of alkynes and azides to afford a 1,2,3-triazole conjugate (CuAAC ‘click chemistry’)  [186, 187] has increased in popularity in recent years, in the peptidomimetic field and as a bioconjugation strategy, due to its simplicity, the mild reaction conditions, its tolerance of various functional groups and its complete regioselectivity to form 1,4-disubstituted products  [188].

The syntheses of ‘click’ mimetics of fish antifreeze glycopeptides [49, 189] and a 20-amino acid MUC1 domain [54] were successfully undertaken in our laboratory [190]. We have also developed a powerful strategy where two ligation techniques, NCL [74] and CuAAC [186, 187], are carried out in a sequential manner to afford ‘click’ neoglycopeptides in a ‘one pot’ fashion [51]. Another highly attractive method, which combines the CuAAC strategy [186, 187] with Shoda’s [179] direct synthesis of sugar oxazolines from reducing sugars in water to afford 1,2,3-triazole-linked glycoconjugates in a ‘one pot’ reaction, was recently reported [191]. This strategy allows the facile conjugation of reducing sugars with a diverse array of alkynes, including other sugars and peptides. This methodology can potentially be used as a simpler alternative to access homogeneous glycopeptides and possibly glycoproteins in cases where installation of the native N-linkage using an enzymatic approach fails or efficient access to glycosidic bond mimetics is required.

10 Applications of Convergent Enzymatic Glycosylation for the Synthesis of Glycopeptides with Therapeutic Potential

Our on-going interest in the synthesis of peptide-glycoconjugates  [48, 52, 53, 56, 87] and glycopeptide mimetics that contain non-natural glycan-peptide linkages (neoglycopeptides)  [190] using total synthesis prompted us to investigate the alternative enzymatic approach. Herein, a summary of our recent work on the convergent chemoenzymatic synthesis of N-linked glycopeptides with therapeutic potential is described.

10.1 Synthesis of a Library of Glycosylated Analogues of Pramlintide

Glycosylation of peptides and proteins is an important tool for producing therapeutic peptidomimetics with improved physicochemical and pharmacokinetic profiles  [20, 21]. With this idea in mind, we investigated the effect of the N-linked glycosylation of the therapeutic peptide pramlintide (Symlin®), a 37-amino acid synthetic analogue of amylin that is currently used in conjunction with insulin for the treatment of type 1 and type 2 diabetes (Fig. 5)  [192,193,194].

Fig. 5

Primary sequence of pramlintide and human amylin

Based on the promising results previously obtained from in vitro and in vivo studies on the N-glycosylation of pramlintide at Asn3 and Asn21 with mono-, penta- and undecasaccharides [165], we undertook a systematic investigation into how glycan structure and the position of N-glycan attachment to the pramlintide peptide affect the ability of the glycosylated peptides to act as agonists of amylin receptors [58]. There are six possible sites for N-glycosylation of pramlintide: Asn3, Asn14, Asn21, Asn22, Asn31 and Asn35. The synthetic strategy was designed to accommodate the presence of GlcNAc units at defined Asn residues within the pramlintide sequence for subsequent enzymatic transfer of more complex sugar structures. In addition, the disulfide bond (Cys2/Cys7) and amidated C-terminus of the peptide had to be installed, as both features are required for biological activity of pramlintide [195].
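
The six candidate sites and the Cys2/Cys7 pair can be checked directly against the peptide sequence. The short Python sketch below does this. The sequence string is the commonly reported one-letter sequence of pramlintide and is an assumption here (Fig. 5 remains the authoritative source); the helper name `residue_positions` is purely illustrative.

```python
# One-letter sequence of pramlintide (assumed; consult Fig. 5 for the
# authoritative sequence). The C-terminal amide is not represented.
PRAMLINTIDE = "KCNTATCATQRLANFLVHSSNNFGPILPPTNVGSNTY"


def residue_positions(sequence: str, residue: str) -> list[int]:
    """Return the 1-based positions of a residue in a peptide sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == residue]


asn_sites = residue_positions(PRAMLINTIDE, "N")  # candidate N-glycosylation sites
cys_sites = residue_positions(PRAMLINTIDE, "C")  # partners of the disulfide bond

print(len(PRAMLINTIDE))  # 37
print(asn_sites)         # [3, 14, 21, 22, 31, 35]
print(cys_sites)         # [2, 7]
```

Under this assumed sequence, the Asn positions coincide with the six sites listed above and the two Cys residues with the disulfide partners.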

First, we synthesised non-glycosylated pramlintide as a control peptide. Microwave-enhanced Fmoc SPPS afforded the reduced pramlintide precursor 12, and disulfide bond formation was subsequently carried out by treating 12 with 2,2′-dipyridyl disulfide (DPDS) in dimethyl sulfoxide (DMSO) [196] to afford the cyclic (Cys2/Cys7) and C-amidated pramlintide 13 (Scheme 5).

Scheme 5

Synthesis of pramlintide 13 and pramlintide analogues 20–25 containing a GlcNAc residue at Asn3, Asn14, Asn21, Asn22, Asn31 or Asn35. Reagents and conditions: (a) DPDS, DMSO, rt; (b) DPDS, DMSO; then 13% NH2NH2·1.5 H2O, rt; (c) 5% NH2NH2·1.5 H2O, 10% DMSO, 85% 6 M Gu·HCl, 17 h, rt; (d) DPDS, DMSO; then 5% NH2NH2·1.5 H2O, rt [58]

For the synthesis of the monoglycosylated pramlintide analogues 20–25, each comprising a GlcNAc unit at a specific Asn residue, microwave-enhanced Fmoc SPPS was employed. The GlcNAc unit was introduced using the per-O-acetylated Fmoc-Asn[GlcNAc(OAc)3] building block (38) [197] to give the linear pramlintide analogues 14–19 with protected sugar hydroxyls. Subsequent use of a ‘one pot approach’ developed by Hojo et al. [198] to form the Cys2/Cys7 disulfide bond with simultaneous removal of the glycan acetate protecting groups required a long reaction time (17 h) to obtain the desired product 23. We found that the overall process was significantly accelerated when the two reactions were performed sequentially in the same vessel, whereby the linear, acetate-protected glycopeptides 14–16 and 18–19 were first treated with DPDS in DMSO to effect disulfide bond formation, and hydrazine hydrate was then added to deprotect the sugar hydroxyls [58]. For the synthesis of 24 and 25, the total reaction time was reduced to 5 and 5.5 h, respectively. Faster acetate removal was achieved using a higher concentration of hydrazine hydrate (3 h in total for analogues 20–22; Scheme 5).

A library of pramlintide analogues 26–31 bearing the core N-glycan pentasaccharide [Man3(GlcNAc)2] was then synthesised using Endo A [121, 128] to transfer tetrasaccharide oxazoline 4 (accessed via total synthesis [140]) to the corresponding GlcNAc-tagged pramlintide analogues 20–25 (Scheme 6).

Scheme 6

Synthesis of pramlintide analogues 26–31 containing the core N-glycan pentasaccharide [Man3(GlcNAc)2] at position 3, 14, 21, 22, 31 or 35 [58]

This methodology was then extended to the preparation of pramlintide analogues 32–37 bearing a complex biantennary glycan [(NeuAcGalGlcNAcMan)2Man(GlcNAc)2]. In this case, treatment of the decasaccharide oxazoline 5, synthesised from the corresponding reducing sugar that was isolated from egg yolks [181, 199], with the commercially available Endo M N175Q mutant [168, 177] in the presence of 20–25 enabled the preparation of pramlintide analogues 32–37 (Scheme 7) [58].
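
The size names used here (penta- and undecasaccharide on the peptide, tetra- and decasaccharide for the oxazoline donors) follow from simple residue counting: the acceptor peptide already carries the innermost GlcNAc, so each donor is one residue shorter than the glycan it produces. The Python sketch below illustrates the arithmetic for the condensed composition strings used in the text. The parser and the name `count_residues` are illustrative assumptions, not an established glycan notation tool, and the strings encode composition only, not linkage or branching.

```python
import re

# Monosaccharide abbreviations that appear in the composition strings.
UNITS = ["GlcNAc", "NeuAc", "Gal", "Man"]
TOKEN = re.compile(r"(\(|\)|" + "|".join(UNITS) + r"|\d+)")


def count_residues(formula: str) -> int:
    """Count monosaccharide residues in a condensed composition string.

    A number multiplies the unit or bracketed group directly before it,
    e.g. "Man3(GlcNAc)2" gives 3 + 2 = 5.
    """
    tokens = TOKEN.findall(formula)
    if "".join(tokens) != formula:
        raise ValueError(f"unrecognised symbols in {formula!r}")
    stack = [[]]  # one list of counts per nesting level
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            group_total = sum(stack.pop())
            stack[-1].append(group_total)
        elif tok.isdigit():
            stack[-1][-1] *= int(tok)
        else:
            stack[-1].append(1)
    return sum(stack[0])


glycans = {
    "core pentasaccharide": "Man3(GlcNAc)2",
    "complex biantennary": "(NeuAcGalGlcNAcMan)2Man(GlcNAc)2",
    "high-mannose": "Man9(GlcNAc)2",
}
for name, formula in glycans.items():
    on_peptide = count_residues(formula)
    # The GlcNAc-tagged acceptor supplies one residue; the donor supplies the rest.
    print(f"{name}: {on_peptide} residues on the peptide, {on_peptide - 1} in the donor")
```

The counts come out as 5 and 4 for the core glycan and 11 and 10 for both the biantennary and high-mannose glycans, in line with tetrasaccharide oxazoline 4 and decasaccharide oxazoline 5.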

Scheme 7

Synthesis of pramlintide analogues 32–37 containing a complex biantennary glycan [(NeuAcGalGlcNAcMan)2Man(GlcNAc)2] at position 3, 14, 21, 22, 31 or 35 [58]

A comprehensive series of 18 pramlintide analogues comprising mono-, penta- and undecasaccharides (20–37) was then tested as agonists of amylin receptors, and their activity was compared to that of the parent pramlintide (13). The parent pramlintide 13 and analogues 20–25 bearing a GlcNAc unit were screened against the three best characterised amylin receptors (CT(a), AMY1(a) and AMY3(a), which contain the CT(a) splice variant of the calcitonin receptor) [200, 201], at which the activity of pramlintide is equal to that of human or rat amylin [202]. Analogues 26–37 containing more complex glycans were only tested against AMY1(a), analogously to the previous study [165].
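
The size of the library follows from the design: six glycosylation sites combined with three glycan types. The minimal Python sketch below tallies the compound-number ranges given in the text. The dictionary keys are descriptive labels chosen for illustration, and no assignment of an individual compound number to a particular site is implied.

```python
# The six N-glycosylation sites of pramlintide named in the text.
sites = [3, 14, 21, 22, 31, 35]

# Compound-number ranges for each glycan type, as given in the text.
series = {
    "GlcNAc (monosaccharide)": range(20, 26),          # 20-25
    "Man3(GlcNAc)2 (pentasaccharide)": range(26, 32),  # 26-31
    "biantennary (undecasaccharide)": range(32, 38),   # 32-37
}

# Each glycan type is represented once per site.
for glycan, numbers in series.items():
    assert len(numbers) == len(sites), glycan

total = sum(len(numbers) for numbers in series.values())
print(total)  # 18
```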

The study revealed that the presence of N-glycans was well tolerated at Asn21, Asn31 and Asn35 by the AMY1(a) receptor, and that the activity of the analogues at the amylin receptors decreases as the size of the glycan increases (GlcNAc > pentasaccharide > undecasaccharide). It was therefore established that N-glycosylation of pramlintide is a promising tool to afford analogues with improved therapeutic potential. In vivo studies to assess the biological importance of N-glycosylation of pramlintide are under way and results will be reported in due course [58].

10.2 Synthesis of Mannosylated Glycopeptides

Our interest in the synthesis of glycopeptide-based vaccine candidates comprising mannose units to target antigen presenting cells (APCs), which are responsible for initiating an immune response, via the mannose receptor (MR) led us previously to prepare mono- and di-mannosylated, 5(6)-carboxyfluorescein (5(6)-CF)-labelled glycopeptides by chemical synthesis [48, 87]. Subsequent progression of this work involved the synthesis of glycopeptide-based vaccine candidates comprising more complex high-mannose-type N-glycans and testing their ability to bind to APCs [57]. For this purpose, the pp65 protein fragment 491–509 from the cytomegalovirus (CMV) [ILARNLVPMVATVQGQNLK], incorporating the peptide epitope pp65(495–503) (NLVPMVATV) recognised by human cytotoxic T-lymphocytes (CTLs), was chosen as a synthetic target. The target peptide contains two asparagine residues, Asn5 and Asn17, conveniently located within the pp65(491–509) sequence for potential attachment of sugar residues. To allow detection of the peptides using flow cytometry, 5(6)-carboxyfluorescein was attached to the N-terminus of the glycopeptides.
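
The positions of the two asparagine residues relative to the CTL epitope can be read directly from the sequences quoted above. The Python sketch below locates the epitope within the fragment and sorts the Asn residues into those inside and outside it. The variable names are illustrative; the sequences and residue numbers are those given in the text.

```python
FRAGMENT = "ILARNLVPMVATVQGQNLK"  # pp65 residues 491-509
EPITOPE = "NLVPMVATV"             # CTL epitope, pp65 residues 495-503
FRAGMENT_START = 491              # protein numbering of the first residue

offset = FRAGMENT.find(EPITOPE)   # 0-based offset of the epitope
if offset < 0:
    raise ValueError("epitope not found in fragment")

# 1-based positions covered by the epitope, in fragment numbering.
epitope_local = range(offset + 1, offset + len(EPITOPE) + 1)
# The same span in protein numbering.
epitope_protein = (FRAGMENT_START + offset,
                   FRAGMENT_START + offset + len(EPITOPE) - 1)

asn_sites = [i for i, aa in enumerate(FRAGMENT, start=1) if aa == "N"]
inside = [p for p in asn_sites if p in epitope_local]
outside = [p for p in asn_sites if p not in epitope_local]

print(epitope_protein)  # (495, 503)
print(asn_sites)        # [5, 17]
print(inside, outside)  # [5] [17]
```

Asn5 is the first residue of the epitope, whereas Asn17 lies outside it, so a glycan at Asn17 leaves the epitope sequence itself unmodified.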

5(6)-CF-labelled control peptide 39, and glycopeptides 40 and 41 comprising one or two GlcNAc units, respectively, were synthesised using microwave-enhanced Fmoc SPPS wherein the Fmoc-Asn[GlcNAc(OAc)3] building block 38 [197] replaced either Asn17 (for 40) or Asn5 and Asn17 (for 41) as required. Methionine sulfoxide was used in place of methionine during the synthesis; its subsequent reduction in the control peptide was performed following literature procedures [203] and afforded the pp65 protein fragment 491–509 (39). Removal of the sugar hydroxyl protecting groups of the per-O-acetylated pp65(491–509) precursors of 40 and 41 (sodium methoxide in methanol) was undertaken prior to reduction of the methionine sulfoxide [203] to give GlcNAc-tagged glycopeptides 40 and 41 ready for further enzymatic glycosylation (Fig. 6).

Fig. 6

CMV control peptide 39 and glycopeptides 40 and 41 containing a GlcNAc residue either at Asn17, or at both Asn5 and Asn17, respectively  [57]

With GlcNAc-tagged glycopeptides 40 and 41 in hand, we next focused on the synthesis of N-glycopeptides bearing a Man3-terminated pentasaccharide. This was successfully undertaken using oxazoline donor 4 [140], glycopeptide acceptors 40 and 41, and the Endo A E173H mutant [153] to afford 42 and 44, bearing one or two pentasaccharide units, respectively, in good yield (68% and 89%, respectively; Scheme 8a).

Scheme 8

Synthesis of (a) glycopeptides 42 and 44 containing a Man3(GlcNAc)2 residue, and (b) glycopeptides 43 and 45 containing a Man9(GlcNAc)2 residue, either at Asn17, or at both Asn5 and Asn17, respectively. Reagents and conditions: (a) Endo A E173H, sodium phosphate buffer pH 6.5; (b) Endo M N175Q, sodium phosphate buffer pH 6.5 [57]

To access full-length high-mannose N-linked glycopeptides 43 and 45 bearing nine mannose units at each glycosylation site, we used oxazoline donor 46, which was conveniently sourced from soy bean flour [161]. Somewhat surprisingly, the Endo A E173H mutant [153] proved incapable of transferring the full-length high-mannose oxazoline 46 onto glycopeptide acceptors 40 and 41, possibly due to an altered substrate tolerance of the mutated enzyme as compared to wild-type Endo A [57]. We therefore used the commercially available Endo M N175Q mutant [168, 177], which produced the desired N-undecasaccharide glycopeptides 43 and 45 with one or two Man9(GlcNAc)2 units in 48% and 54% yield, respectively (Scheme 8b).

Subsequent analysis to assess glycopeptide binding levels to APCs indicated improved targeting of the peptide cargo to MR-expressing cells due to the presence of the high-mannose N-glycans. This effect was more pronounced for analogues glycosylated at both asparagines (44 and 45) as compared to counterparts bearing a single N-glycan at Asn17 (42 and 43). In addition, stronger binding was observed for glycopeptides bearing the high-mannose unit, Man9(GlcNAc)2 (43 and 45) than those with the truncated glycan, Man3(GlcNAc)2 (42 and 44). Importantly, it was also found that analogues in which the sugars were sited outside the epitope sequence (either Man3(GlcNAc)2 or Man9(GlcNAc)2 at Asn17, 42 and 43, respectively) were readily processed and presented by the APCs to human T cells.

These results provide important evidence that N-glycosylation of peptides using high-mannose glycans may produce superior compounds for vaccine development. Additionally, we have demonstrated the effectiveness of a convergent chemoenzymatic approach to readily obtain complex N-linked glycopeptides with therapeutic potential.

11 Conclusions

The synthesis of homogeneous glycopeptides and glycoproteins is still a complex and onerous task. The synthesis of the sugar component is a limiting factor, especially when laborious total synthesis routes are employed. Fortunately, recent progress in chemoenzymatic techniques based on ENGase-mediated glycosylation has demonstrated significant potential. By careful design of reaction conditions and appropriate selection of partners for glycosylation, a wide range of peptide-oligosaccharide structures can be obtained. The use of enzyme-mediated synthesis in combination with chemical synthetic techniques provides a method to access complex, highly desirable glycoconjugates efficiently [101, 204].