1 Glycoproteins: Molecules with Great Heterogeneity

The majority of all membrane and secreted proteins [1], as well as numerous cytoplasmic proteins [2, 3], have one or several specific branched oligosaccharide chains (glycans) attached to their backbone. Those proteins are referred to as glycoproteins and the process of oligosaccharide attachment to a protein is called glycosylation.

Glycosylation is an enzymatic, highly regulated co- and post-translational process by which glycans are added to protein and lipid backbones, forming glycoconjugates [4]. The monosaccharide units that form glycans are linked via glycosidic bonds, which can be α or β linkages depending on the orientation of the glycosidic oxygen at the anomeric carbon. These linkages give different structural properties and biological functions to sequences that are otherwise identical in composition [5]. Differences in monosaccharide composition, anomeric state, linkage of the subunits, branching and linkage to the peptide part of a glycoprotein all contribute to the diversity of the glycan portion of the glycoprotein [6, 7]. Glycans have numerous important structural, functional and regulatory roles, including roles in protein folding, secretion and degradation, cell signalling, immune function and transcription [3, 8, 9]. The proportion of glycans in glycoproteins varies considerably, but glycans usually make up a substantial part of the mass [10].
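To make the point about linkage concrete, the following minimal sketch counts the disaccharides that can be built from two glucose units alone. It assumes, purely for illustration, that both units are D-glucopyranose, so that only the anomeric carbon (C1) and the hydroxyl groups on C2, C3, C4 and C6 take part in linkages; the function name and the string notation are hypothetical.

```python
from itertools import product

# Assumption: both units are D-glucopyranose, so the anomeric carbon is C1
# and the free non-anomeric hydroxyl groups sit on C2, C3, C4 and C6.
ANOMERS = ("α", "β")
ACCEPTOR_POSITIONS = (2, 3, 4, 6)


def glucose_disaccharides() -> list[str]:
    """Enumerate Glc-Glc disaccharides that differ only in their linkage."""
    # Reducing disaccharides: C1 of one unit to a non-anomeric hydroxyl of the other.
    reducing = [f"Glc{a}1-{pos}Glc" for a, pos in product(ANOMERS, ACCEPTOR_POSITIONS)]
    # Non-reducing disaccharides: C1 linked to C1 (α,β and β,α are the same molecule).
    non_reducing = [f"Glc{a}1-1{b}Glc" for a, b in (("α", "α"), ("α", "β"), ("β", "β"))]
    return reducing + non_reducing


isomers = glucose_disaccharides()
print(len(isomers))
```

Even with identical composition (two glucose units), the enumeration yields several distinct molecules, for example Glcα1-4Glc (maltose) and Glcβ1-4Glc (cellobiose), which differ only in the anomeric configuration of a single bond.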

According to the type of linkage by which they are attached to the polypeptide backbone, glycans in eukaryotes are divided into N-linked and O-linked glycans [4]. An N-linked glycan is an oligosaccharide covalently linked to an asparagine residue of a polypeptide chain within the consensus sequence Asn-X-Ser/Thr, where X is any amino acid except proline. An average N-glycan is a complex non-linear oligosaccharide composed of 10–15 monosaccharide residues. N-glycans are transferred to the protein on the luminal side of the endoplasmic reticulum (ER) membrane while the protein is still being synthesized on ER-bound ribosomes and translocated across the ER membrane. These glycans share a common core of five monosaccharides and differ in the residues attached to it, forming three subgroups: high mannose (only mannose residues are attached to the core), complex (two or more antennae are attached to the core via N-acetylglucosamine, GlcNAc) and hybrid (mannose residues are attached to the Manα1-6 arm and one or two antennae to the Manα1-3 arm) (Fig. 27.1).

Fig. 27.1 Examples of three N-glycan types: oligomannose, complex and hybrid. All N-glycans share a common core (red rectangle)
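The Asn-X-Ser/Thr consensus sequence can be searched for in a protein sequence with a short regular expression. The sketch below is only illustrative: the exclusion of proline at position X is a widely reported rule that is encoded here as an assumption, the function name and the example sequence are hypothetical, and the presence of a sequon indicates a potential site, not actual occupancy.

```python
import re

# Asn, then any residue except Pro (assumption of this sketch), then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues within Asn-X-Ser/Thr sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical sequence, used only to exercise the function.
example = "MKNGTLLNPSAANNSSQ"
print(find_sequons(example))  # [3, 13, 14]
```

Asn-8 in the example is skipped because it is followed by proline, while the two adjacent asparagines at positions 13 and 14 are both reported.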

An O-linked glycan is typically attached to the polypeptide chain via N-acetylgalactosamine (GalNAc) linked to a serine or threonine residue and can belong to a variety of structural classes [4] (Fig. 27.2).

Fig. 27.2 Eight mucin-type O-glycan core structures

Each glycosylation site on a given protein synthesized by a particular cell type can carry glycans with a great number of structural variations. This microheterogeneity results in a large number of protein glycoforms with different properties and functions [10]. It is responsible for the different chromatographic and electrophoretic behaviour of glycoforms and makes the structural analysis of glycans a very demanding task [11]. The glycan repertoire in mammals is estimated to comprise thousands of structures and could be larger than the proteome [6].

While the structure of the polypeptide part of a glycoprotein is defined by the sequence of nucleotides in the corresponding gene, the structure of the glycan part results from dynamic interactions between hundreds of genes, their protein products and environmental factors [9, 12].

1.1 Glycans Differ Between Organisms

Relatively little is known about glycan diversity across organisms. The glycans synthesized by most Eubacteria and Archaea have little in common structurally with those of eukaryotes. The high levels of diversity encountered in the best-studied vertebrate species suggest similar diversity in other groups of organisms. There can also be significant variation in glycosylation among members of the same species, particularly with regard to terminal glycan sequences [13]. Still, most major glycan classes identified in animal cells seem to be represented in some related form among other eukaryotes, and sometimes in Archaea. Viruses generally do not glycosylate their own glycoproteins directly but instead use the host-cell machinery, although there are some exceptions to this rule.

All plants and animals studied so far seem to share the same early stages of the classic N-glycan processing pathway. Yeasts and vegetative slime molds do not appear to complete the trimming of mannose residues and are thus unable to generate typical complex-type N-glycans [4]. In insects, mannose trimming appears to be generally completed, as in mammals, down to a Man3GlcNAc2 structure. In insect cells an α(1-3)-fucose unit is also often added to the core GlcNAc residue, frequently in addition to the α(1-6)-fucose typically found on the core GlcNAc of vertebrate N-glycans. Plants follow a pathway similar to that of vertebrates in the initial stages, but then often add an α(1-3)-fucose residue to the proximal GlcNAc, instead of the α(1-6)-fucose residue found in vertebrates, and a β(1-2)-xylose residue to the β-mannose [14]. These structures are also present in some invertebrates, but they appear to be immunogenic in vertebrates [15].

The common O-glycan core-1 Galβ1-3GalNAcα1-O-Ser/Thr structure of vertebrates is also present in insects [4]. Plants do not appear to have O-linked GalNAc. Instead, they express arabinose O-linked to hydroxyproline and galactose O-linked to serine and threonine.

Despite the huge potential for structural diversity built into monosaccharides, only a relatively limited subset of all possible monosaccharides and their possible linkages and modifications is found in eukaryotic cells. Bacteria and Archaea express a much greater diversity in glycosylation.

2 Enzymatic Deglycosylation of Glycoproteins

2.1 Removal of N-Linked Glycans

To study the glycosylation of a protein, or its structure or function, it is often necessary to remove its glycans. The most widely used enzyme for deglycosylation of glycoproteins is peptide-N-glycosidase F (PNGase F) [16]. This enzyme efficiently releases N-linked glycans from glycoproteins and glycopeptides [17, 18] (Fig. 27.3). The minimum substrate for its action is a tripeptide with a glycan linked to its central asparagine residue. After the enzymatic release, the glycan structure is left intact and the only modification to the protein is deamidation of the asparagine from which the glycan has been removed, which converts it to aspartic acid. The enzyme efficiently removes all types of N-linked glycans except those with fucose α(1-3)-linked to the core N-acetylglucosamine, a structure usually found in plants and insects. For release of such glycans, peptide-N-glycosidase A (PNGase A) must be used [16, 19] (Fig. 27.3). PNGase A is isolated from almond meal and is ineffective for sialylated glycans.

Fig. 27.3 Endoglycosidases PNGase F and PNGase A cleavage sites. PNGase F cleaves all N-linked glycans unless the core contains an α(1-3) fucose, whereas PNGase A also hydrolyzes glycans containing an α(1-3)-linked fucose residue linked to the asparagine-linked GlcNAc. However, PNGase A is ineffective when sialic acid is present on the N-glycan

These two enzymes are the most commonly used endoglycosidases, since others, like Endoglycosidase H and Endoglycosidase F [20], have limited specificity and cleave the glycan after the first N-acetylglucosamine, leaving it attached to the asparagine residue (Fig. 27.4).

Fig. 27.4 Endoglycosidases H, F1, F2 and F3 cleavage sites. They all cleave the bond between two GlcNAc subunits directly proximal to the asparagine residue. Endo H and Endo F1 cleave oligomannose and hybrid glycans, but not complex glycans, while Endo F2 and Endo F3 cleave complex structures

PNGase F can release glycans from some proteins in their native form, and for these a prolonged incubation (up to 3 days) is sufficient to improve the efficiency. Many proteins, however, must be denatured before deglycosylation because of the location of their glycans. Detergent and heat denaturation increases the rate of PNGase F cleavage up to 100-fold. PNGase F is active in the pH range 6–10, with an optimum at pH 8.6.

In contrast to PNGase F, the action of endoglycosidases F (F1, F2 and F3) is less affected by protein conformation. Endoglycosidases F are linkage specific [18, 21]. Endo F1 cleaves only high mannose and hybrid structures; complex glycans cannot be removed with this enzyme. Endo F2 acts on biantennary complex and, to a lesser extent, high mannose oligosaccharides, but will not cleave hybrid structures. Endo F3 best cleaves core-fucosylated biantennary structures, while non-fucosylated biantennary and triantennary structures are cleaved only if they are linked to the peptide. It also cleaves fucosylated trimannosyl core structures on free and protein-linked oligosaccharides, but native deglycosylation of complex tetraantennary glycans requires sequential hydrolysis down to the trimannosyl-diacetylchitobiose core. Endoglycosidase H is a highly specific endoglycosidase that cleaves high mannose and hybrid oligosaccharides but not complex glycans.
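The specificity rules in this section can be condensed into a small lookup, which may help when choosing an enzyme for a given sample. The sketch below is a deliberate simplification and not a validated selection tool: the function and dictionary names are hypothetical, the Endo assignments follow the summary in the caption of Fig. 27.4, and the finer dependence of Endo F2 and Endo F3 on antennarity and core fucosylation is ignored.

```python
def pngase_options(core_alpha13_fucose: bool, sialylated: bool) -> list[str]:
    """Return the PNGases expected to release an N-glycan with the given features."""
    options = []
    # PNGase F is blocked by fucose α(1-3)-linked to the core GlcNAc.
    if not core_alpha13_fucose:
        options.append("PNGase F")
    # PNGase A tolerates core α(1-3) fucose but is ineffective on sialylated glycans.
    if not sialylated:
        options.append("PNGase A")
    return options


# Endoglycosidases by N-glycan class (simplified; antennarity is ignored).
ENDO_BY_CLASS = {
    "high_mannose": ["Endo H", "Endo F1", "Endo F2"],  # Endo F2 only to a lesser extent
    "hybrid": ["Endo H", "Endo F1"],
    "complex": ["Endo F2", "Endo F3"],
}

# A plant-type glycan with core α(1-3) fucose and no sialic acid.
print(pngase_options(core_alpha13_fucose=True, sialylated=False))  # ['PNGase A']
print(ENDO_BY_CLASS["complex"])  # ['Endo F2', 'Endo F3']
```

An empty list from `pngase_options` (core α(1-3) fucose together with sialic acid) signals that, under these simplified rules, neither PNGase is expected to work and another release strategy would have to be considered.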

2.2 Removal of O-Linked Glycans

Enzymatic release of O-glycans has a much more limited use owing to the very narrow substrate specificity of the enzymes available [21]. O-glycosidase can efficiently remove only the core Gal-β(1-3)-GalNAc structure of O-glycans [22], so the rest of the O-glycan must first be removed by the sequential action of exoglycosidases. Denaturation of the glycoprotein does not seem to contribute significantly to the action of O-glycosidase, but any modification of the core structure can block the enzyme.

Because of these restrictions, chemical release of O-linked glycans is the method of choice, although it has limitations of its own, since an unwanted degradation known as the “peeling” reaction can occur [23]. This is a base-catalyzed elimination reaction that results in the loss of the 3-substituent of the innermost residue of an oligosaccharide.

2.3 Exoglycosidase Sequencing

Exoglycosidases are enzymes that break the glycosidic bond of the terminal residue and release monosaccharides. These enzymes have many roles in nature, including degradation of cellulose, anti-bacterial defense, pathogenesis (e.g. viral neuraminidases) and normal cellular function (e.g. trimming mannosidases involved in N-linked glycoprotein biosynthesis).

Exoglycosidases are very useful in the release of O-glycans, where their sequential action is required before the core of the O-linked structure can be released enzymatically by O-glycosidase [21]. Since these enzymes are specific for a particular monosaccharide and in many cases even for the type of linkage (Fig. 27.5), exoglycosidases are also often used in the analysis of unknown N-linked glycan structures [24, 25]. Sequential removal of monosaccharides from the glycan terminus changes the property of the glycan that is monitored in the analysis (e.g. chromatographic behaviour) (Fig. 27.6), and from this change the identity of the monosaccharide and of the glycosidic bond can be deduced [25, 26].

Fig. 27.5 Commonly used exoglycosidases and their cleavage sites

Fig. 27.6 Sequencing of biantennary disialylated glycan using exoglycosidases and hydrophilic interaction liquid chromatography (HILIC-HPLC)
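The logic of exoglycosidase sequencing can be illustrated with a toy model in which each enzyme removes its target monosaccharide only when that residue is terminal. The sketch below is an assumption-laden simplification: the antenna composition, enzyme names and function names are hypothetical, linkage specificity is ignored, and the common core is omitted. In a real experiment each digestion step would be read out as a shift in chromatographic retention, not as a residue count.

```python
# Each antenna is written from the non-reducing terminus inwards.
# Hypothetical biantennary disialylated glycan; the common core is omitted.
ANTENNAE = [["Neu5Ac", "Gal", "GlcNAc"], ["Neu5Ac", "Gal", "GlcNAc"]]

# Simplified mapping of enzyme to the terminal residue it removes
# (linkage specificity is ignored in this sketch).
ENZYMES = {"sialidase": "Neu5Ac", "galactosidase": "Gal", "hexosaminidase": "GlcNAc"}


def digest(antennae, enzyme):
    """Remove the enzyme's target residue from every antenna while it is terminal."""
    target = ENZYMES[enzyme]
    trimmed = []
    for antenna in antennae:
        remaining = list(antenna)
        while remaining and remaining[0] == target:
            remaining.pop(0)
        trimmed.append(remaining)
    return trimmed


def sequence(antennae, enzyme_order):
    """Apply enzymes in order and record how many residues each step released."""
    released = []
    for enzyme in enzyme_order:
        trimmed = digest(antennae, enzyme)
        count = sum(len(before) - len(after) for before, after in zip(antennae, trimmed))
        released.append((enzyme, count))
        antennae = trimmed
    return released


steps = sequence(ANTENNAE, ["galactosidase", "sialidase", "galactosidase", "hexosaminidase"])
print(steps)
```

The first galactosidase step releases nothing because galactose is capped by sialic acid; only after the sialidase step does galactosidase release two residues. This order dependence is what allows the terminal sequence to be deduced.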

All enzymatic deglycosylation procedures can be performed on glycoproteins either dissolved in buffer [27] or incorporated in a gel. Enzymatic release is therefore often carried out on glycoproteins separated by SDS-polyacrylamide gel electrophoresis [26], or on proteins simply immobilized in small gel blocks [28]. The two approaches also differ in the purification of glycans after deglycosylation: glycans released from a gel do not require additional purification from the polypeptide part.

3 Chemical Deglycosylation of Glycoproteins

3.1 Hydrazinolysis

Hydrazinolysis completely releases both O- and N-linked oligosaccharides in unreduced form. Selective, sequential release is also possible: mild hydrazinolysis at 60 °C releases the O-linked oligosaccharides, and subsequent treatment at 95 °C releases the N-linked oligosaccharides [29, 30]. Hydrazinolysis leaves the glycan intact but destroys the protein component.

Hydrazinolysis is carried out by adding fresh anhydrous hydrazine to a salt-free, lyophilized glycoprotein sample. After the reaction, the dried sample should be re-N-acetylated by the addition of ice-cold saturated sodium bicarbonate solution, followed immediately by the addition of acetic anhydride [29, 30].

Glycan and protein components can be separated by paper chromatography or by gel filtration.

3.2 Alkaline β-Elimination

O-glycosidic linkages between glycans and the β-hydroxyl groups of serine or threonine can easily be hydrolyzed by dilute alkaline solutions (0.05–0.1 M sodium hydroxide or potassium hydroxide) under mild conditions (45–60 °C for 8–16 h) leading to the liberation of O-glycans [31]. This reaction is performed in the presence of a reducing agent (sodium borohydride) to prevent isomerization or degradation of the carbohydrates [32].

N-Linked glycans are not cleaved under these conditions, nor are O-glycans attached to tyrosine, hydroxyproline and hydroxylysine. Also, the β-elimination reaction does not occur if the glycan is attached to serine or threonine at the carboxy-terminus of the protein.

For quantitative release of N-linked glycans, stronger alkaline conditions are required (1 M sodium hydroxide at 100 °C for 6–12 h) [31]. Again, the reaction should be performed under reducing conditions. GlcNAc residues are de-acetylated during this reaction and must be re-N-acetylated. The glycans can be recovered by re-N-acetylation with acetic anhydride in methanol.

3.3 Trifluoromethanesulfonic Acid

Trifluoromethanesulfonic acid hydrolysis removes all types of glycans. The only exception is the innermost asparagine-linked GlcNAc of N-linked glycans, which is attached to the protein by an amide bond and is stable to trifluoromethanesulfonic acid [33].

This reaction leaves the protein component intact, but leads to glycan destruction.
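The trade-offs between the chemical methods described in this section can be summarized as a small lookup that filters methods by the glycan class to be released and by the component that must survive. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, a value of `None` marks a property the text does not state, and the recovery of glycans after strong alkaline treatment is taken to count as "glycan preserved" once re-N-acetylation has been carried out.

```python
from typing import NamedTuple, Optional


class ChemicalRelease(NamedTuple):
    targets: tuple                       # glycan classes removed: "O" and/or "N"
    preserves_glycan: bool               # glycan recovered in analyzable form
    preserves_protein: Optional[bool]    # None = not stated in the text


METHODS = {
    "hydrazinolysis": ChemicalRelease(("O", "N"), True, False),
    "mild alkaline beta-elimination": ChemicalRelease(("O",), True, None),
    # Assumption: glycans count as preserved after re-N-acetylation.
    "strong alkaline treatment": ChemicalRelease(("N",), True, None),
    # Leaves the innermost Asn-linked GlcNAc of N-glycans on the protein.
    "trifluoromethanesulfonic acid": ChemicalRelease(("O", "N"), False, True),
}


def candidate_methods(glycan_class, need_glycan=False, need_protein=False):
    """List methods that remove the glycan class and keep the required component."""
    return [
        name
        for name, m in METHODS.items()
        if glycan_class in m.targets
        and (not need_glycan or m.preserves_glycan)
        and (not need_protein or m.preserves_protein is True)
    ]


print(candidate_methods("O", need_glycan=True))
print(candidate_methods("N", need_protein=True))
```

Asking for both an intact glycan and an intact protein returns an empty list for either glycan class, which reflects the point of the section: each chemical method sacrifices one component, or leaves its fate unstated.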

4 Labeling of Glycans

In contrast to proteins and peptides, glycans do not possess strong chromophores and therefore do not absorb ultraviolet light strongly. As a result, a wide range of alternative techniques has been developed for the detection of glycans [34]. Chemical derivatization at the reducing end is the most common approach to glycan labeling, and reductive amination [35] is the most widely used derivatization reaction, although Michael addition (used mostly with UV-detectable labels) [36], permethylation [37] or hydrazide labeling [38] may also be applied. Various compounds that provide the required functional group for the labeling reaction can be used. Reductive amination, Michael addition and hydrazide labeling all require the reducing end of the glycan, which is not present on O-glycans released by reductive β-elimination. Labeling reagents are usually added in large excess, and in most labeling procedures the excess label needs to be removed, commonly by gel filtration or solid phase extraction (SPE) [34].

Although several glycan labels are suitable for UV detection, many other biomolecules also absorb UV light, so the specificity is generally low. Alternative detection techniques, which also achieve better sensitivity, are therefore usually preferred. Fluorescence detection is the most commonly used optical detection method for glycan analysis (Table 27.1). A single molecule of fluorescent label is incorporated into each glycan, which allows quantitation.
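Because each glycan carries exactly one fluorophore, the fluorescence peak area is proportional to the molar amount of the glycan, and relative abundances can be obtained by simple normalization. The sketch below illustrates this under the assumption that all labeled glycans have the same fluorescence response; the function name, peak names and area values are hypothetical.

```python
def relative_abundance(peak_areas: dict[str, float]) -> dict[str, float]:
    """Express each peak area as a percentage of the total integrated area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated fluorescence peak areas (arbitrary units).
areas = {"GP1": 120.0, "GP2": 300.0, "GP3": 180.0}
percentages = relative_abundance(areas)
print(percentages)
```

Normalizing to the total area makes results comparable between samples that differ in the amount of glycoprotein loaded.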

Table 27.1 Fluorescent labels widely used in glycan analysis

4.1 Reductive Amination

Glycans can be labeled at their reducing end using reductive amination. The amino group of the dye label reacts with the open-ring (aldehyde) form of the glycan to form a Schiff base. The imine group of the Schiff base is then chemically reduced by a reducing agent to give a stable labeled glycan. An advantage of this labeling technique is the stoichiometric attachment of one label per glycan, which allows direct quantitation. The optimal conditions for the reductive amination of N-glycans with the 2-AB and 2-AA labels were reported by Bigge et al. [35].
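The two reaction steps also fix the mass added to the glycan, which is useful when labeled glycans are later analyzed by mass spectrometry. Schiff base formation eliminates one water molecule and the reduction adds two hydrogen atoms, so the net shift equals the label mass minus one oxygen atom. The sketch below works this out from elemental compositions. It assumes that 2-AB denotes 2-aminobenzamide (C7H8N2O) and 2-AA denotes 2-aminobenzoic acid (C7H7NO2); the monoisotopic atomic masses are standard values and the function names are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Assumed elemental compositions of the labels.
LABELS = {
    "2-AB": {"C": 7, "H": 8, "N": 2, "O": 1},  # 2-aminobenzamide
    "2-AA": {"C": 7, "H": 7, "N": 1, "O": 2},  # 2-aminobenzoic acid
}


def mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def labeling_shift(label: str) -> float:
    """Mass added to a glycan by reductive amination with the given label."""
    water = mass({"H": 2, "O": 1})   # lost on Schiff base formation
    hydrogen = mass({"H": 2})        # gained on reduction of the imine
    return mass(LABELS[label]) - water + hydrogen


for name in LABELS:
    print(name, round(labeling_shift(name), 4))
```

Under these assumptions the shift is about 120.069 Da for 2-AB and about 121.053 Da for 2-AA.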

5 Glycan Purification

Before analysis, glycans often have to be purified from non-carbohydrate material, including salts, proteins and detergents. Labeled glycans also have to be purified from excess labeling reagent. The most commonly used techniques for glycan purification are solid phase extraction (SPE) [27, 28], liquid–liquid extraction [39], paper chromatography [40], gel filtration [41] and precipitation (mainly to remove proteins) [42].

Solid phase extraction is the most suitable purification technique for high-throughput protein glycosylation analysis [27]. There are numerous SPE stationary phases that can be used for glycan purification: reverse phase, porous graphitized carbon, hydrophilic interaction liquid chromatography (HILIC) and anion-exchange chromatography [34].

HILIC SPE is often used for purification of glycans derivatized by reductive amination. Several HILIC stationary phases have been described: cellulose, Sepharose, diol-bonded silica beads and polyacrylamide-based phases [43]. In this approach the glycan sample is applied to a hydrophilic matrix in the presence of a high concentration of organic solvent. The glycans bind to the matrix on the basis of their hydrophilic properties, and hydrophobic non-glycan contaminants are washed off with the solvent. The purified glycans are then eluted with an aqueous solvent.