Introduction

Biological sciences have recently undergone enormous expansion, generating new fields with new names. All rapidly developing fields in life sciences feature a common trend: a shift from empirical and descriptive biology toward the understanding of molecular and dynamic aspects. Since the early 1990s, new measurement technologies have offered exciting opportunities for quantifying the molecular aspects of processes in living organisms and combining these extensive molecular data into “systems biology” knowledge [1–5]. Among the different “-omics” fields (genomics, transcriptomics, proteomics, metabolomics, etc.), which are all closely related to new developments in analytical methodologies and instrumentation, the fields of glycoproteomics and glycomics are now finally assuming a very important role. Both fields contribute substantially to a better understanding of multicellular interactions in eukaryotic systems and the many issues pertaining to human health and disease [6–12]. The long-held view that glycosylation in prokaryotic systems is unimportant has been seriously challenged during the last several years [13–19]. Since glycans seem to provide the first interfacial layer between the mammalian host and parasites, glycosylation plays a crucial role in pathogenicity and invasion as well. As some microorganisms appear to express distinctly different sugar units and oligosaccharide structures, the area of host-pathogen interactions has already started to create its own bioanalytical challenges.

It is now evident that the sheer complexity of carbohydrate structures and their complicated (non-template) biosynthetic origin discouraged previous generations of scientists from working in glycoscience. From a molecular scientist’s point of view, this field encompasses a vast range of directions due to the structural complexities of glycoconjugates, of which the glycoproteins represent just one class of biologically important molecules. This brief review emphasizes glycoprotein analysis at high sensitivity and shares common methodological concerns and directions with the structural studies of other large glycomolecules encountered in nature (such as proteoglycans or polysaccharides), which other investigators will undoubtedly address and review. The primary focus of this review is to describe the methodological advances of interest to modern biomedical research. The reviewed methodologies emphasize the use of mass spectrometry and capillary separation techniques.

General considerations

The emergence of “glycobiology” during the late 1980s serendipitously coincided with the development of new ionization techniques in biomolecular mass spectrometry (MS) and the recognition of capillary electrophoresis (CE) and capillary liquid chromatography (LC) as new important techniques in biochemical analysis. The importance of glycosylation has since been widely recognized functionally and structurally. The current estimate of some 50 to 70 % of all mammalian proteins being glycosylated represents a formidable analytical challenge that far exceeds the tasks of the mainstream proteomics. It is becoming increasingly evident that the structural selectivity determinants of biological recognition are either unique glycan structures or combinations of different glycans at the sites of glycosylation and, to some degree, the local “peptide landscape” at the site of a glycosylated protein. The well-known propensity of sugars to form numerous glycan isomers adds substantially to the difficulties of the overall analytical glycobiology tasks. It is thus desirable that glycomic and glycoproteomic measurements be practiced with a comparable level of expertise and probably best if performed in the same laboratory. Different techniques and instrumentation may need to be involved in completion of the overall structural task.

As recently as the early 1990s, milligram quantities of glycoproteins were typically required for structural analysis; the use of ion-exchange LC with pulsed amperometric detection was widely considered the “gold standard” for glycomic profiling. Over the two following decades, impressive gains in MS detection technologies and instrumentation were gradually achieved for the benefit of proteomic sequencing and mass screening. Simultaneously, the development of fluorescence labeling for either LC [20, 21] or CE with laser-induced fluorescence (LIF) detection [22–26] has promoted the importance of sample preparation prior to the final glycomic or glycoproteomic measurements. The separation columns used in conjunction with MS have evolved as well. Various columns with ever smaller diameters, whether in their packed, monolithic, superficially porous, or open tubular and microfabricated formats, have gradually become a part of many ongoing efforts to downsize our analytical tools. The general benefits of miniaturized columns include increased mass sensitivity, minimal sample dilution prior to measurement, and a reduced surface exposure of the analyzed glycoproteins. These miniaturization efforts have also resulted in a better integration of different analytical steps, reduced sample scales and better automation possibilities in both glycomics and glycoproteomics. Certain advantages of these operations will be featured below in some detail. Overall, in different incremental steps, significant sensitivity gains have been achieved during the last decade, amounting to several orders of magnitude.

The field of analytical glycobiology is somewhat arbitrarily divided into “glycomics”, emphasizing the functional centrality of the glycomes in life processes and thus measuring oligosaccharide profiles as its primary task, and “glycoproteomics”, dealing with the overall glycoprotein structure at all structural levels. When dealing with unfractionated biological materials, the vast complexity and abundance of the structural and quantitative data in both glycomics and glycoproteomics invite the advanced use of information technology in automated identification of proteins and glycans, mass screening, and library comparisons. The field of bioinformatics has evolved rapidly in these areas and united a number of laboratories at the international level [27–32].

While different analytical technologies continue to evolve at their own pace, the most valuable incentive for their development is provided by some of the most topical areas of scientific endeavor such as the search for disease biomarkers (for the benefits of both biomedicine and pharmaceutical development) and investigations into the molecular basis of host-pathogen interactions.

Recent analytical directions and strategies

There has been a dramatic increase in the laboratories working on the subject of glycoproteins, as evidenced by the large and growing number of published studies in both biomedical and analytical journals. Although such studies employ a variety of methods and approaches, developments in glycomics and glycoproteomics lag behind those in genomics and “mainstream proteomics.” This situation arises from the complexity of mammalian glycomes that are estimated to contain many thousands of oligosaccharides generated through a non-template biosynthetic process.

Glycoproteomics

It is now commonly perceived that the analysis of major, isolated glycoproteins or recombinant products (confirmatory analysis) has become a nearly routine task. The current methodological attention has been more directed toward the analysis of complex mixtures and biological samples derived from body fluids, tissues, cell cultures, etc. Due to the large differences in the occurrence and concentration of glycoproteins in such biological specimens, methodological difficulties are often encountered. There is a need to develop more generic platforms to account for glycoproteins that are variously distributed in different cells and subcellular entities, as the cellular compartmentalization is certainly important for the biosynthesis, structural modification and the function of glycans. While numerous studies have been conducted toward optimization of the extraction schemes to cover both the membrane-bound and soluble proteins from tissue materials, many procedures remain somewhat application-specific. In contrast, the sample treatment procedures for the common physiological fluids, such as human blood serum or plasma, tend to be more unified at present. However, the enormous concentration range in which the glycoproteins of interest may be encountered in such fluids still presents a formidable challenge, inviting sample preconcentration schemes based on affinity chromatography.

A highly effective MS analysis of glycopeptides after enzymatic degradation is at the heart of all glycoproteomic strategies, whether they use initially the “top-down” approach, separating proteins first, or the “bottom-up” approach, digesting the protein mixtures directly. Sample treatment, separation procedures and data processing generally parallel those of the advanced (high-sensitivity) general proteomics, reviewed amply elsewhere [33–35]. The added difficulties include: (a) separation of glycoproteins from non-glycosylated proteins prior to digestion, and (b) difficulties of adequately measuring, or even detecting, the protease-digested glycopeptides in the presence of other peptides through the tandem MS (MS/MS) methodologies. Glycopeptides easily suffer from suppressed ionization in the presence of other components.

Multimethodological strategies for glycoproteomic profiling

At the level of analyzing complex protein mixtures, the use of at least two-dimensional separation technologies appears mandatory, albeit at some expense in both reproducibility and time of analysis. As is evident from the number of recent publications, methods and protocols, and reviews, various combinations of LC, CE and MS techniques are applicable in the search for the best protein and peptide mapping strategies [36–38]. Additionally, traditional 2-D gel electrophoresis and other modified gel-based methods continue to be utilized and further developed [19, 39–41]. Reversed-phase capillary LC has become a well-established procedure in peptide mapping when using either UV absorbance detection or LC-MS with low-microgram protein digests for simple peptide mixtures or digests with a moderate degree of complexity. However, peptide mixtures of greater complexity necessitate columns with high efficiencies, optimized separation conditions, and alternatively, the use of multidimensional chromatographic techniques [42, 43]. Generally, microgram quantities of the protein mixtures can now be routinely handled and sequenced through the use of capillary LC-MS/MS-based procedures, while the use of even smaller-diameter columns, such as 10-μm i.d. porous layer open tubular (PLOT) capillaries, can further enhance mass sensitivities [44]. However, sometimes even under the best analytical conditions, it may be difficult to measure many glycoproteins in enzymatic digests due to the mentioned ionization suppression phenomena. Perhaps a better solution to the problem resides with the use of enrichment methodologies such as the use of lectins (the application of which is discussed in greater detail in the next section) or immunoactive media prior to LC-MS/MS runs. These enrichment steps largely avoid competitive ionization through a targeted mixture simplification.

The fractionation of complex protein mixtures can be accomplished on the basis of bioaffinity and the type of PTM (posttranslational modification) or, to a lesser degree, other chromatographic and electrophoretic principles. However, such fractionation is desirable only insofar as it does not compromise the reproducibility and quantification in comparative analyses.

The possibility of lectin immobilization on solid surfaces (particles, membranes, separatory channels, etc.) allows the utilization of these unique proteins in various glycoproteomic schemes, in which the trace glycoproteins can be effectively enriched and further processed through a common proteomic platform based on LC-MS/MS. One such analytical platform, which was developed [45] and further refined for label-free quantitative glycoproteomic profiling [46–48] of blood serum samples in our laboratory, is shown in Fig. 1. In this application, the experimental workflow fractionates the serum sample (usually 20–30 μL) through interaction with two types of bioaffinity media: first, the immunoaffinity capture of major serum proteins depletes highly abundant proteins from the sample to facilitate further proteomic investigations of trace glycoproteins; second, a lectin microcolumn is employed to further preconcentrate the glycoproteins of interest. Using an appropriate competitive hapten buffer, the glycoprotein fraction is desorbed for an additional fractionation through reversed-phase LC into about 30 fractions, which are, in turn, trypsinized and subjected to the usual LC-MS/MS tryptic mapping procedure, database search and evaluation. Following a statistical treatment of the generated data, such glycoproteomic surveys can identify groups or individual glycoproteins that are capable of discriminating between physiologically distinct states of health [48–52]. Furthermore, glycoproteins that are identified as interesting biomarker candidates can be targeted for specific characterization in subsequent studies [53–56].

Fig. 1

Multimethodological glycoproteomic workflow. A complex biological sample is first depleted of highly abundant proteins, followed by lectin preconcentration of the glycoproteins. Next, the enriched glycoproteins are further fractionated by reversed-phase liquid chromatography (RPLC). Fractions are enzymatically digested, followed by bottom-up LC-MS/MS for protein identification/quantification. (From Reference [45])

Immunoaffinity-based chromatographic purification may offer a clear path to targeted MS profiling of glycans and glycopeptides from putative biomarkers in body fluids and tissues. This approach is currently somewhat limited by the availability of highly specific antibodies at a reasonable price and the lack of optimal solid supports for construction of miniaturized preconcentrators. As the range of available antibodies gradually expands, protein-targeted measurements may in time become more common. As a preliminary example of this approach, we demonstrate here the immunoisolation of α-1-acid glycoprotein (Fig. 2) from a small volume of human blood serum, followed by deglycosylation and a glycomic profile measurement (unpublished results). Another example in the recent literature includes isolation of haptoglobin [57], a serum protein with an increased abundance of fucosylation that has been implicated in pancreatic cancer [57–59].

Fig. 2

MALDI-TOF-MS of permethylated N-glycans enzymatically released from α-1-acid glycoprotein (AGP) that was purified from a 5-μL aliquot of human blood serum utilizing a murine monoclonal antibody (mAb). The serum had been previously depleted of seven highly abundant proteins. The mAb was incubated for 1 h with the depleted serum, and the mAb-AGP complex was then extracted from the mixture with anti-mouse IgG coupled to agarose beads. The beads were thoroughly washed to remove the unbound serum proteins, and the purified mAb-AGP was subsequently eluted from the anti-mouse beads with acetic acid, pH 2.6, prior to enzymatic release of N-glycans with PNGase F. We depict here different multiantennary glycans as cartoon structures, using the symbols established by the Consortium for Functional Glycomics (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml)

Lectin affinity chromatography

Biospecific interactions of lectins with certain glycoproteins have long been documented in the histochemical, cytochemical, and biochemical literature, but much less applied in quantitative analytical procedures until recently. The choice of a particular lectin in a preconcentrator medium can be critical to the success of the entire glycoproteomic profiling. It has long been known that some lectins exhibit a degree of selectivity toward certain oligosaccharide structures (differently linked sialic acids, fucosylated glycans, mannose-rich structures, etc.) in contrast to the more “generic” concanavalin A (Con A). In a biomedically important application to screening glycoproteins in microliter volumes of blood sera, it has become feasible to observe, and potentially quantify, several hundred constituents in lectin-enriched fractions [45, 49, 50, 60–62]. The number of identified/measured sample components can further be enhanced through the use of new high-resolution mass spectrometers such as LTQ-Fourier transform ion-cyclotron resonance (FT-ICR), Orbitrap, or Q-TOF instruments. In our studies of serum minor proteins [45], four common lectins were evaluated in terms of molecular size and pI ranges of the retained components. The results indicated that such lectins have both overlapping and selective properties. In order to generate “representative” profiles of a particular sample type, our laboratory [45] and others use different lectins in either a sequential-lectin arrangement or a multi-lectin (mixed-bed) arrangement [63], with some differences in performance of both techniques noted [60]. The use of silica-based lectin microcolumns has been beneficial, as it permits incorporation into a valve-based high-pressure analytical system [61].

Following the development of microscale lectin affinity techniques for enrichment of glycoproteins in biological materials, this approach has been the basis for a multitude of glycoproteomic investigations that aim to characterize the sub-glycoproteomes of a variety of biological materials derived from humans, including urine [64, 65], saliva [66], organ tissues [67], and, most frequently, blood serum [45, 48, 67–71].

A less usual example of targeted glycoproteomic analysis through lectin enrichment can be found in our recent investigation of pancreatic cyst fluids [72]. The samples, which were collected intraoperatively by fine-needle aspiration of the cystic lesions to avoid peripheral contamination, were highly variable, with inconsistent coloring and viscosity, in addition to variable protein composition and total content. After a combination of filtration and buffer-exchange steps was applied, relatively clear fluids were obtained for glycomic and glycoproteomic profiling. MS-based glycomic analysis showed many glycans that were also observed in serum profiles, but, in a few of the fluids that were associated with a higher risk of malignant transformation, we identified a number of hyperfucosylated glycans (unusual structures) containing 2–6 fucose residues on a single structure (Fig. 3). Such glycans were not seen in the whole serum profiles. This study represents a less usual approach, i.e. glycomic profiling first, as a guide for subsequent glycoproteomic studies. By contrast, most investigators conduct proteomic studies first, targeting glycosylation later.

Fig. 3

Glycomic and glycoproteomic analysis of so-called “hyperfucosylated” glycoproteins in pancreatic cyst fluids. MALDI-TOF-MS of permethylated N-glycans a from m/z 1500–3250 and b from m/z 3250–5000. c A label-free quantitative comparison of the glycoproteins overexpressed in the hyperfucosylated fluids (G1) to the non-hyperfucosylated fluids (G2), based on whole-fluid proteomics (“No enrichment”) and lectin-enriched glycoproteomics with Aleuria aurantia lectin (AAL) (“AAL-enriched”), revealed the glycoproteins significantly overexpressed and, most importantly, those that were both overexpressed and highly enriched by AAL. (From Reference [72])

Following an untargeted proteomic analysis to provide baseline information, a glycoproteomic profiling workflow was modified to include Aleuria aurantia lectin (AAL) for the identification of the glycoproteins that were hyperfucosylated. A label-free comparison of the non-enriched and AAL-enriched proteomic profiles, facilitated by ProteinQuant [47], identified several glycoproteins that were overexpressed. These included pancreatic α-amylase, triacylglycerol lipase, and elastase-3A as the proteins in high abundance following AAL enrichment (Fig. 3c). This study illustrates the advantages of performing glycomic and glycoproteomic investigations in the same laboratory.

With a better understanding of how the lectin preconcentrators work as critical components of the overall analytical schemes, further advances in glycoproteomic profiling can hopefully be realized. For comparative studies, as needed in virtually all topical applications of medical glycobiology, it is essential to secure adequate quantitative reliability in every step of a glycoproteomic workflow. It is thus desirable to utilize small-scale formats for the lectin enrichment step to ensure a quantitative recovery of the enriched sample components. Due to the relatively weak interactions between most lectins and their target carbohydrate moieties (approximate Kd range: 10⁻⁴ to 10⁻⁷ M), the best enrichment support materials provide a very high accessible surface area, while also exhibiting a fast rate of mass transfer. Furthermore, a high lectin density can greatly improve the realized strength of the interaction with target glycoproteins through simultaneous interactions with multiple sites of glycosylation (multivalency) [73]. As such, monolithic columns are expected to be suitable for this type of work, but the current rapid development of various new materials may lead to the discovery of supports that offer their own unique advantages. As an example of these efforts, a novel particulate silica material (1.6 μm diameter) containing an extensive, sponge-like network of macropores has been utilized in our lab to reproducibly enrich important glycoproteins from a single microliter of whole blood serum or an equivalent amount of albumin- and IgG-depleted serum using Con A and AAL [74].
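The link between a weak dissociation constant and the need for a high density of accessible lectin can be illustrated with a simple equilibrium estimate. The sketch below assumes an idealized 1:1 binding model with the immobilized lectin in large excess over the glycoprotein, so that the bound fraction is [lectin] / ([lectin] + Kd). The function name and the example concentrations are illustrative assumptions and are not taken from the cited studies; multivalency and mass-transfer effects are ignored.

```python
def fraction_bound(lectin_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of glycoprotein bound to lectin.

    Assumes idealized 1:1 binding with lectin in large excess
    (an illustrative simplification; multivalency is ignored).
    """
    if lectin_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return lectin_conc_molar / (lectin_conc_molar + kd_molar)


if __name__ == "__main__":
    # Kd values spanning the approximate range quoted in the text.
    for kd in (1e-4, 1e-7):
        # Hypothetical effective lectin concentrations in the support.
        for conc in (1e-5, 1e-3):
            print(f"Kd={kd:.0e} M, lectin={conc:.0e} M -> "
                  f"bound fraction={fraction_bound(conc, kd):.3f}")
```

Under these assumptions, a lectin with a Kd near 10⁻⁴ M retains only a small fraction of its target unless the effective lectin concentration in the support is well above the Kd, which is consistent with the emphasis on high surface area and high lectin density.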

It is critical to ensure that the lectin preconcentration step does not become a bottleneck in the overall quantification procedure. While our recent data [46] with one lectin indicate that adequate analytical reproducibility could be achieved in label-free quantitative proteomics, rigorous standardization steps must be followed for all lectin-based procedures.

Profiling human immunoglobulins at high sensitivity

While many useful and now widely appreciated MS and computational methodologies in the field of proteomics were initially driven by the “global” or “total” approach to the complexity of protein and peptide mixtures, today’s investigators increasingly appreciate the value of more selective and targeted approaches to the complexity problem. One family of glycoproteins that is inherent to our bodily defense system and thus very interesting to characterize (in terms of glycosylation) is the immunoglobulins (Igs). As has already been shown, the function of IgG can be reversed from pro-inflammatory to anti-inflammatory by addition of terminal N-acetylneuraminic acids [75]. Moreover, profiling of IgG glycans in a large-scale study has recently demonstrated that decreased galactosylation correlates directly with increasing age [76]. Through affinity chromatography in different formats (e.g., magnetic particles, agarose beads, monolithic phases, and silica particles), it is straightforward to extract the major isotype, IgG, using either Protein A or Protein G in their immobilized forms. However, it has remained challenging to enrich the less-abundant isotypes from complex mixtures such as blood serum.

A novel approach has recently been developed in our laboratory to address this need: a serial affinity chromatography strategy is employed to first capture IgG upstream of a Protein L affinity column that, through its unique binding action, may capture all Igs bearing kappa light chains (subtypes I, III, and IV) [77], which include all five classes of human Igs. Starting with 3 μL of blood serum per experiment, an initial glycomic profile of the less-abundant Igs in serum has been obtained using this approach [78], which is currently being applied to a larger comparative study of human cancers.

General glycoprotein/glycopeptide fractionation strategies

It is sometimes preferable to perform an indiscriminate preconcentration of all glycoconjugates in a mixture. For an initial glycoproteomic survey of an uncommon biological material, it can be valuable to measure the profile of the whole glycoproteome in order to guide subsequent investigations of interesting subglycoproteomes. Alternatively, the glycopeptides from a prefractionated/purified glycoprotein proteolytic digest may be captured (and thus isolated from non-glycopeptides) to greatly enhance their ionization in MS. Several strategies have been developed for general enrichment of glycoconjugates.

Hydrazide capture

A popular approach for the isolation of glycoconjugates is to use hydrazide-coated beads, as described by Aebersold and coworkers [79]. In general terms, the glycoprotein/glycopeptide capture results from a covalent bond formation between hydrazide groups on the surface of a support medium and aldehydes present on the carbohydrates, introduced through the oxidation of vicinal diols. The approach has been applied to glycoproteomic analysis of many complex materials, including saliva [80], plasma [81, 82], blood platelets [83], liver tissue [84], and T and B cells [85]. While this covalent capture strategy represents a very effective approach for select applications, it is unsuitable for direct measurement of glycan moieties, which cannot be readily recovered from the support material.

Boronic acid enrichment

Boronic acid-functionalized materials are emerging as attractive options for glycocapture, as a result of their unique ability to form reversible, covalent bonds with monosaccharides that exhibit vicinal diols [86–88]. Microscale variations of this approach have been demonstrated for the enrichment of glycopeptides from standard glycoproteins [89–91], though they have only rarely been applied to glycoproteomic studies of biologically interesting samples [92]. Because of their unique, universal “lectin-like” properties, boronic acids (sometimes even referred to as boronolectins) have also demonstrated potential for the enrichment of nonenzymatically glycated proteins [93] and peptides [94].

Metabolic labeling of glycans

A major limitation for chemical affinity enrichment is that carbohydrates contain few biologically unique functional groups that are not observed in other classes of biomolecules such as proteins or nucleic acids. Thus, chemical enrichment strategies for glycans may suffer from either weak specificity or, as is the case with the hydrazide capture, they may prove too harsh for the necessary glycan characterization in some applications. However, an interesting alternative is to add unique functional groups to glycoconjugates and thus provide additional possibilities for enrichment strategies. Utilizing the specific biosynthetic pathway of an organism, Laughlin and Bertozzi have demonstrated that it is feasible to incorporate azide-modified monosaccharides into glycoconjugates in vivo or ex vivo [95]. Through a further modification of the azide groups with a FLAG peptide [96], the metabolically labeled glycoconjugates can be immunoprecipitated from the sample mixture. This strategy can be applied for the general enrichment of glycoconjugates by incorporating azido analogs of conserved monosaccharides such as N-acetylglucosamine in mammals, or for enrichment of different classes of glycans with more specific residues such as azidogalactose or azidofucose.

Hydrophilic interaction chromatography

Hydrophilic interaction chromatography (HILIC) exploits polar interactions between analytes and a polar stationary phase, thus providing a more subtle option for separation of glycoconjugates than the previously described binary capture strategies. Numerous stationary phases have been employed for HILIC applications, including silanols, diols, amines, amides and various cationic, anionic, and zwitterionic functional groups. Among these alternatives, amines and amides have been used most frequently for the separation of glycosylated analytes. A recent report by Gilar et al. [97] demonstrates an efficient separation of a bovine fetuin digest, in which non-glycopeptides were eluted first, followed by (smaller) O-linked glycopeptides, and ending with the elution of N-linked glycopeptides. HILIC is one of the few high-resolution chromatography techniques that separates glycoconjugates through significant interaction with their carbohydrate moieties rather than other molecular attributes. In this regard, it has been used regularly for the analysis of fluorescently labeled glycans, as discussed further in the forthcoming text. Beyond the brief example discussed herein, HILIC-based separation of glycoconjugates has been utilized in a number of applications, and a more specialized discussion of its merits can be found in two recent reviews, dedicated solely to this topic [98, 99].

Glycopeptide analysis

Besides merely implicating certain glycoproteins as biologically or medically important, it is also essential to learn about the structural details of their glycosylation, such as the sites of glycosylation on the polypeptide backbone, possible structural variations (micro-heterogeneity) of glycoforms at a particular site, or the accessibility for biochemical interactions in a three-dimensional environment. With the ubiquity of N-glycosylated and (particularly) the less explored O-glycosylated sites in many mammalian glycoproteins, this appears a “tall order” even for the best analytical tools and expertise at hand. Yet, this type of information appears crucial from the biomedical and pharmaceutical viewpoints. Mining the complex glycoproteomes of humans, animal models, and infectious agents at high measurement sensitivity thus appears essential for many emerging fields. Different strategies for accomplishing these tasks include various enzymatic treatments, site-labeling procedures, and covalent attachment to surfaces through the carbohydrate moieties, among others. Owing to its high sensitivity and ability to provide mass-based structural characterization, MS remains the key mode of glycopeptide detection. However, ionization is a requisite preliminary step for MS detection of an analyte, and the high complexity of enzymatic protein digests derived from biological materials creates a competitive environment for ionization. This is often a major hindrance for detection of glycopeptides that ionize poorly relative to nonglycopeptides. Competitive ion suppression calls for the fractionation of digest mixtures through chromatographic media first. Consequently, hydrophilic packings [97, 100], graphitized carbon [101], or lectins [49, 50, 60–62] used prior to MS can substantially improve the site-of-glycosylation measurements.

For some applications, determining the site of glycosylation may be more important than performing a complete structural characterization of the oligosaccharide. It is possible to use either chemical or enzymatic means to label the site of glycosylation, though the glycan itself may be lost in the process. An example is the cleavage of O-linked N-acetylglucosamine (O-GlcNAc) with β-elimination, followed by the Michael addition (BEMAD) of dithiothreitol [102]. Additionally, there are reports of MS-based fragmentation strategies that appear to selectively fragment glycans after the first core GlcNAc of an N-glycan, such as higher-energy C-trap dissociation (HCD), which can be performed with the Orbitrap MS instrument. Since this method can reportedly yield peptide fragment ions with the core GlcNAc residue still attached to the peptide [103], the retained GlcNAc can be included as a variable modification, so that the glycopeptide, including the site of glycosylation, is identified through the typical database-searching strategies used for bottom-up proteomic measurements.
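The variable-modification idea can be sketched in a few lines. The example below is a minimal illustration, not the search-engine logic used in the cited work: it locates candidate N-glycosylation sequons (N-X-S/T, where X is not proline) in a peptide and lists the monoisotopic masses expected with and without a single retained HexNAc residue (203.0794 Da) at each site. The function names and the test peptide are hypothetical; the residue masses are standard monoisotopic values.

```python
import re
from itertools import combinations

# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565    # added once per peptide for the free termini
HEXNAC = 203.079373  # residue mass of one HexNAc (e.g. core GlcNAc)


def peptide_mass(seq: str) -> float:
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def sequon_sites(seq: str) -> list:
    """0-based positions of Asn in N-X-S/T sequons (X != Pro)."""
    # Lookahead so that overlapping sequons are all reported.
    return [m.start() for m in re.finditer(r"(?=N[^P][ST])", seq)]


def candidate_masses(seq: str) -> list:
    """(modified sites, mass) pairs with HexNAc as a variable modification."""
    base = peptide_mass(seq)
    sites = sequon_sites(seq)
    out = [((), base)]
    for k in range(1, len(sites) + 1):
        for combo in combinations(sites, k):
            out.append((combo, base + k * HEXNAC))
    return out


if __name__ == "__main__":
    for sites, mass in candidate_masses("ANVTKNPSNGS"):  # hypothetical peptide
        print(sites, round(mass, 4))
```

A real database search would also score the fragment ions that carry the modification; this sketch only enumerates the precursor-level candidates.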

The role of collision-induced dissociation in glycopeptide analysis

An important aspect of glycopeptide analysis is a comprehensive elucidation of its entire structure, including not only the amino-acid sequence of the peptide backbone, but also a detailed characterization of its carbohydrates and the site-of-glycosylation, since not all potential glycosylation sites are necessarily occupied. Through an effective coupling of nanoflow-LC to tandem MS, hundreds of glycopeptides can quickly be characterized, with collision-induced dissociation (CID) being the most widely applied fragmentation technique for the task. In this technique, analytes are bombarded with an inert buffer gas, increasing their internal vibrational energy. Once enough energy is deposited into the molecule, fragmentation occurs. However, the energy barrier of dissociation for the carbohydrate part of a glycopeptide is lower than that for the peptide backbone, resulting in a spectrum that is dominated by the ions originating from glycosidic bond cleavages, while the diagnostic peptide fragments are rarely observed. Thus, CID can be a useful technique to characterize an oligosaccharide while it is attached to a peptide. At the same time, the lack of fragmentation within the peptide backbone necessitates additional MS information, such as that provided by complementary fragmentation strategies (see the next section) or sub-ppm mass accuracy of the precursor m/z of the glycopeptide.

Glycopeptide identification with CID and high mass accuracy

An example of the application of CID in combination with high-resolution MS data is included here to illustrate how CID can facilitate glycopeptide discovery in complex mixtures. In our recent study [19] of membrane-bound glycoproteins from the select agent, Francisella tularensis subsp. holarctica, the characterization of two glycopeptides from a novel virulence factor was achieved using high resolution LC-FT-ICR MS with in-source CID, frequently referred to as source-induced dissociation (SID). Following the enzymatic digestion, bottom-up proteomics was performed on the glycoprotein, FTH_0069, that had been isolated by 2-D gel electrophoresis. SID was used to monitor characteristic glycan oxonium ions, and, with a priori knowledge of the amino acid sequence and glycan structure, a list of theoretical glycopeptide accurate masses was compared to the observed glycopeptides to determine their identities, which was achieved exclusively through sub-ppm mass accuracy.
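The mass-matching step described above can be sketched in a few lines of Python. This is only an illustration of comparing an observed precursor m/z against a list of theoretical glycopeptide values within a ppm tolerance; the candidate names and all m/z values below are hypothetical placeholders, not data from the study in [19], and the function names are our own.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(observed_mz, candidates, tol_ppm=1.0):
    """Return (name, ppm error) for every candidate within tol_ppm,
    sorted by the absolute size of the error."""
    hits = []
    for name, theoretical_mz in candidates.items():
        err = ppm_error(observed_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Hypothetical theoretical m/z list (peptide + glycan combinations).
candidates = {
    "peptide_A+glycan_1": 1250.5632,
    "peptide_A+glycan_2": 1250.5750,
    "peptide_B+glycan_1": 1331.6011,
}
observed = 1250.5640  # hypothetical observed precursor m/z

hits = match_candidates(observed, candidates, tol_ppm=1.0)
for name, err in hits:
    print(f"{name}: {err:+.2f} ppm")
```

With a 1 ppm window, the second candidate (about 9 ppm away) is rejected, which illustrates why sub-ppm accuracy can separate compositions that a lower-accuracy instrument would leave ambiguous.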

It is clear that high-resolution MS data are vital to this type of analysis, but the considerable preparative expertise that was necessary to isolate the target glycoproteins from bacterial cell lysates through density-based fractionation, liquid extraction of membrane-bound glycoproteins, and 2D-gel electrophoresis prior to the use of bottom-up proteomics was a prerequisite to the achieved ionization of the glycopeptides [19]. This work exemplifies the potential of multidimensional sample fractionation and separation techniques in combination with MS detection as a means to achieve a clearer understanding of individual, biologically interesting glycoproteins, emphasizing that the current analytical glycobiology often remains a multimethodological task, as was the case a decade ago [104].

Electron-transfer dissociation of glycopeptides

Alternative approaches to molecular fragmentation are the so-called electron-based methods of electron-transfer dissociation (ETD) [105, 106] and electron-capture dissociation (ECD) [107, 108]. While the exact mechanisms of cleavage may still be debatable, these methods are effective in cleaving the N–Cα bond of the peptide backbone. This approach to fragmentation does not increase a peptide’s vibrational energy, and it does not fragment the glycan portion of a glycopeptide. The method is therefore often deemed appropriate for determining the site-of-modification and has been successfully applied to N-linked glycopeptides [109, 110].

Using modern instrumentation that is capable of quickly alternating between the two fragmentation methods (on-the-fly) during a single LC-MS analysis, our laboratory demonstrated the effectiveness of combining the complementary data acquired from CID and ETD experiments to comprehensively characterize glycopeptides derived from a number of glycoprotein standards [111]. An example of this approach is shown in Fig. 4a–b. The CID spectrum acquired from a glycopeptide originating from haptoglobin (Fig. 4a) highlights the extensive fragmentation of the carbohydrate and its structural elucidation. Figure 4b presents the ETD spectrum of the same glycopeptide and shows significant fragmentation of the peptide backbone. In this spectrum, no fragments due to glycan bond cleavage were observed, while only a series of c- and z-type ions were present. Based on the mass difference between the c5 and c6 ions, which corresponds to the addition of an asparagine residue bearing the attached glycan, the site-of-modification was determined. The remaining fragments then indicated the amino-acid sequence of the peptide.
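The c-ion reasoning above can be illustrated with a short Python sketch. It builds a theoretical, singly protonated c-ion series for the haptoglobin peptide VVLHPNYSQVDIGLIK and then walks the series to find the step whose mass difference exceeds the residue mass. The glycan composition (Hex5HexNAc4NeuAc2, no fucose) is our assumption based on the "biantennary, disialylated" description, the function names are hypothetical, and the sketch assumes a complete c-ion series, which real ETD spectra rarely provide (for instance, the c ion N-terminal to proline is normally absent).

```python
# Monoisotopic residue masses (Da) for the amino acids in this peptide.
RESIDUE = {
    "V": 99.06841, "L": 113.08406, "H": 137.05891, "P": 97.05276,
    "N": 114.04293, "Y": 163.06333, "S": 87.03203, "Q": 128.05858,
    "D": 115.02694, "I": 113.08406, "G": 57.02146, "K": 128.09496,
}
NH3 = 17.02655      # c ions carry an extra NH3 relative to b ions
PROTON = 1.00728

# Assumed native glycan: Hex5 HexNAc4 NeuAc2 (residue masses summed).
GLYCAN = 5 * 162.05282 + 4 * 203.07937 + 2 * 291.09542

PEPTIDE = "VVLHPNYSQVDIGLIK"   # residues 236-251 of haptoglobin
FIRST_RESIDUE = 236


def c_ions(peptide, mods=None):
    """Singly protonated c1..c(n-1) masses; mods maps 0-based index -> added mass."""
    mods = mods or {}
    total = NH3 + PROTON
    series = []
    for i, aa in enumerate(peptide[:-1]):
        total += RESIDUE[aa] + mods.get(i, 0.0)
        series.append(total)
    return series


def localize(peptide, observed_c, tol=0.01):
    """Return (1-based position, residue, excess mass) of the first modified step."""
    previous = NH3 + PROTON
    for i, mass in enumerate(observed_c):
        excess = mass - previous - RESIDUE[peptide[i]]
        if abs(excess) > tol:
            return i + 1, peptide[i], excess
        previous = mass
    return None


# Simulate the series with the glycan on the sixth residue, then recover the site.
observed = c_ions(PEPTIDE, mods={5: GLYCAN})
position, residue, excess = localize(PEPTIDE, observed)
print(f"c{position - 1} -> c{position}: {residue}{FIRST_RESIDUE + position - 1}, "
      f"extra mass {excess:.3f} Da")
```

In this simulated case the excess mass appears between c5 and c6, which maps to Asn241 in the protein numbering.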

Fig. 4

Combined tandem MS characterization of the VVLHPNYSQVDIGLIK glycopeptide (residues 236–251) from human haptoglobin. (a) Collision-induced dissociation (CID) preferentially fragments the biantennary, disialylated complex glycan, whereas (b) electron-transfer dissociation (ETD) provides complementary information by fragmenting the glycopeptide backbone, which provides the peptide sequence and elucidates the site-specific attachment of the glycan at Asn241. (From Reference [111])

One drawback to the electron-based fragmentation methods is an apparent m/z limitation. We were able to successfully and efficiently fragment a number of glycopeptides with m/z values of up to about 1,000. Above this value, nondissociative electron transfer seemed to dominate; this phenomenon was also observed by others [112, 113]. Since many glycopeptides result in multiply-charged ions with m/z values above this threshold, the basic approach to this fragmentation method needs to be modified. One approach that we investigated was “supercharging” [114] the electrospray ionization (ESI) process by adding small amounts of m-nitrobenzyl alcohol to the mobile-phase buffers. In these preliminary experiments, we observed increases in the charge states of several glycopeptides and improved ETD fragmentation.

Since it appears that noncovalent interactions hinder the dissociation of the generated fragments [112], the ETD methods have been adapted to include a gentle CID-type of activation [113, 115], increasing the number of diagnostic fragment ions for such peptides. In one example from the Karger laboratory [115], a glycopeptide generated by the Lys-C digestion of EGFR was detected as a +5 ion with an m/z value of 1142.73. The ETD fragmentation of this large glycopeptide allowed only 9 of the 36 amino acids to be determined. However, upon activation of a charge-reduced species generated during the ETD process, 20 amino acids were determined and the correct peptide sequence was identified through database searching.
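The relationship between charge state and m/z that underlies both the supercharging strategy and the EGFR example can be worked through numerically. The following Python sketch uses the reported +5 ion at m/z 1142.73 to back-calculate the neutral mass and then asks which charge state would bring the ion below an m/z of about 1,000; that limit is taken from the approximate value quoted above and is not a sharp cutoff, and the function names are our own.

```python
PROTON = 1.007276  # mass of a proton (Da)


def neutral_mass(mz, z):
    """Neutral monoisotopic mass from the m/z of an [M + zH]z+ ion."""
    return z * (mz - PROTON)


def mz_at(mass, z):
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (mass + z * PROTON) / z


def min_charge_below(mass, limit=1000.0):
    """Smallest charge state whose m/z falls below the given limit."""
    z = 1
    while mz_at(mass, z) >= limit:
        z += 1
    return z


# EGFR Lys-C glycopeptide reported as a +5 ion at m/z 1142.73.
mass = neutral_mass(1142.73, 5)
z_needed = min_charge_below(mass, limit=1000.0)

print(f"neutral mass ~ {mass:.2f} Da")
for z in range(4, 8):
    print(f"z = {z}: m/z = {mz_at(mass, z):.2f}")
print(f"lowest charge state below m/z 1000: +{z_needed}")
```

For this glycopeptide, a single additional charge would move the precursor below the approximate threshold, which is the kind of shift that supercharging reagents are intended to produce.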

Without modifications to the current instrumentation, ETD may be most effective for glycopeptides featuring smaller carbohydrates, such as N-linked structures attached to bacterial glycopeptides [116], or O-linked oligosaccharides with smaller chains. Indeed, several recent examples have shown the utility of this method for the characterization of this important class of glycopeptides. Unlike N-linked glycopeptides, the O-linked structures do not readily yield a consensus sequence to indicate the site-of-modification. For these determinations, ETD has proven to be a valuable tool, as a recent publication demonstrates for several highly-charged mucin-originated O-glycopeptides [117]. These types of proteins present unique analytical challenges, as they are typically heavily glycosylated with a high degree of site occupancy. Interestingly, this work showed a high degree of sequence coverage for the glycopeptides modified with neutral glycans, while those with sialylated structures tended to produce fewer fragments. Similarly, these electron-based approaches have been applied to the hinge-region O-glycopeptides of IgA1 to determine the sites-of-modification and to find which sites were favored for alterations in a galactose-deficient IgA1 protein [118]. Another investigation utilizing ETD revealed that a significant decrease in the levels of N-acetylglucosamine attached to IgA1 O-glycopeptides was observed in patients diagnosed with rheumatoid arthritis [119], a change most commonly associated with IgG. Additionally, a combination of CID and ETD has been used to determine the glycans attached to three sites of O-glycosylation of an amyloid precursor protein secreted by CHO cells [120].

Glycomics

Glycomics is a less explicitly recognized member of the group of the “omics” methodologies. Its increasing popularity has been primarily based on the perception of numerous investigators that the glycan components are often the crucial functional determinants of biological events. Pending further methodological developments and improvements in instrumentation, glycomics is rapidly positioning itself to become a very fertile field in addressing some key questions of modern biology and medicine.

While direct glycomic measurements would seem at first glance to be somewhat limited, in that the information about the integrated function of glycoproteins is lost through the deglycosylation step, such measurements have a certain practical appeal in that: (a) oligosaccharides are often the crucial functional elements in cellular and biomolecular interactions; (b) glycomic profiling techniques are inherently faster and methodologically easier to multiplex than the currently available proteomic approaches; and (c) the dynamic concentration ranges for glycan measurements appear to be much narrower than those for proteins in biological fluids and tissues. However, we do not yet know the limits of physiologically meaningful glycan concentrations, or how to measure glycans at very low trace levels.

Glycan release procedures

The key approach of glycomics involves the deglycosylation of isolated glycoproteins or entire complex glycoprotein mixtures extracted from biological materials to yield a representative array of oligosaccharides (glycans), which is subsequently displayed as a “glycomic profile” or “glycomic map” through a suitable bioanalytical technique. A quantitative and reproducible release of oligosaccharides from glycoproteins has always been a significant and difficult issue in glycobiology. It has gained an even greater importance in the high-sensitivity requirements of today’s glycomic profiling. The earlier used chemical release approaches, such as hydrazinolysis or the classical β-elimination in an alkaline medium, have now mostly been replaced by the more gentle enzymatic deglycosylation (use of N-glycanases) for asparagine-linked glycans [121, 122] or microscale chemical release procedures [123–125] for threonine/serine-linked oligosaccharides. It is now generally agreed that N-glycans are “easier” to analyze than O-glycans, largely due to the availability of peptide-N-glycosidase F (PNGase F) and other glycanases, which reliably cleave a broad range of substrates, regardless of their glycan substitution, with only a few exceptions noted. Alternatively, there are additional approaches to glycan release that may need further optimization.

The release of N-linked glycans has traditionally been performed during an extended period of time, oftentimes requiring up to 24 h to obtain the highest possible digestion efficiency. However, for future large-scale studies of hundreds, if not thousands, of samples, the throughput of the release procedure needs to be increased. One interesting approach to reduce the incubation time involves the use of ultra-high pressure cycling, which utilizes pressures of up to 30 kpsi [126]. Under these conditions, the activity of PNGase F appears to be unaffected, while many glycoproteins become sufficiently denatured for an efficient deglycosylation in as little as 20 min [126]. Similarly to proteomics, a microwave-assisted method to enzymatically cleave glycans from their proteins has also been reported, with a complete removal of the carbohydrates from monoclonal antibody therapeutics being achieved in as little as 10 min, and up to 1 h, for other glycoprotein standards [127, 128].

As opposed to the N-linked structures, a universal enzyme is not available for the comprehensive removal of all O-linked oligosaccharides. Recently, we have explored the use of Pronase, a mixture of several proteases from Streptomyces griseus, as a way to digest glycoproteins to single amino acid residues. An efficient Pronase digestion leaves the O-linked carbohydrates attached only to their serine or threonine residues, from which they are then removed through a β-elimination process during their subsequent permethylation [129]. When compared directly to the samples prepared through other O-glycan release methods, the matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) MS signals associated with the Pronase-digested samples were typically 10–20 times more intense. As a demonstration of the effectiveness of this approach, Fig. 5 shows the O-glycan profile (acquired through MALDI-TOF MS) from a 1-μg aliquot of a large glycoprotein, bile-salt-stimulated lipase (BSSL), a heavily O-glycosylated protein isolated from human breast milk. In total, 75 oligosaccharides were detected, as listed in Table 1, and the overall improved sensitivity of the enzymatic/chemical cleavage procedure allowed for the detection of 40 unique structures that were not previously identified using our earlier O-glycan release methods [130–132].

Fig. 5

O-linked glycans released by a combined enzymatic/chemical method derived from a 1-μg aliquot of bile salt-stimulated lipase (BSSL). (From Reference [129])

Table 1 O-linked oligosaccharides identified from a 1-μg aliquot of bile salt-stimulated lipase (BSSL). The structures in red were unique carbohydrates identified via the enzymatic/chemical release procedure

Pronase was also used to obtain N-linked carbohydrates from bacterial glycoproteins [14], a system for which there was no other suitable deglycosylation enzyme available. Additionally, Pronase has been suggested as an alternative enzyme to PNGase F [133] for the analysis of N-linked glycans. However, when Pronase is applied to the glycoconjugates, a complete degradation of the proteins to single amino acids is a key operation that may require incubation periods of up to 48 h. Fortunately, the digestion times have been significantly reduced by using Pronase immobilized on solid supports [134, 135], allowing this procedure to be coupled on-line for direct LC-MS analyses [134].

Carbohydrate permethylation

Following the glycan release, the carbohydrate patterns can be displayed by any of the available analytical techniques, including capillary electrophoresis (CE), liquid chromatography (LC) (both discussed later), or our currently preferred method, MS. The use of MS allows the oligosaccharides to not only be accurately quantitated, but it also enables a more definitive understanding of the exact structures whose abundances are often altered in disease states. While we have explored the possibility of using nano-LC coupled to ESI-based instruments [136], with both ion-trap and FT-ICR mass spectrometers, our currently preferred method is MALDI-TOF MS due to its speed of data acquisition, the ability to control the data collection (i.e. collect more laser shots per spot to improve the signal-to-noise ratio for improved sensitivity of low-abundance analytes), and the ability to perform high-energy CID tandem MS analyses. This is particularly valuable in characterizing some isomeric glycans [137].

To improve the overall MS performance, our glycomic platform is based on permethylating all analytes, a modification that converts the hydroxyl groups present on the carbohydrate to methoxy groups, esterifies the carboxylate of any sialic acid residues present, and introduces a methyl group to the nitrogen of the N-acetyl groups of N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc). Such a derivatization offers several advantages, including (i) an improved sensitivity of 10 to 20 times over the native glycans; (ii) converting the acidic structures (i.e. sialylated glycans) to neutral solutes, permitting a complete glycan profile to be monitored through the positive-ion mode; (iii) enhanced cross-ring fragmentation, which enables a more definitive structural characterization; and (iv) making the resulting glycans fairly hydrophobic, permitting their separation by reversed-phase LC, if needed.
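As a worked illustration of how permethylation changes the masses that are actually measured, the following Python sketch computes the monoisotopic m/z of a sodiated, permethylated glycan from its monosaccharide composition. The residue formulas are those of fully methylated residues (each methylation replaces H with CH3); the choice of the sodium adduct, the function names, and the example compositions are our assumptions for illustration and do not reproduce any specific spectrum from this review.

```python
# Monoisotopic atomic masses (Da).
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
NA_ION = 22.9892207  # Na+ (sodium atom minus one electron)

# Elemental formulas of permethylated residues within a glycan chain.
RESIDUE_FORMULA = {
    "Hex":    {"C": 9,  "H": 16, "O": 5},
    "HexNAc": {"C": 11, "H": 19, "N": 1, "O": 5},
    "dHex":   {"C": 8,  "H": 14, "O": 4},           # e.g. fucose
    "NeuAc":  {"C": 16, "H": 27, "N": 1, "O": 8},   # methyl-esterified
}
# Terminal groups: CH3 + OCH3 for a free reducing end,
# plus an extra CH4 when the glycan was reduced to its alditol.
END_FREE = {"C": 2, "H": 6, "O": 1}
END_ALDITOL = {"C": 3, "H": 10, "O": 1}


def formula_mass(formula):
    return sum(ATOM[element] * count for element, count in formula.items())


def permethylated_mz(composition, reduced=False):
    """[M + Na]+ m/z of a permethylated glycan, e.g. {'Hex': 5, 'HexNAc': 2}."""
    mass = sum(formula_mass(RESIDUE_FORMULA[name]) * count
               for name, count in composition.items())
    mass += formula_mass(END_ALDITOL if reduced else END_FREE)
    return mass + NA_ION


man5 = {"Hex": 5, "HexNAc": 2}
disialo_biantennary = {"Hex": 5, "HexNAc": 4, "NeuAc": 2}

print(f"Man5, free reducing end:  {permethylated_mz(man5):.3f}")
print(f"Man5, alditol:            {permethylated_mz(man5, reduced=True):.3f}")
print(f"Disialylated biantennary: {permethylated_mz(disialo_biantennary):.3f}")
```

Because the sialic acid carboxylate is methyl-esterified, the sialylated structure is computed here as a neutral, sodiated species in the positive-ion mode, in line with advantage (ii) above.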

The present-day permethylation procedure employs methyl iodide and dimethylsulfoxide (DMSO). This approach can be traced back to the 1960s [138], with features that were updated later [139, 140]. More recently, we have introduced procedures using sodium hydroxide beads loaded into spin-column “reactors” [141, 142] rather than using a slurry of the reagent. By performing the reaction in spin-columns, the excess sodium hydroxide can be easily removed from the reaction solution, minimizing the so-called “peeling” reactions that can occur at the high pH values typically encountered during the sample recovery in the slurry-based methods. These undesirable degradation reactions often significantly limit the sensitivity of the slurry-based approaches for trace-level oligosaccharides. We have further refined our protocol to use dimethylformamide (DMF) in the place of DMSO [136], since DMSO participates in two adverse side reactions, which negatively influence the overall sensitivity and introduce uncertainty into the quantitation. One of these reactions ultimately produces a series of satellite peaks separated by +30 Da [143], while the second reaction results in the regeneration of the native “closed-ring” configurations from the glycan alditols [140]. Our laboratory prefers the alditol structures, since they do not have a potentially reactive aldehyde group at their reducing ends. When our most up-to-date protocol (typically resulting in the detection of a single derivatized analyte) is applied to microliter volumes of blood serum, we obtain reliable profiles such as the one shown in Fig. 6. In this profile, from a woman diagnosed with late-stage recurrent ovarian cancer, about 55 unique m/z values were detected, which correlate to the known glycan structures. Of particular interest to our laboratory are the structures present in the higher mass region of the spectrum, which are highlighted in the inset for this figure. This particular spectral area is populated with the tri- and tetra-antennary structures with varying levels of fucosylation and sialylation. We have proposed these types of analytes as the key structures in several pathological conditions, including breast [136, 144], ovarian [145], lung [146], liver [147, 148], prostate [149], and esophageal [150] cancers.

Fig. 6

N-linked glycomic profile of an ovarian cancer patient. The inset highlights the high-mass region where many important trace-level oligosaccharides are located. (From Reference [145])

The analysis of permethylated oligosaccharides has also been successfully applied to those structures modified with phosphate or sulfate groups. Both of these moieties seem to be stable throughout the permethylation procedure, with the phosphate group becoming singly or doubly esterified [151, 152]. Most probably, extended reaction times ensure a complete esterification of the phosphate group. Conversely, sulfate groups attached to a carbohydrate are unaffected by the permethylation procedure and retain their negative charge [152, 153]. To detect sulfated glycans in a mass spectrometer’s positive ion mode, our group has developed a “double-permethylation” procedure [153]. In this procedure, the sulfated glycans are subjected to our spin-column approach for permethylation and recovered from the reaction mixture. The sulfate group is then chemically removed via a treatment with acidified methanol and the samples are permethylated a second time using deuterated methyl iodide to label the site of sulfation [153]. Additionally, since sialic acids are rendered neutral by the first permethylation step, sulfated glycans may subsequently be fractionated by strong anion-exchange chromatography based on their degree of sulfation only [154]. Following a desalting procedure, the sulfate group is chemically removed and the site-of-sulfation is repermethylated using deuterated methyl iodide.

Quantitation of oligosaccharides through stable isotope labeling

Based on the premise that there are many suspected or proven associations of human disease conditions with aberrant glycosylation, the rapid comparative profiling of structurally known, or at least tentatively identified, glycans could be significant as the starting point for more in-depth investigations of these diseases. Comparative glycan profiling can similarly be applied to a number of biological studies of any “normal” or “perturbed” systems, a comparison of glycosylation in different body organs, chemotaxonomies of different organisms, phylogenetic trees, etc. In all of these studies, high precision and accuracy in measuring glycan abundances for some or all profile constituents becomes essential. Here, the use of isotopic labeling for glycans and MS measurements opens new possibilities. It provides an approach in which multiple samples can be measured simultaneously and directly compared during a single data acquisition. Through the use of methyl iodide with varying deuterium substitutions, up to four samples can be simultaneously monitored. Our research group has incorporated isotopic labeling into the permethylation platform [155]. Importantly, the linearity of this method was acceptable over nearly two orders of magnitude, so that it could be applied to various differential glycomic studies. An example of this technique is shown as Fig. 7, with a MALDI-MS profile comparing the different expression levels of N-linked glycans in the early developmental stages of Drosophila melanogaster (unpublished results). This study indicated that several high-mannose type structures are much more abundant in the larval stage (shown as the red m/z values) than in the embryonic stage (indicated by the green m/z values). In a different research group, 13C-labeled methyl iodide has been used to incorporate stable isotopes into glycan structures through permethylation [156], and isobaric labeling has been achieved through the use of 13CH3I and 12CH2DI [157]. However, the latter approach introduces a mass difference of only 0.002922 Da for each site of derivatization. This small mass difference is difficult to detect by modern MALDI-based instruments, but it can be easily measured with a high-resolution mass spectrometer (e.g. an FT-ICR instrument or an orbitrap).
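The per-site mass difference quoted above follows directly from the isotope masses, and the same arithmetic gives a rough estimate of the resolving power needed to separate the two labeled forms. The Python sketch below reproduces this calculation; the example m/z of 2000 and the 30 methylation sites are hypothetical round numbers chosen only to illustrate the scale, and the simple m/Δm criterion is an approximation of what a real instrument needs.

```python
# Monoisotopic masses (Da).
M_12C = 12.0
M_13C = 13.0033548378
M_1H = 1.00782503207
M_2H = 2.0141017778

# Methyl groups delivered by the two isobaric reagents.
methyl_13CH3 = M_13C + 3 * M_1H          # from 13CH3I
methyl_12CH2D = M_12C + 2 * M_1H + M_2H  # from 12CH2DI

delta_per_site = methyl_12CH2D - methyl_13CH3


def required_resolving_power(mz, n_sites):
    """Approximate m / delta-m needed to separate the two labeled forms
    of a singly charged glycan carrying n_sites methyl groups."""
    return mz / (n_sites * delta_per_site)


print(f"mass difference per methylation site: {delta_per_site:.6f} Da")
print(f"total difference for 30 sites:        {30 * delta_per_site:.4f} Da")
print(f"resolving power near m/z 2000:        "
      f"{required_resolving_power(2000.0, 30):,.0f}")
```

The total separation grows with the number of methylation sites, so larger glycans are somewhat easier to resolve than small ones at a comparable m/z.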

Fig. 7

N-linked glycans, isotopically-labeled through permethylation, comparing different developmental stages of Drosophila melanogaster. The m/z values in red are associated with the larval stage, while those in green indicate an embryonic state

Isotopically-coded tags may further be introduced into the carbohydrate structure through other methods and locations on the glycan structure. The free reducing end of an oligosaccharide provides a convenient site for modification and several isotopically-coded chromophores can be incorporated at this location, including aniline [158–160], 2-aminopyridine [161], 2-aminobenzoic acid [162], and 1-phenyl-3-methyl-5-pyrazolone [163]. Additional tags have been synthesized, including (13C6 and 12C12) 4-phenethyl-benzohydrazide [164], a hydrophobic tag that may enhance the sensitivity of ESI-based measurements through a more efficient desolvation process [165, 166], and a novel set of tetraplexed tags [167, 168], each separated by 4 Da and analyzed by a direct infusion into an ESI-based q-TOF MS instrument. Alternatively, in a closely-related analogue to the stable-isotope labeling by the amino acids in a cell culture (the so-called SILAC method, which is widely employed in the proteomics field), isotopically-labeled glutamine, which is further used as the sole source of nitrogen for the synthesis of N-acetylglucosamine, N-acetylgalactosamine and the sialic acids, has been reported [169] and utilized in conjunction with cultured mouse embryonic stem cells.

Analysis of isomeric structures

To what extent is the precise knowledge of a glycan structure necessary for understanding a biological phenomenon? The answer is that we do not know, but if one considers the molecular nature of sugar-sugar interactions or different conformational situations in a glycocalyx during protein binding processes, the geometrical considerations appear important at once. An excellent example of the need for glycan diversity is the regulation of our innate and adaptive immune responsiveness [170, 171] to pathogens, allergens, and other foreign substances.

In-depth glycomic studies can amount to very substantial scientific activities in the exact structural elucidation of the individual glycans in a profile. However, some structural ambiguities frequently arise. The exact positions of terminating sialic acids of incompletely sialylated structures are seldom known, as is the location of the biologically important fucosyl substitution. The differences in linkages between various monosaccharide residues are currently known for only some of the most abundant natural glycoproteins that have been extensively studied through the entire set of structural tools. The use of the multiple levels of tandem MS has been helpful in assigning the more definitive structures for oligosaccharides [172–174], but the sample quantities used in such experiments can be substantial. Additionally, the more structurally informative techniques, such as NMR spectroscopy, are currently somewhat insensitive for a number of biological investigations.

The propensity of glycans to form many different isomers represents one of the major challenges in glycomic measurements. MS alone cannot distinguish glycan isomeric forms, which inherently have the same mass, although high-energy collision-induced dissociation (CID) in some tandem MS techniques seems to promote the cross-ring fragmentation cleavages (across the sugar pyranose rings) that are needed to form more informative ionic fragments [175, 176]. Additional approaches in distinguishing sialyl linkage isomers may utilize the different reactivity of α2-3-linked and α2-6-linked residues, as demonstrated in our recent study [177]. While a previous study demonstrated the ability to discern the different linkage isomers of sialic acids through an esterification reaction using methanol [178], our permethylation-based platform required a major modification. Since this derivatization also results in an esterification of the carboxylate group of sialic acids, we modified an amidation reaction [179] to meet the structural elucidation needs. In this modification, the α2-6-linked sialic acids become amidated, while the α2-3-attached residues are lactonized, becoming insensitive towards derivatization. During a subsequent permethylation, the nitrogen of the amide groups accepts two methyl groups, while the lactone is cleaved and the resulting carboxylate becomes esterified, introducing a mass difference of 13 Da.
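The 13 Da difference can be checked from the elemental change alone: after permethylation, an α2-6-linked residue carries a dimethylamide group, -C(=O)N(CH3)2, where an α2-3-linked residue carries a methyl ester, -C(=O)OCH3. The Python sketch below computes this difference and the resulting mass offsets for a trisialylated glycan as a function of the number of α2-6 linkages. It is an arithmetic illustration only; the function name is our own, and the offsets are expressed relative to the all-α2-3 isomer rather than as absolute m/z values.

```python
# Monoisotopic atomic masses (Da).
C, H, N, O = 12.0, 1.00782503, 14.0030740, 15.9949146

# Groups attached to the sialic acid carbonyl after permethylation.
dimethylamide = N + 2 * (C + 3 * H)   # -N(CH3)2 on alpha2-6-linked residues
methyl_ester = O + (C + 3 * H)        # -OCH3 on alpha2-3-linked residues

delta_linkage = dimethylamide - methyl_ester


def isomer_offsets(n_sialic_acids):
    """Mass offset (Da) relative to the all-alpha2-3 isomer, keyed by the
    number of alpha2-6-linked sialic acids."""
    return {k: k * delta_linkage for k in range(n_sialic_acids + 1)}


offsets = isomer_offsets(3)
print(f"mass difference per linkage: {delta_linkage:.4f} Da")
for n_26, offset in offsets.items():
    print(f"{n_26} x alpha2-6, {3 - n_26} x alpha2-3: +{offset:.4f} Da")
```

For a trisialylated glycan, the four possible linkage combinations therefore appear as a ladder of signals spaced by roughly 13 Da, which is what allows their ratios to be read from a single MS profile.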

We have applied this linkage-specific amidation technique to several different cancer investigations and it appears that for the same glycans in different pathological conditions, a certain level of specificity may be associated with the different ratios of the linkage isomers, as shown in Fig. 8. This figure compares the ratios of the different linkage isomers of a fucosylated, triantennary trisialylated glycan from our recent study of lung cancer and the effects of smoking on glycomic profiles [146] (Fig. 8a) and from a prostate cancer study (unpublished results) (Fig. 8b). This figure demonstrates that the levels of the isomers containing three and two α2-3-linked sialic acids are decreased in their abundances in control individuals who are former smokers and the lung cancer patients who had formerly smoked [146]. However, these same isomers appeared to be differently altered in their levels in patients diagnosed with prostate cancer (unpublished results). The isomer with two α2-6-linked sialic acids also showed opposing trends. In the lung cancer study [146], this isomer was elevated, while in the prostate cancer patients, the abundance of this isomer was suppressed, along with the isomer containing three α2-6-linked sialic acids (unpublished results). Unfortunately, one of the isomers was observed only at very low levels, oftentimes not detectable in the lung cancer study, making a definitive conclusion about its ratios difficult at the present time [146].

Fig. 8

Notched-box plots comparing the different ratios of sialic acid linkages of a fucosylated triantennary trisialylated glycan in (a) lung cancer patients (from Reference [146]) and (b) prostate cancer patients

Another type of structural diversity may be biosynthetically introduced into a carbohydrate structure through the inclusion of a fucose residue, which is a monosaccharide residing on either the core GlcNAc connecting an N-linked carbohydrate to its protein, or on one of the “branching” GlcNAc units. Altered abundances in one of the possible locations are often associated with diseases and inflammation [8], and due to their isomeric nature, they cannot be readily discerned by MS alone. However, through a controlled enzymatic cleavage of the glycan structure with a nonspecific sialidase and a β-galactosidase, a mass difference can be introduced, as the outer-arm fucose residues inhibit the activity of β-galactosidases, when used as a specific reagent, leaving the galactose associated with its GlcNAc. Consequently, this presents a convenient method to monitor the ratios of core-to-outer-arm fucosylation. We have applied this approach in several recent cancer studies and, as a preliminary observation, it seems that breast (unpublished results), ovarian [145], and lung [146] cancers show increased levels of outer-arm fucosylation, while liver cancer favors elevated levels of core-fucosylated glycans (unpublished results). This type of enzymatic digestion has also been performed by another group in combination with an LC-based methodology, in which the two formerly isomeric structures show different retention times after the exoglycosidase digestion [180–183]. Their conclusions largely match our MS results.

Some promise in the sugar isomer resolution is also held by ion-mobility spectrometry (IMS), particularly in combination with time-of-flight (TOF) MS, in which the latter could structurally confirm the success of IMS separation [184–186]. Interestingly, the observations made a number of years ago that the CE of fluorescently-tagged oligosaccharides can be quite effective in resolving at least some glycan isomers [187, 188] remain topical to this date. The glycosidase-assisted micro-digestions of biological samples, in conjunction with CE, can be of some help in structural elucidation through the peak shift observations. However, this could hardly become a widely used strategy for extensive characterizations, since the digestions often become unreliable for trace-level samples and some exoglycosidases are quite expensive.

Separations in glycomics

Due to the usually high complexity of the released glycan mixtures and, frequently, the low abundances of some biologically important glycoconjugates, a high sensitivity of analytical measurements becomes mandatory. In addition to the inherently sensitive MS techniques as detailed above, carbohydrate profiles may also be recorded by LC, CE, or the “chip-based” electrophoretic methods.

Liquid chromatography

Due to the hydrophilic nature of glycans, separation systems using a normal-phase mode of chromatography or hydrophilic-interaction chromatography (HILIC), a term first used by Alpert in 1990 [189], are being increasingly employed. In recent years, several technological innovations have presented new opportunities for improved separations of carbohydrates. Among the most interesting developments is the use of ultra-high pressure LC (UPLC) systems, which provide enhanced separation efficiencies and reduced analysis times through a substantial decrease in particle size. In the area of analytical glycobiology, this has been demonstrated in a head-to-head comparison of 3.0-μm and 1.7-μm sorbent particles [190] for 2-aminobenzamide-labeled glycans originating from bovine fetuin (Fig. 9). In this comparison, the UPLC analyses were faster and, owing to the higher column efficiencies, resolved several isomers that were not observed with the column packed with the larger particles.

Fig. 9

Fetuin-derived glycans separated by (a) an LC column packed with 3-μm particles and (b) a UPLC column using 1.7-μm sorbent. (From Reference [190])
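The efficiency gain expected from the smaller particles can be estimated with textbook scaling relations. The sketch below is a rough illustration under stated assumptions, not an analysis taken from Reference [190]: it assumes that both columns operate at the same reduced plate height (a hypothetical value of 2 is used), that resolution scales with the square root of the plate count, and that back-pressure at equal linear velocity scales with the inverse square of the particle diameter. The 150-mm column length is likewise a hypothetical example value.

```python
import math


def plate_count(length_mm, particle_um, reduced_plate_height=2.0):
    """Theoretical plates N = L / (h * dp), with h the reduced plate height.

    reduced_plate_height=2.0 is an assumed value for a well-packed column.
    """
    return (length_mm * 1000.0) / (reduced_plate_height * particle_um)


def relative_gain(dp_large_um, dp_small_um):
    """Scaling factors when moving from larger to smaller particles
    at equal column length and equal reduced plate height."""
    ratio = dp_large_um / dp_small_um
    return {
        "plates": ratio,                      # N scales as 1/dp
        "resolution": math.sqrt(ratio),       # Rs scales as sqrt(N)
        "pressure_same_velocity": ratio ** 2  # pressure drop scales as 1/dp^2
    }


# Particle sizes from the comparison in the text; column length is hypothetical.
gain = relative_gain(3.0, 1.7)
n_uplc = plate_count(150.0, 1.7)
n_lc = plate_count(150.0, 3.0)
```

Under these assumptions, moving from 3.0-μm to 1.7-μm particles would give roughly 1.8 times the plate count and about 1.3 times the resolution at equal column length, at the cost of about three times the back-pressure at equal linear velocity.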

Another interesting development is the use of zwitterionic HILIC (ZIC-HILIC) phases for glycoconjugate analysis [191]. Repeated runs of human IgG-derived glycans demonstrated good reproducibility along with high resolution between several isomeric structures. Additionally, glycopeptides differing only by an isomeric glycan structure were reported to be resolved with this stationary phase.

An alternative medium to the normal-phase packings is graphitized carbon, a stationary phase that offers several desirable features, including good chemical inertness, stability across a very wide pH range, and the ability to maintain its integrity at elevated temperatures. Of particular importance is the high selectivity of graphitized carbon, which allows for resolution of the positional isomers of permethylated glycans [192] and of various linkage [193] and positional isomers of native glycans [194]. Although the mechanism(s) of retention with this material are complex, the separations are likely to be based on adsorption through hydrophobic, hydrophilic, and polar interactions [195]. Not surprisingly, several parameters have been shown to influence the retention of glycans. One is the ionic strength, to which sialylated structures appeared to be particularly sensitive: increasing the salt concentrations in the mobile phases resulted in increased retention times and broadened peak widths [195]. The retention of these oligosaccharides also appears to depend on the mobile-phase pH, with longer retention times being associated with lower pH values [195]. Interestingly, neutral oligosaccharides seem largely unaffected by changes in the mobile-phase composition [195]. In contrast to retention mechanisms based on partitioning, the retention of carbohydrates on graphitized carbon packings increases at elevated temperatures [195].

A fairly recent and convenient innovation available to glycobiology researchers is the “LC chip” [196, 197]. In this miniaturized analytical format, a trapping column, a switching valve, and an LC column are all integrated into a single biocompatible unit, which reduces the number of manual connections and allows for easy measurements as a specialized “MS inlet.” We have demonstrated the applicability of this approach for glycomic analyses through the separation of permethylated glycans using a C18 chip interfaced directly to an ESI MS [136]. This combination provided results complementary to a previous MALDI-based glycomic study of blood serum samples from breast cancer patients [144]. Among the advantages of using ESI were the high reproducibility and repeatability of the analytical signal. Other groups have recently applied this chip-based approach, with graphite chip packings, to the analysis of the neutral and acidic oligosaccharides isolated from human breast milk [198–200]. Using the graphite chips, the Lebrilla group also reported that nearly 200 structures were identified in human serum using exact masses [201], which included some isomeric analyses [202].

While the previous chip studies were conducted on glycans that had been released by a typical “in-solution” procedure, the more recent chip designs now include an integrated “PNGase F reactor”, which allows the glycan cleavage to be performed on-line [203]. Using such a chip, approximately 98% of an antibody was deglycosylated within an incubation period of only 6 s [203], indicating a significant improvement for routine glycomic analyses. Following the cleavage, the glycans were trapped on a graphite preconcentrator and subsequently separated using a column packed with the same adsorbent.

Capillary electrophoresis

In the search for glycomic profiling techniques of high sensitivity, the development of capillary electrophoresis/laser-induced fluorescence (CE-LIF) for derivatized glycans in the early 1990s [22, 23, 204] preceded most efforts to use biomolecular MS in the area. In the following years, the use of MS-based methodologies increased dramatically in number and in specific applications, owing to the possibility of direct, positive identification of glycans from their MS “molecular fingerprints”. It is nevertheless remarkable that the applications of CE-LIF continue to develop at a very respectable pace. While CE-LIF is capable of reproducibly recording very complex glycan profiles from hydrolyzed glycoproteins and glycoprotein mixtures at very high sensitivity, this approach generally lacks the solute identification capabilities that MS and tandem MS/MS so easily offer. This general drawback notwithstanding, new applications, developmental trends, and hardware developments are primarily driven by the relative instrumental simplicity of CE-LIF, by the miniaturized, automatable designs needed in the biotechnology industry and other routine measurements, and not least by the capability of CE and related techniques to resolve isomeric glycans, which MS generally lacks.

Sample preparation and the correct choice of the fluorescence tag are still the subject of various communications in the current literature. While different fluorescence-labeling reagents were explored in the early work on CE-LIF of carbohydrates, the introduction of 8-aminopyrene-1,3,6-trisulfonic acid (APTS) as the labeling reagent by Guttman and co-workers [25, 205] has led to its nearly uniform acceptance and commercial use. The basis of this reagent’s utility is the reaction between its aromatic amine group and the reducing end of carbohydrates, followed by a reductive stabilization of the analytes. As recent examples of other fluorescence-labeling approaches, 4-fluoro-7-nitro-2,1,3-benzoxadiazole [206] and rhodamine 110, a dye with a large fluorescence quantum yield [207], were also investigated. Each derivatization approach, and the correspondingly modified glycans, necessitates its own procedural modifications and distinct compositions of buffers and separation media.

Since numerous new biopharmaceuticals, including monoclonal antibodies and vaccines, feature glycosylated structures, CE-LIF is rapidly becoming the method of choice in the biotechnology industry for providing quantitative glycosylation profiles to support product efficacy and minimize immunogenicity effects [208, 209]. In contrast to profiling glycans in the biomarker discovery area, most biotechnologically oriented applications place less stringent demands on sensitivity and on the identification of unknown components. However, the demands on sample throughput can be substantial, and the same is likely to hold in clinical analyses, as evidenced by the recent CE-based equipment for evaluating glycan profiles of patients with liver diseases [210, 211]. As seen in Fig. 10, an optimized series of sample treatments (starting from 3 μL of a patient’s serum), leading to APTS labeling and ultimately to CE-fluorescence glycan profiling, can now be performed for the benefit of clinical diagnosis using a DNA sequencer.

Fig. 10

Optimized workflow for the preparation of glycan-containing samples prior to a capillary electrophoretic analysis. (From Reference [211])

While capillary-based glycan separations of biomedical interest have been amply demonstrated in the literature, there is a general trend toward the use of chip-based separation systems [212–216], which offer substantial gains in terms of faster and more reproducible separations and measurements. As an early example of this trend [213], we show the separation of glycans extracted from a blood serum sample of a breast cancer patient (Fig. 11). In contrast to the separation in a fused silica capillary, which took over 30 min, the demonstrated run using a spiral channel design took only 2.8 min of analysis time at comparable or better separation efficiency. More recently, even more efficient separations were achieved with similar biological samples using a chip with a more advanced serpentine design [217].

Fig. 11

Chip-based electropherogram of N-linked glycans from the serum of a patient with late-stage breast cancer. (From Reference [213])

A persistent problem of the CE analysis of glycans is its limited compatibility with MS. To perform at its best separation efficiency, capillary zone electrophoresis (CZE) tolerates only minute quantities of the analyzed materials at the inlet of the separation capillary, and the same is true for its microchip versions. This, in turn, means that most on-column preconcentration remedies (e.g., stacking or solute trapping) are not particularly helpful in enhancing the signals in different CE-MS combinations. Considerably more “MS-compatible” is capillary electrochromatography (CEC) with polymeric monolithic columns, which remains one of the few viable options for analyzing complex glycan mixtures [130, 131]. We have investigated the merits of CEC-MS coupling using the ESI inlet to an ion trap [131, 218] and ICR MS [130] and, through a sample microdeposition device, to a MALDI-TOF instrument [132]. More recently, an elaborate instrumental arrangement involving CE-LIF-MS coupling was reported for a successful analysis of recombinant monoclonal antibody glycans as APTS-labeled analytes [219].

A key advantage of CE in glycan analysis has been its capability to resolve isomers. This is exemplified by the glycan profile of a monoclonal antibody compared to several glycan standards (Fig. 12) [220]. Analyses of recombinant glycoproteins and similar cases of more “predictable” biological glycan sources have been significantly aided by the favorable analytical attributes of CE. However, the lack of glycan standards has impaired the use of CE in the biomarker discovery area perhaps more seriously than that of any other analytical method.

Fig. 12

Capillary electropherogram depicting the resolution of some isomeric glycans derived from a murine monoclonal antibody. (From Reference [220])

Sequential digestion with exoglycosidase enzymes, followed by CE analyses [221, 222], can be useful in structural assignments and glycan mapping with relatively simple glycoproteins, but is less effective with biologically complex systems. Additionally, exoglycosidases are relatively expensive reagents. A recent innovative approach that decreases the consumption of exoglycosidase enzymes while enhancing analytical performance appears to be phospholipid-assisted CE [223, 224], in which phospholipid additives are used in a segmentation process to incorporate these enzymes. An example is outlined in Fig. 13: multiple enzymes have been used sequentially, for selected incubation periods inside the CE capillary, to digest the APTS-labeled glycans derived from a recombinant glycoprotein cancer drug [223].

Fig. 13

Capillary electropherograms demonstrating an on-line exoglycosidase digestion procedure for carbohydrate structural analysis. (From Reference [224])

Further advances in CE-MS of glycoconjugates are clearly desirable for the continued development of analytical glycobiology. Ironically, the best CE separations to date have been achieved with buffer media and polymeric additives that are largely incompatible with typical MS conditions. It is currently hard to predict whether the desired improvements will come from the use of different derivatization schemes and separation conditions, from breakthrough developments in CE-MS interfacing technologies or MS designs, or from combined incremental improvements in all these areas. Various advances in CE-MS of glycoconjugates have been the subject of recent reviews [216, 225, 226].

Conclusions

The field of analytical glycoscience has evolved substantially during the last decade. The major incentive for this methodological progress is the close connection of glycoconjugates with some of the most important fields of human activity and scientific endeavor: the search for disease biomarkers, recombinant glycoprotein pharmaceuticals, developmental biology, microbiology, immunology, plant biology, and biofuels, among others. This brief review has summarized only a small cross-section of the vast field that has been affected during the last 20 years by new analytical techniques and instrumentation.

Glycoproteins are methodologically unique among the different classes of glycoconjugates, so that the interlinked fields of glycomics and glycoproteomics necessitate a different emphasis on how biological samples are fractionated, enzymatically or chemically treated, and analyzed. The emphasis on very high measurement sensitivity, which is clearly mandated by the very large dynamic concentration range in which different glycoproteins occur in biological samples, favors the most technologically advanced instrumental methods, such as MS, miniaturized LC, and CE-LIF. The complementary nature of these analytical approaches in glycomic and glycoproteomic measurements will be essential to successful future investigations, as will further technological improvements and the effective coupling (or “hyphenation”) of these analytical techniques. Substantial advances in computer-aided evaluation of the highly complex analytical data present unprecedented opportunities for future explorations of the still poorly understood glycome and glycoproteome.