5.1 Introduction

Glycosylation is abundant and essential in all life forms, with over 50% of mammalian proteins being glycosylated (Apweiler et al. 1999). Compared with other major biomolecules such as proteins and DNA, the biological roles of carbohydrates remain poorly understood. Carbohydrates are complex and remarkably diverse, which makes them arduous to synthesize, characterize, and analyze. Over the past decade, however, new technologies, experimental techniques, and instrumentation have given momentum to the analysis of glycans and glycoproteins, formally known as ‘glycomics’ and ‘glycoproteomics’, respectively.

Proteoglycans (PGs) and glycosaminoglycans (GAGs) are ubiquitous components of the extracellular matrix (ECM) and play essential roles in all areas of physiology, including cell signaling, cell adhesion, and cell functions (Afratis et al. 2012). PGs are composed of core proteins to which GAG chains are attached. GAGs are linear polysaccharides consisting of repeating disaccharide units of a hexosamine (N-acetylglucosamine or N-acetylgalactosamine) and a hexuronic acid (glucuronic acid or iduronic acid) that are covalently linked. GAGs can be divided into classes based on the repeating disaccharide unit, i.e., heparan sulfate (HS), chondroitin sulfate (CS), dermatan sulfate (DS), keratan sulfate (KS) and hyaluronic acid (or hyaluronan) (HA) (Table 5.1) (Sethi and Zaia 2017). They are heterogeneous with respect to chain length and subsequent modifications, including sulfation, acetylation, and uronic acid epimerization of disaccharide units. GAG structure is spatially and temporally regulated and plays specific and distinct functional roles during development and disease onset (Iozzo and Schaefer 2015). Unlike other GAGs, HA does not contain sulfate and is not bound to a core protein; rather, it exists as a molecular backbone for extracellular matrix complexes consisting of glycoproteins, proteoglycans, collagens and other interacting molecules.

Table 5.1 Type of glycosaminoglycan (GAG), and its repeating disaccharide unit

The ECM is a complex molecular network that surrounds all cells, occupying a volume fraction of approximately 20% in the adult brain (Sykova and Nicholson 2008). Its main components include hyaluronan, proteoglycans, glycoproteins, and a variety of posttranslational remodeling proteases, such as matrix metalloproteinases (MMPs), which cleave ECM molecules, allowing for highly dynamic functional adaptations (Muir et al. 2002; Rivera et al. 2010). Organized forms of ECM, namely perineuronal nets (PNNs), composed of hyaluronan, proteoglycans, glycoproteins, and collagen, surround the synapse and interact with cell surface receptors. In pathologies, including cancers, cardiovascular diseases, fibrosis, and neurodevelopmental and neuropsychiatric diseases, ECM structure and function become dysregulated. Thus, characterizing the ECM structure is central to the understanding of physiology and pathophysiology in many diseases (Raghunathan et al. 2019a). The matrisome is defined as the supramolecular complexes, consisting of proteoglycans, glycoproteins, collagens, and hyaluronan, that form the functional units of the ECM (Martin et al. 1984), including the associated molecules. According to the matrisome project, the core matrisome consists of 195 glycoproteins, 44 collagens, and 35 proteoglycans (Shao et al. 2019).

Large-scale proteomics studies have quantified and cataloged expression patterns of various ECM and associated proteins (Byron et al. 2013; Chang et al. 2016; Goddard et al. 2016; Hill et al. 2015; Lindsey et al. 2016; Naba et al. 2012, 2015, 2017), but have not defined the glycosylation patterns of these proteins. It is essential to profile the glycosylation of matrisome molecules to understand their structural, functional, and biological roles in the critical molecular mechanisms that must be understood to explain biomolecular deregulation related to a disease or condition.

Our group has developed methods for performing GAG glycomics, proteomics, and glycoproteomics in both separate and integrated forms, employing multiple experimental techniques such as in solution and on slide tissue digestion followed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). In this chapter, we provide a historical overview of these methods for glycomics, glycoproteomics, and proteomics of key ECM constituents, i.e., GAGs and PGs, as reported over the past decade (Bielik and Zaia 2010, 2011; Bowman and Zaia 2010; Gill et al. 2013; Hitchcock et al. 2008a, b; Huang et al. 2011; Khatri et al. 2014, 2016; Klein et al. 2018; Leymarie et al. 2012; Raghunathan et al. 2019b; Shao et al. 2013a, b; Shi et al. 2012; Staples et al. 2009, 2010; Turiak et al. 2014). We describe the experimental approaches and their optimization to achieve higher coverage and better quality data.

5.2 Glycosaminoglycan Analysis/GAG-omics

5.2.1 Overview of GAGs

The GAG classes include unsulfated hyaluronan (HA) and the sulfated heparin/heparan sulfate (HS), chondroitin/dermatan sulfate (CS/DS), and keratan sulfate (KS) (Fig. 5.1a–d). HS and CS are unbranched polymers composed of ~20–200 repeating disaccharide units, each consisting of N-acetylgalactosamine (GalNAc) or N-acetylglucosamine (GlcNAc) and a uronic acid, e.g., glucuronate (GlcA) or iduronate (IdoA), and are attached to a serine or threonine residue of the core protein through a characteristic tetrasaccharide linker (Kjellen and Lindahl 1991). In contrast, HA is not covalently attached to a core protein and is not sulfated, but consists of repeating disaccharide units of GlcA and GlcNAc attached via alternating β1,3- and β1,4-glycosidic linkages (Sethi and Zaia 2017). KS is composed of repeating disaccharide units of Gal and GlcNAc joined via alternating β1,4- and β1,3-glycosidic linkages. The KS GAG chain may be attached to the core protein in three ways: KSI, in which the GAG is a sulfated lactosamine chain attached to an N-glycan; KSII, in which the sulfated lactosamine chains are attached to O-linked glycans on serine/threonine residues; and KSIII, in which the GAG chains are attached to the core protein through a mannose-Ser linkage (Funderburgh 2000). Identified over 100 years ago, GAGs are found in mast cell granules, cell surfaces, basement membrane, and extracellular matrix (ECM) (Zaia 2008). The sulfated GAGs consist of repeating disaccharide units that become modified biosynthetically via a series of enzymatic events, including deacetylation, sulfation, and epimerization. Figure 5.1e shows CS and HS biosynthesis. These spatial and temporal variations in GAG structure give rise to context-specific interactions with protein partners, growth factors, receptors, and ligands responsible for critical biological processes such as cell signaling, adhesion, and interaction in normal and pathological conditions. These modifications are also responsible for the heterogeneous, anionic, and complex nature of GAGs, which makes them analytically challenging to study (Zaia 2005).
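The class distinctions described above can be held in a small lookup structure, which is convenient when annotating glycomics results. The following is a minimal sketch rather than a reproduction of Table 5.1: the `GagClass` type, its field names, and the helper function are hypothetical, and the residue assignments (GlcNAc for HS, GalNAc for CS/DS, mainly IdoA for DS) are taken from general GAG nomenclature rather than from the table itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GagClass:
    """One GAG class and the properties discussed in this section (illustrative)."""
    abbreviation: str
    name: str
    disaccharide: Tuple[str, str]  # the two residues of the repeating unit
    sulfated: bool
    core_protein_bound: bool


# Assumed assignments based on general GAG nomenclature, not copied from Table 5.1.
GAG_CLASSES = (
    GagClass("HA", "hyaluronan", ("GlcA", "GlcNAc"), sulfated=False, core_protein_bound=False),
    GagClass("HS", "heparan sulfate", ("GlcA/IdoA", "GlcNAc"), sulfated=True, core_protein_bound=True),
    GagClass("CS", "chondroitin sulfate", ("GlcA", "GalNAc"), sulfated=True, core_protein_bound=True),
    GagClass("DS", "dermatan sulfate", ("IdoA/GlcA", "GalNAc"), sulfated=True, core_protein_bound=True),
    GagClass("KS", "keratan sulfate", ("Gal", "GlcNAc"), sulfated=True, core_protein_bound=True),
)


def free_unsulfated_classes(classes=GAG_CLASSES) -> List[str]:
    """Return abbreviations of classes that are neither sulfated nor protein-bound."""
    return [g.abbreviation for g in classes
            if not g.sulfated and not g.core_protein_bound]


def classes_containing(residue: str, classes=GAG_CLASSES) -> List[str]:
    """Return abbreviations of classes whose repeating unit includes the residue."""
    return [g.abbreviation for g in classes
            if any(residue in part.split("/") for part in g.disaccharide)]
```

Calling `free_unsulfated_classes()` singles out HA, which matches the statement that HA is the only class that is neither sulfated nor attached to a core protein.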

Fig. 5.1

Structure of key glycosaminoglycans (GAGs). (a) Heparan sulfate (HS) disaccharide unit, HS chains and linker tetrasaccharide structure. (b) Chondroitin sulfate (CS) disaccharide unit, CS chains and linker tetrasaccharide structure. (c) Hyaluronic acid (HA) disaccharide unit. (d) Keratan sulfate (KS) disaccharide unit and chains. (e) CS and HS biosynthesis (adapted from Lindahl et al. 2015). GlcA glucuronic acid, GalNAc N-acetylgalactosamine, GlcNAc N-acetylglucosamine, Ac acetyl, SO3H sulfate

5.2.2 Analytical Challenges of GAG Analysis

Routinely, a GAG oligosaccharide is subjected to a series of chemical and enzymatic degradation steps and analyzed using chromatographic, electrophoretic, or mass spectrometric methods (Conrad 1997; Turnbull et al. 1999; Venkataraman et al. 1999; Zaia 2009). Mass spectrometry serves as an essential tool for structural analysis of GAGs with high sensitivity and versatility. Over the years, several MS-based methods for GAG-glycomics or GAG-omics have been reported, including matrix-assisted laser desorption ionization (MALDI)-MS, size exclusion chromatography (SEC)-MS, ion-pair reversed-phase chromatography (IPRP)-MS, and HILIC-MS (Henriksen et al. 2004; Hitchcock et al. 2008b; Laremore and Linhardt 2007; Liu et al. 2019; Shao et al. 2013b; Venkataraman et al. 1999; Wang et al. 2012). Success in MS analysis of GAGs depends largely on the extraction and workup methods used. In particular, it is important to remove salts, contaminants, nucleic acids, or lipids that could interfere with further analysis (Zaia 2009). Other analytical challenges include poor recovery of GAGs from the liquid chromatography (LC) system, as charged glycans may stick to titanium-containing metallic loops, filters, or transfer lines. In addition, it is necessary to use mass spectrometer fragile ion tuning parameters to minimize the extent to which sulfated ions dissociate during desolvation and ion transfer prior to mass analysis (Staples and Zaia 2011). We and others have developed effective analytical techniques to overcome these challenges (Bodet et al. 2017; Henriksen et al. 2004; Hitchcock et al. 2008a; Laremore and Linhardt 2007; Liu et al. 2019; Shao et al. 2013a; Solakyildirim 2019; Staples et al. 2009, 2010; Wang et al. 2012).

5.2.3 GAG LC-MS/MS Analysis Using SEC and Amide-HILIC

In 2006, we demonstrated a successful LC-MS/MS platform with a compatible extraction method for quantifying CS GAGs using a size exclusion column (SEC) with on-line MS detection (Hitchcock et al. 2006). Despite its robustness and reliability, SEC is a low-resolution technique. Thus, in 2008 we implemented LC-MS/MS platforms utilizing amide-hydrophilic interaction chromatography (HILIC) instead of SEC for analyzing CS GAGs from connective tissues. We were able to profile the GAG chain non-reducing end, the linker region, and Δ-unsaturated interior oligosaccharide domains of the CS chains. The GAGs were released from the core protein by β-elimination, followed by C-18 cleanup to remove hydrophobic molecules and, finally, anion exchange spin columns to remove cationic molecules. The eluted anionic GAG mixture was then partially depolymerized with chondroitinase enzymes, differentially stable isotope-labeled by reductive amination using 2-anthranilic acid (d0 and d4), and subjected to amide-HILIC on-line LC-MS/MS analysis (Hitchcock et al. 2008b). One limiting factor of amide-HILIC LC-MS/MS was the stability of the spray interface: conventional silica sprayers clogged in negative mode and thus required time-consuming and extensive optimization. This problem was solved by using non-silica sprayers such as those provided by the Agilent Chip Cube and the Advion NanoMate robot.
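Differential d0/d4 labeling produces pairs of peaks for the same oligosaccharide that are separated by a fixed mass, which is how the two samples are told apart and compared in a single run. The sketch below illustrates the arithmetic only. It assumes one tag per oligosaccharide (reductive amination at the reducing end) and four deuterium atoms in the heavy tag; the function names are hypothetical and the intensity ratio is a simplified stand-in for a full quantification workflow.

```python
# Monoisotopic masses in Da (standard reference values).
H_MASS = 1.00782503   # hydrogen-1
D_MASS = 2.01410178   # deuterium


def label_mass_shift(n_deuterium: int = 4) -> float:
    """Mass difference (Da) between heavy- and light-tagged forms of one glycan."""
    return n_deuterium * (D_MASS - H_MASS)


def pair_spacing_mz(charge: int, n_deuterium: int = 4) -> float:
    """Expected m/z spacing between the light and heavy peaks at a given charge."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return label_mass_shift(n_deuterium) / abs(charge)


def light_to_heavy_ratio(light_intensity: float, heavy_intensity: float) -> float:
    """Relative abundance of the light-labeled sample versus the heavy-labeled one."""
    if heavy_intensity <= 0:
        raise ValueError("heavy_intensity must be positive")
    return light_intensity / heavy_intensity


if __name__ == "__main__":
    for z in (-1, -2, -3):
        print(f"charge {z}: pair spacing {pair_spacing_mz(z):.4f} m/z")
```

Because the spacing shrinks with increasing charge, a labeled pair from a multiply charged oligosaccharide needs correspondingly higher mass resolution to be separated.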

5.2.4 GAG CE-LIF Analysis

We then turned to capillary electrophoresis (CE) coupled with laser-induced fluorescence (LIF) for GAG disaccharide compositional analysis, an essential step towards understanding the GAG structure-function relationship. We reported a CE-LIF method to analyze GAG disaccharides in biological samples. This method made several improvements over existing methods, including optimization of reductive amination conditions, an increase in sensitivity by using cellulose cleanup for the derivatization workup, and optimization of the separation for reproducibility and robustness (Hitchcock et al. 2008a). CE offers several benefits over other analytical methods, including high resolving power and separation efficiency for disaccharide structural isomers differing in sulfation position, economy (less buffer and sample are consumed), and faster, automated, and reproducible analysis. It has not been widely used, however, mainly because of disaccharide recovery issues after derivatization workup. Our method eliminated background noise and improved quantification of biological samples by 100-fold, thus enabling disaccharide quantification of HS and CS GAGs from biologically relevant PGs and intact tissue samples. This method was not, however, compatible with on-line MS detection.

5.2.5 GAG HILIC-CHIP-MS Based Analysis

In order to improve the chromatographic resolution of the LC-MS method for GAGs, we utilized a novel chip-based amide-HILIC system for negative ion LC-MS/MS of partially depolymerized heparin/HS and CS/DS GAGs. The chip-based trapping cartridges assisted in the removal of contaminating proteins, lipids, nucleic acids, and acidic non-GAG carbohydrates by focusing the analyte in the MS while allowing contaminants to flow through with minimum interaction with the stationary phase. We were able to achieve robust positioning of the spray needle and the analysis of GAGs isolated from complex biological and chemical samples (Staples et al. 2009). In this work, we noted a physical limitation for the analysis of highly sulfated (polar) GAG oligosaccharides such as HS dp10s, which start to elute when the source voltage is no longer able to maintain the electrospray.

To overcome this problem, we optimized the novel amide-HILIC HPLC CHIP platform by introducing a makeup flow (MUF) (Staples et al. 2010). The MUF chips allowed electrospray in high aqueous conditions during negative-ion mode LC-MS, thus eliminating the need to raise spray voltages as aqueous content increased. We used this chip for analysis of highly modified GAG domains involved in various biological processes. We were able to analyze dp10-dp14 HS and dp14-dp18 heparin oligosaccharides, which was not possible with standard amide-HILIC HPLC chips (Staples et al. 2010). The HILIC-Chip based platform was a unique platform for analysis of GAGs, but the ions observed were low in charge, resulting in undesirable sulfate loss from the precursor ion during collision-induced dissociation (CID). To overcome sulfate losses, metal cation adducts can be used to stabilize sulfate groups, or nonvolatile polar compounds such as sulfolane, which are used to supercharge proteins, can be added. We therefore utilized novel microfluidic pulsed makeup flow (MUF) HPLC-chips that enabled controlled application of additives during a given chromatographic window and thus reduced the buildup of nonvolatile additive in the ion source. Using these chips, tandem MS of the supercharged precursor ions showed a significant decrease in sulfate loss (Huang et al. 2011). We further worked to improve tandem mass spectrometry of GAGs by reducing sulfate loss and generating better product ion profiles (Bielik and Zaia 2011; Leymarie et al. 2012; Shi et al. 2012).

5.2.6 Tetraplex Stable Isotope-Coded Based Quantitative GAG Glycomics

We demonstrated an effective method that uses tetraplex stable isotope-labeled reductive amination tags for quantitative glycomics of chondroitin sulfate proteoglycans (CSPGs), pharmaceutical heparins, and N-glycans from glycoproteins. Labeled samples were analyzed on an online LC-MS platform, and tandem mass spectrometry was used for comparison of isomeric glycan fine structures from various samples. This method provided not only precise compositional profiling of GAGs but also fine structural detail, together with the multiplexing benefits needed for high throughput (Bowman and Zaia 2010).

5.2.7 GAG Disaccharide Analysis Using HILIC LC-MS

To generate complete disaccharide profiles for GAGs on a single LC-MS platform, we utilized HILIC-MS for quantification of both enzyme-derived and nitrous acid depolymerization products for structural analysis of HS and CS/DS GAGs (Gill et al. 2013). HILIC is one of the most widely used separation tools for glycans. It offers several advantages such as shorter sample preparation time, ultrafast analysis due to low column backpressure, and improved MS sensitivity. HILIC with online ESI-MS has been used widely for the analysis of released glycans (Luo et al. 2009; Mauko et al. 2011; Ruhaak et al. 2008; Zauner et al. 2011), glycopeptides (Calvano et al. 2008; Wohlgemuth et al. 2009; Zauner et al. 2010), and GAG oligosaccharides (Huang et al. 2011; Kailemia et al. 2014; Staples et al. 2010). For GAG disaccharide analysis, the challenge with HILIC is to find mobile phase conditions that achieve efficient retention of disaccharides containing a range of 0–4 sulfate groups. Generally speaking, chromatographic resolution is best for 2.1 and 1.0 mm internal diameter columns. It is typically necessary to use a tandem MS step to differentiate isomeric disaccharides that co-elute using HILIC (Gill et al. 2013).
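The need for a tandem MS step follows from simple mass arithmetic: disaccharides that differ only in the position of a sulfate group share an elemental composition and therefore the same precursor m/z. The sketch below illustrates this under stated assumptions. It models only O-sulfation of an N-acetylated Δ-unsaturated disaccharide with an assumed composition of C14H21NO11; N-sulfated disaccharides have a different backbone composition and are not covered. All function names are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727647  # proton mass in Da

# Assumed composition of an unsulfated, N-acetylated, delta-unsaturated
# disaccharide (uronic acid + N-acetylhexosamine). Illustrative only.
BASE_COMPOSITION = {"C": 14, "H": 21, "N": 1, "O": 11}


def composition(n_sulfates: int) -> dict:
    """Elemental composition after adding n O-sulfate groups (+SO3 each)."""
    if n_sulfates < 0:
        raise ValueError("n_sulfates must be non-negative")
    comp = dict(BASE_COMPOSITION)
    comp["S"] = n_sulfates
    comp["O"] += 3 * n_sulfates
    return comp


def monoisotopic_mass(comp: dict) -> float:
    """Neutral monoisotopic mass (Da) of an elemental composition."""
    return sum(MONO[element] * count for element, count in comp.items())


def mz_negative(n_sulfates: int, charge: int = 1) -> float:
    """m/z of the [M - zH]z- ion for a disaccharide with n sulfate groups."""
    if charge < 1:
        raise ValueError("charge is given as a positive count of lost protons")
    mass = monoisotopic_mass(composition(n_sulfates))
    return (mass - charge * PROTON) / charge


def same_precursor_mz(positions_a: set, positions_b: set) -> bool:
    """Positional isomers are isobaric when they carry the same number of sulfates."""
    return len(positions_a) == len(positions_b)


if __name__ == "__main__":
    for n in range(4):
        print(f"{n} sulfate(s): [M-H]- at m/z {mz_negative(n):.4f}")
    print("2-O vs 6-O sulfated isomers share m/z:", same_precursor_mz({"2-O"}, {"6-O"}))
```

A 2-O-sulfated and a 6-O-sulfated disaccharide therefore cannot be separated by precursor mass alone; if they also co-elute, only their fragment ions distinguish them.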

5.3 On-Slide Tissue Digestion Coupled with LC-MS/MS for Integrated Glycomics and Proteomics

We developed an on slide digestion platform in our lab that uses serial enzyme digestions from the surfaces of fresh frozen or fixed tissue sections (Raghunathan et al. 2019b; Shao et al. 2013a; Turiak et al. 2014). To understand the biological roles played by GAG and PG expression during pathogenesis, it is crucial to detect and profile GAGs and proteins at the histological scale to minimize cell heterogeneity and potentially inform diagnosis and prognosis. This method provided a readout of HA, CS, and HS GAG quantities, domain structures, and non-reducing end structures, as well as N-glycans and proteins, using a simple workflow of enzyme application and biomolecule extraction with minimal need for workup (Fig. 5.2). The method was able to quantify different biomolecules and perform integrated omics for tissue volumes of 10 nL or greater, corresponding to a 1 μL droplet of enzyme solution applied to a 1 mm diameter target on a 10 μm thick tissue slide. Staining or immunohistochemistry of parallel sections can be used to guide the selection of the target area on an unstained tissue section. This method provides a targeted approach to analyze a specific tissue area, for example, tumor vs. non-tumor or myelin vs. non-myelin, to uncover detailed structural profiles and to establish functional relationships that help explain the disease or normal pathology.
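The quoted tissue volume can be checked with a short calculation. The sketch below treats the digested region as a cylinder whose diameter is the target spot and whose height is the section thickness; that geometry and the function name are assumptions made for illustration, not part of the published method.

```python
import math


def tissue_volume_nl(spot_diameter_mm: float, section_thickness_um: float) -> float:
    """Volume in nanoliters of a cylindrical tissue spot.

    Assumes the enzyme droplet digests a cylinder of the given diameter
    through the full thickness of the section.
    """
    if spot_diameter_mm <= 0 or section_thickness_um <= 0:
        raise ValueError("dimensions must be positive")
    radius_mm = spot_diameter_mm / 2.0
    thickness_mm = section_thickness_um / 1000.0
    volume_mm3 = math.pi * radius_mm ** 2 * thickness_mm
    return volume_mm3 * 1000.0  # 1 mm^3 = 1 uL = 1000 nL


if __name__ == "__main__":
    for diameter in (1.0, 1.5):
        volume = tissue_volume_nl(diameter, 10.0)
        print(f"{diameter} mm spot on a 10 um section: {volume:.1f} nL")
```

For a 1 mm spot on a 10 μm section this gives roughly 8 nL, in line with the figure of about 10 nL quoted above.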

Fig. 5.2

Schematic representation of On slide tissue digestion workflow. Taken and modified from Raghunathan et al. (2019b)

Compared to in solution digestion (Ji et al. 2015; Wisniewski 2016), on slide digestion is more economical in terms of time required per sample. On slide digestion also requires less post-digestion cleanup prior to the LC-MS step. The LC step results in higher dynamic range of detection for GAGs and proteins than can be achieved using MALDI imaging mass spectrometry (IMS) (Raghunathan et al. 2019b; Shao et al. 2013a; Turiak et al. 2014). By contrast, MALDI-IMS has the advantage of higher tissue spatial resolution than the on slide digestion method (Drake et al. 2017, 2018a).

In 2013, we reported this method for comparative glycomics profiling of HS disaccharides from human astrocytoma and glioblastoma tissues (Shao et al. 2013a). Later, in 2014, we modified the technique to cover several compound classes (GAGs, N-glycans, and proteins/peptides) using bovine cortex and mouse brain tissue sections (Turiak et al. 2014). The data from a small 1.5 mm diameter tissue spot were consistent with previously published data from bulk mouse, liver, and brain tissue, demonstrating the power of our method. More recently, we reduced the number of processing steps by digesting HS disaccharides and N-glycans together (Raghunathan et al. 2019b).

We have applied this state-of-the-art platform to understand various brain pathologies, including glioblastoma (Shao et al. 2013a), aging (Raghunathan et al. 2018), schizophrenia (unpublished), and Parkinson’s disease (unpublished), and have uncovered several dysregulated GAGs, ECM related proteins and pathways.

5.4 In Solution Tissue Digestion for Integrated Proteomics and Glycomics

In the past, we have performed in solution tissue digestions to characterize GAGs and proteins, but not in a sequential or integrated omics manner (Jacobsen et al. 2019; Shao et al. 2013b). Recently, we developed a streamlined serial in solution protocol to analyze GAGs and proteins from the brain or other tissues (Fig. 5.3) (manuscript submitted). Compared to our on-slide digestion protocol, which allows selection of a target area on a tissue slide (Raghunathan et al. 2018, 2019b; Shao et al. 2013a; Turiak et al. 2014), and the MALDI-imaging method for glycans, which offers higher spatial resolution (Drake et al. 2017, 2018a, b), this method can be applied to free-floating or frozen tissues and provides a high depth of coverage. This platform is more rapid (time-effective) and efficient (single-pot) than the currently used parallel approach, i.e., a multi-pot simultaneous enzyme application method (Chen et al. 2017; Shao et al. 2013b; Turnbull et al. 2010). The removal of GAGs also facilitates protein identification of the remaining deglycosylated PGs with higher peptide coverage using conventional proteomics (Klein et al. 2018), compared to current studies that achieve only low PG coverage (Donovan et al. 2012; Hondius et al. 2016). The protocol follows a filter-aided sample preparation (FASP)-type (Wiśniewski et al. 2009) serial in-solution digestion that uses molecular weight cut-off (MWCO) membrane filters as reactors: the GAG classes, including HA, CS, and HS, are digested in turn and collected as flow-through, and the retained proteins are finally digested with trypsin to generate peptides from tissue or cell lysates. We have applied this workflow to mouse brain tissue and to healthy and Alzheimer’s human brain tissue.

Fig. 5.3

Schematic representation of In solution tissue digestion workflow

5.5 Deep Sequencing of Proteoglycans

Large and highly complex PGs that carry a high degree of glycosylation arising from GAGs, N-glycans, and mucin O-glycans yield poor peptide sequence coverage in conventional MS analysis and are therefore poorly annotated. Thus, little is known about the role of site-specific glycosylation of PGs in normal and disease pathologies. We developed a workflow (Fig. 5.4) to improve sequence coverage and identification of glycosylated peptides in biologically relevant PGs, including the small leucine-rich proteoglycan (SLRP) decorin and three hyalectan proteoglycans (neurocan, brevican, and aggrecan), as a step towards understanding their role in pathophysiology (Klein et al. 2018). Using the workflow, we were able to identify a linker-glycosite (created by removal of GAGs, which leaves a linker tetrasaccharide plus one disaccharide attached to the protein/peptide) and 3 N-glycosylation sites for decorin; a densely glycosylated mucin-like region in the extended domain for neurocan and brevican; and 50 linker-glycosites and mucin-type O-glycosites in the extended region, together with N-glycosites in the globular domains, for aggrecan. Many of these sites had not been previously identified or reported.

Fig. 5.4

Schematic representation of the workflow for enrichment of proteoglycan linker-peptides taken from Klein et al. (2018)

5.6 Conclusions

Over the past decade, glycoscientists around the world have created a vast pool of knowledge, glyco-databases, and glyco-technologies to characterize and analyze glycans, to define their structural composition, and to relate structure to biological function. Mass spectrometry has played a major role in determining the structural compositions of various biomolecules, and multiple disciplines, viz. genomics, transcriptomics, proteomics, glycomics, and glycoproteomics, have been integrated to address biologically relevant questions and understand biological systems. Towards this end, we have developed various platforms for an integrated omics approach to gain insights into development, disease, therapy, and regenerative medicine.

At this time, there are effective analytical methods for the combined analysis of GAGs and proteins from tissue samples. These methods employ digestion steps prior to electrospray mass spectral analysis. The first system developed employed SEC-MS, a system that is extremely robust but of limited sensitivity. The use of HILIC-MS allows effective profiling of GAG oligosaccharide mixtures and is the preferred method for disaccharide analysis. That HILIC-MS can be reduced in scale allows it to be used for detection of GAGs released using on slide digestion. We have analyzed tissue cohorts of several dozen samples using this approach. In order to improve the depth and sensitivity of PG coverage, we optimized in solution enrichment and digestion protocols. Looking ahead, there is no barrier to quantitative profiling of GAGs and proteins from tissue. Analytical throughput would be improved by application of robotic automation. The use of robotics may also reduce the volume of tissue required for analysis.