Abstract
Background
Glycosylation is an enzyme-catalyzed post-translational modification that is distinct from glycation and is present on a majority of plasma proteins. N-glycosylation occurs on asparagine residues predominantly within canonical N-glycosylation motifs (Asn-X-Ser/Thr), although non-canonical N-glycosylation motifs (Asn-X-Cys/Val) have also been reported. Albumin is the most abundant protein in plasma, and its glycation is well studied in diabetes mellitus. However, albumin has long been considered a non-glycosylated protein owing to the absence of canonical motifs. Albumin contains two non-canonical N-glycosylation motifs, of which one was recently reported to be glycosylated.
Methods
To investigate N-linked glycosylation of abundant serum proteins, we enriched these proteins, digested them with trypsin and enriched glycopeptides by size-exclusion or mixed-mode anion-exchange chromatography. Glycosylation at canonical as well as non-canonical sites was evaluated by liquid chromatography–tandem mass spectrometry (LC–MS/MS) of enriched glycopeptides. Deglycosylation analysis was performed to confirm N-linked glycosylation at non-canonical sites. Albumin-derived glycopeptides were fragmented by MS3 to confirm attached glycans. Parallel reaction monitoring was carried out on twenty additional samples to validate these findings. Bovine and rabbit albumin-derived glycopeptides were similarly analyzed by LC–MS/MS.
Results
Human albumin is N-glycosylated at two non-canonical sites, Asn68 and Asn123. N-glycopeptides were detected at both sites bearing four complex sialylated glycans and validated by MS3-based fragmentation and deglycosylation studies. Targeted mass spectrometry confirmed glycosylation in twenty additional donor samples. Finally, the highly conserved Asn123 in bovine and rabbit serum albumin was also found to be glycosylated.
Conclusions
Albumin is a glycoprotein with conserved N-linked glycosylation sites that could have potential clinical applications.
Background
Glycosylation is the commonest post-translational modification (PTM) of proteins [1]. It is distinct from glycation, a non-enzymatic process of protein modification by the addition of sugars on a background of hyperglycemia. Glycation affects a number of plasma proteins including albumin, haptoglobin and fibrinogen and is associated with microvascular damage and organ dysfunction in advanced diabetes [2]. By contrast, glycosylation is an enzyme-catalyzed physiological process which occurs on specific amino acids and is essential for protein stability, folding and function [3]. N-linked glycosylation is the most complex form of protein glycosylation in humans, where oligosaccharide chains or glycans are covalently attached to proteins at asparagine (Asn) residues by an N-glycosidic bond [1]. Most secretory and plasma proteins are N-glycosylated at asparagines in a canonical motif in the primary amino acid sequence, Asn-X-Ser/Thr, where X is any amino acid except proline [4]. The hydroxyl group in the side chain of serine or threonine performs the hydrogen bond donor function that is necessary for the catalytic transfer of the N-glycan to asparagine [5]. However, the presence of this motif is not sufficient for, and does not always result in, glycosylation. It is estimated that only ~ 70% of such sites are glycosylated [4]. Further, N-glycosylation sites are occupied by glycans to different levels, defining glycosylation macroheterogeneity [6]. Besides the canonical motif, N-glycosylation occurs on asparagines within the non-canonical motif Asn-X-Cys of some proteins, with the sulfhydryl group of cysteine performing the hydrogen bond donor function. However, the sulfur on cysteine has less electronegativity than oxygen on the side chains of serine or threonine [7]. As a result, this motif is known to be glycosylated at low levels in several proteins including transferrin and von Willebrand Factor [8, 9]. 
Another non-canonical motif, Asn-X-Val, has been shown to be glycosylated to low levels in some proteins including alpha-1B-glycoprotein and apolipoprotein B-100 [10, 11].
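The motif definitions above lend themselves to a simple sequence scan. The following is a minimal sketch, not part of the original analysis: the function name `find_sequons` is hypothetical, and applying the "X is not proline" rule to the non-canonical motifs is an assumption made here for illustration. The two test sequences are the albumin tryptic peptides reported in the Results of this article.

```python
import re

# Zero-width lookahead so that overlapping motifs are all reported.
# Canonical motif: Asn-X-Ser/Thr with X != Pro.
# Non-canonical motifs: Asn-X-Cys and Asn-X-Val.
# Assumption: X != Pro is also applied to the non-canonical motifs.
SEQUON = re.compile(r"(?=N[^P]([STCV]))")

def find_sequons(sequence):
    """Return (1-based Asn position, three-residue motif, class) tuples."""
    hits = []
    for match in SEQUON.finditer(sequence):
        start = match.start()
        third = match.group(1)
        kind = "canonical" if third in "ST" else "non-canonical"
        hits.append((start + 1, sequence[start:start + 3], kind))
    return hits

# Positions are relative to the peptide, not to full-length albumin.
for peptide in ("LVNEVTEFAK", "QEPERNECFLQHK"):
    print(peptide, find_sequons(peptide))
```

Run on a full protein sequence, such a scan only lists candidate sites; as noted above, the presence of a motif does not by itself imply that the site is occupied.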
Mass spectrometry (MS)-based analysis of deglycosylated peptides has historically played an important role in the identification of glycoproteins and their sites of N-glycosylation [12]. Advancements in MS technology over the past several years coupled with the development of appropriate database search tools have facilitated comprehensive glycopeptide profiling with identification of intact glycans and their sites of attachment [11]. We sought to deploy advanced MS methods to discover and characterize glycosylation events that might have been missed previously because of low abundance or because they occurred at non-canonical motifs. Among abundant plasma proteins, such motifs, i.e., Asn-X-Cys or Asn-X-Val, are present in alpha-2-macroglobulin, alpha-1-acid glycoprotein 2, transferrin, immunoglobulin heavy chains, and albumin [13]. Albumin is the most abundant plasma protein and, besides maintaining the colloidal osmotic pressure of plasma, it functions as a transporter, antioxidant and enzyme [14]. It has been considered a non-glycosylated protein because it does not contain a canonical motif in its amino acid sequence. However, asparagines at sites Asn68 and Asn123 are part of non-canonical N-glycosylation motifs Asn-X(Glu)-Val and Asn-X(Glu)-Cys, respectively [13]. We wondered whether albumin is glycosylated at these sites at levels that might not be detected by traditional methods of glycoprotein analysis [15]. Recently, one of these sites, i.e., Asn68, was reported to be linked to two glycans (Hex5HexNAc4NeuAc2 and Hex5HexNAc4NeuAc1) based on MS/MS fragmentation data [10]. In our experience with the analysis of plasma and serum-derived glycopeptides enriched using alternate methods, we observe a greater degree of glycan microheterogeneity in glycopeptides derived from abundant plasma proteins [11]. We therefore asked whether Asn68 is occupied by a larger glycan repertoire and whether Asn123 is also glycosylated. 
Thus, we systematically investigated N-linked glycosylation of albumin in serum from volunteer donors using a multi-pronged approach.
Methods
Samples
Twenty-three serum samples used in this study were deidentified residual samples from volunteer donors (approved by Mayo Clinic IRB: 21-012890).
LC–MS/MS-based discovery analysis of serum-derived glycopeptides
Serum samples from volunteer donors were first enriched for 14 abundant serum proteins and digested with trypsin. Glycopeptides were enriched from the peptide mixture using either size-exclusion chromatography or a mixed-mode anion exchange (MAX) cartridge, and analyzed by mass spectrometry (MS) in data-dependent acquisition mode on an Orbitrap Eclipse mass spectrometer (Thermo Fisher Scientific) [11, 16, 17]. Data were searched using pGlyco3 [18]. Commercial bovine (Thermo Scientific) and rabbit (Sigma) serum albumin were digested with trypsin, followed by glycopeptide enrichment using MAX. Details of sample preparation and MS analysis are provided in Additional file 1: Supplemental Methods.
Mapping N-glycosylation sites onto structure of albumin
The crystal structure of human albumin derived from pooled human plasma with the identifier 1AO6 [19] was obtained from the PDB [20] and visualized using PyMOL (v2.5.7) [21]. N-linked glycosylation site Asn68 was highlighted in red. The structure was rotated by 90° to visualize the other glycosylation site, Asn123, which was also highlighted in red.
Deglycosylation analysis of serum glycoproteins
Glycopeptides from serum proteins enriched by MAX were treated overnight with PNGase F (N-Zyme Scientifics) in either 16O or 18O water (97% 18O enriched, Sigma) at 37 °C. Deglycosylated peptides were analyzed by MS in parallel reaction monitoring mode as described in the Additional file 1: Supplemental Methods. Spectral inspection and peak identification were done manually.
MS3 analysis of glycopeptides
Albumin was immunoprecipitated from pooled serum samples using anti-albumin antibody (Invitrogen) followed by trypsin digestion and MAX-enrichment of glycopeptides. Selected glycopeptides were analyzed in the MS3 mode on an Orbitrap Eclipse mass spectrometer. Precursor ions were detected in the Orbitrap at a resolution of 120,000 with a scan range of 800 to 1500 m/z. Precursor ions were selected and fragmented in the ion-trap using collision-induced dissociation (CID). Fragment ions were detected in the ion-trap, and selected fragment ions for each precursor were further fragmented using higher-energy collisional dissociation (HCD). Data analysis and fragment annotation in MS2 and MS3 spectra were done manually. See Additional file 1: Supplemental Methods for details.
Targeted LC–MS/MS analysis
Glycopeptides derived from 20 volunteer donor serum samples were analyzed in targeted mode on an Orbitrap Exploris 480 mass spectrometer (Thermo Fisher Scientific) coupled to an Ultimate 3000 liquid chromatography system. The inclusion list consisted of precursor ions for all the detected albumin glycopeptides. Data were analyzed using Skyline (v 22.2) [22]. Details are described in the Additional file 1: Supplemental Methods.
Results
We employed a rigorous multi-step LC–MS/MS approach to detect and confirm N-glycosylation at the two non-canonical sites of albumin along with attached glycans. First, we performed deep discovery analysis using donor serum samples to identify intact glycopeptides with sites Asn68 and Asn123. We then confirmed our findings using streamlined enrichment methods, targeted LC–MS/MS analysis of 18O-labeled deglycosylated peptides as well as MS3 analysis of intact glycopeptides. These findings were validated in serum samples from twenty additional donors by targeted glycopeptide detection. Further, we show that the highly conserved glycosylation motif at Asn123 is also glycosylated in bovine and rabbit serum albumin.
A novel N-linked glycosylation site on albumin
For initial discovery, we analyzed serum from three volunteer donors using previously described glycoproteomic profiling methods [11]. First, we reduced the complexity of the serum glycoproteome by enriching the most abundant serum proteins using the Human 14 Multiple Affinity Removal (MARS 14) column prior to trypsin digestion. Second, we enriched glycopeptides from peptide mixtures using size-exclusion chromatography (SEC). Eight fractions from SEC were analyzed using LC–MS/MS-based discovery pipeline [11] (Fig. 1A). The resulting data were searched using pGlyco3 for glycopeptide identification [18]. The search was performed against the UniProt human proteome database and the in-built human N-glycan database [13]. On average, 1933 glycopeptides were detected in the three samples. The most abundant glycopeptides were from abundant serum glycoproteins including haptoglobin, alpha-1-acid glycoprotein, immunoglobulin heavy chain and complement C3. These proteins accounted for > 80% of the glycopeptide precursor peak areas. N-glycopeptides from albumin were detected with glycosylation at both sites Asn68 (LVN68EVTEFAK) and Asn123 (QEPERN123ECFLQHK, which contains a missed tryptic cleavage site N-terminal to the site of glycosylation). To our knowledge, this is the first report of N-glycosylation at Asn123 of albumin. At both sites, complex sialylated N-glycans with the following compositions were identified: Hex5HexNAc4NeuAc2, Hex5HexNAc4NeuAc1, Hex5HexNAc4NeuAc2Fuc1 and Hex4HexNAc3NeuAc1 (Fig. 1B). Two of these glycans, Hex5HexNAc4NeuAc2Fuc1 and Hex4HexNAc3NeuAc1 have not been reported previously on Asn68. To our surprise, albumin-derived glycopeptides accounted for < 1% of the total intensity of glycopeptides derived from abundant serum proteins even though albumin is the most abundant serum protein. The relative contribution of individual glycoproteins enriched by MARS 14 to total glycopeptide intensity from these samples is shown in Fig. 1C. 
We were curious to observe the relationship between the abundance of these proteins and the abundance of corresponding glycopeptides. For comparison, we used protein-level data reported by Geyer et al., 2016, to plot the relative intensities of the same proteins from plasma samples [23]. As shown in Fig. 1C, though albumin accounted for 36% total peptide share among these proteins, it only contributed 1% of the glycopeptide signal. Because N-glycosylation occurs more commonly on exposed regions of proteins as compared to internal, more buried regions [24], we examined the location of both glycosylation sites in the three-dimensional structure of albumin. We visualized the crystal structure of albumin from Protein Data Bank and mapped the two N-glycosylation sites [20]. As shown in Fig. 1D, both Asn68 and Asn123 are located on the surface of the structure of albumin.
Next, we tested an alternate strategy for glycopeptide enrichment for analysis by single MS runs. Peptides from MARS 14-enriched proteins were subjected to glycopeptide enrichment using MAX [17]. LC–MS/MS analysis of enriched samples as a single fraction led to the identification of 409 glycopeptides in each sample on average. With this method as well, the most abundant serum glycoproteins described above accounted for > 80% of the glycopeptide precursor peak areas. Glycosylation at both non-canonical glycosylation sites of albumin, i.e., Asn68 and Asn123, was also detected in all three samples following MAX-enrichment. However, both sites were detected with only two glycans (Hex5HexNAc4NeuAc2, Hex5HexNAc4NeuAc1) using this method (Additional file 2: Fig. S1). Glycopeptides detected from SEC- and MAX-enriched samples are listed in Additional file 3: Tables S1 and S2, respectively.
Relative abundance of N-glycans on Asn68 and Asn123
To determine the relative abundance of the glycopeptides identified from each site, we compared the peak intensity of precursor ions of the glycopeptides detected at each site in the SEC-based experiment. Glycopeptides with glycan compositions Hex5HexNAc4NeuAc2 and Hex5HexNAc4NeuAc1 were the most abundantly detected glycopeptides at both sites (Fig. 2A and B). For all glycopeptides mapped to albumin, MS/MS spectra were manually verified for evidence of oxonium ions including signature ions of sialic acid, peptide backbone ions with attached glycan fragments (Y ions) as well as fragments of the naked peptide (b and y ions). Annotated MS/MS spectra for glycopeptides from both sites are shown in Fig. 2C and D and in Additional file 2: Fig. S2A–F. These data confidently identify both Asn68 and Asn123 as N-glycosylation sites while also describing the microheterogeneity at each site.
Confirmation of N-linked glycosylation sites
Next, we sought to confirm N-glycosylation at sites Asn68 and Asn123 of albumin by analyzing enzymatically deglycosylated peptides. Serum proteins were digested using trypsin and glycopeptides were enriched using a MAX column. Glycopeptides were treated with PNGase F using either 16O or 18O-labeled water. Deglycosylated peptides were identified considering the mass shift expected after enzymatic removal of the N-glycan, which is accompanied by the conversion of asparagine (Asn) to aspartic acid (Asp) [25]. Deglycosylated Asn residues were identified with conversion to Asp showing a mass difference of 0.98 Da in case of 16O incorporation and 2.98 Da in case of 18O incorporation.
The non-glycosylated peptide with Asn68 (LVN68EVTEFAK) was identified with a charge state of +2 and an m/z of 575.31. Upon treatment with PNGase F in 16O water, we detected the deglycosylated form of the formerly N-glycosylated peptide with a mass shift of 0.98 Da or 0.5 m/z (LVD68EVTEFAK, m/z of 575.80, Fig. 3A). In samples treated with PNGase F in 18O-labeled water, we observed a mass shift of 2.98 Da or 1.5 m/z, corresponding to the deglycosylated peptide (LVD*68EVTEFAK, m/z of 576.81, Fig. 3B). The partial overlap of peaks from the 16O-labeled peptides with the 18O-labeled peptides is explained by the natural abundance of isotopes and purity of 18O-labeled water used [26] (Fig. 3B). This analysis demonstrates enzymatic deglycosylation of Asn68, conclusively showing albumin glycosylation at this site.
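The arithmetic behind these m/z values can be reproduced from standard monoisotopic masses. The sketch below is illustrative only and is not the authors' software: the function names are hypothetical, and the mass constants are standard monoisotopic values assumed here rather than taken from the article. It computes the doubly charged m/z of LVNEVTEFAK and the shifts expected when PNGase F converts Asn to Asp in 16O or 18O water.

```python
# Standard monoisotopic masses in Da (assumed values, not from the article).
PROTON = 1.007276
WATER = 18.010565
RESIDUE = {
    "L": 113.08406, "V": 99.06841, "N": 114.04293, "E": 129.04259,
    "T": 101.04768, "F": 147.06841, "A": 71.03711, "K": 128.09496,
}

# Asn -> Asp replaces the side-chain NH2 with OH: +O -N -H = +0.984016 Da.
DEAMIDATION_16O = 15.994915 - 14.003074 - 1.007825
# In 18O water the incorporated oxygen is 18O instead of 16O.
DEAMIDATION_18O = DEAMIDATION_16O + (17.999160 - 15.994915)

def peptide_mz(sequence, charge, mass_shift=0.0):
    """m/z of a protonated peptide, with an optional modification mass shift."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER + mass_shift
    return (neutral + charge * PROTON) / charge

native = peptide_mz("LVNEVTEFAK", 2)
asp_16o = peptide_mz("LVNEVTEFAK", 2, DEAMIDATION_16O)
asp_18o = peptide_mz("LVNEVTEFAK", 2, DEAMIDATION_18O)
print(f"{native:.2f} {asp_16o:.2f} {asp_18o:.2f}")
```

Because the mass shift is divided by the charge, the same 0.98 Da and 2.98 Da shifts appear as roughly 0.5 and 1.5 m/z at charge +2, and as roughly 0.33 and 1.0 m/z at charge +3.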
Similarly, we detected the non-glycosylated peptide containing Asn123 (QEPERN123ECFLQHK) with a charge state of +3 and m/z of 572.27. Upon treatment with PNGase F in 16O water, we identified the deglycosylated form of the peptide (QEPERD123ECFLQHK, m/z of 572.60) as depicted in Fig. 3C. With 18O incorporation, we observed the deamidated form QEPERD*123ECFLQHK at an m/z of 573.25 (Fig. 3D). This confirms glycosylation at Asn123.
Confirmation of albumin glycopeptides by MS3 fragmentation
To further enhance the confidence in the identification of albumin-derived glycopeptides, we performed MS3 analysis using an Orbitrap Eclipse Tribrid mass spectrometer which incorporates a high-sensitivity ion-trap detector. Albumin was immunoprecipitated from pooled serum and glycopeptides were enriched by MAX. Precursor ions corresponding to four albumin-derived glycopeptides were isolated and fragmented using collision-induced dissociation (CID) followed by their detection in the ion-trap. At low collision energy, glycosidic bonds were expected to break forming ions consisting of the peptide backbone carrying glycan fragments (Y ions). Selected Y ions were fragmented at the MS3 level using higher-energy collisional dissociation (HCD) followed by detection in the ion-trap. MS3 fragmentation produced glycan oxonium ions confirming the presence of glycopeptides, as well as further fragments of the Y ions. The resulting spectra were manually inspected and annotated (Fig. 4).
The precursor ions selected included the two most abundant glycopeptides at each glycosylation site, i.e., LVN68EVTEFAK with Hex5HexNAc4NeuAc1 (m/z = 1021.7, charge state +3), LVN68EVTEFAK with Hex5HexNAc4NeuAc2 (m/z = 1118.8, charge state +3), QEPERN123ECFLQHK with Hex5HexNAc4NeuAc1 (m/z = 907.8, charge state +4), and QEPERN123ECFLQHK with Hex5HexNAc4NeuAc2 (m/z = 980.6, charge state +4). As expected, prominent product ions generated from low energy CID fragmentation at MS2 level were glycopeptide Y ions (Fig. 4). Notably, we also detected singly charged oxonium ions (albeit with lower intensity) at m/z values of 274.0 (NeuAc with water loss), 292.1 (NeuAc), 366.1 (HexNAc and Hex), and 657.2 (HexNAc, Hex, and NeuAc), further confirming the presence of glycopeptides (as depicted in Fig. 4). Subsequently, fragment Y ions for each precursor ion generated at the MS/MS level underwent further fragmentation via HCD, yielding diagnostic MS3 fragment ions. The ion series with the serial loss of single monosaccharide residues validated the glycan composition of these glycopeptides. Further, the glycan oxonium ions at the MS3 level were detected with higher intensities, confirming the presence of glycopeptides. Spectra resulting upon fragmentation of precursor ions with m/z of 1118.8 (charge state +3) and 980.6 (charge state +4) with peptide sequence and glycan composition mentioned above are shown in Fig. 4A and 4B respectively.
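The precursor and oxonium ion m/z values listed above can be checked against theoretical values. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the masses are standard monoisotopic residue masses, and cysteine is assumed to be carbamidomethylated (alkylation with iodoacetamide is assumed here, not stated in this section).

```python
import re

PROTON = 1.007276
WATER = 18.010565
CARBAMIDOMETHYL = 57.02146  # assumed Cys alkylation, Da

# Monoisotopic amino acid residue masses, Da (only those needed here).
RESIDUE = {
    "L": 113.08406, "V": 99.06841, "N": 114.04293, "E": 129.04259,
    "T": 101.04768, "F": 147.06841, "A": 71.03711, "K": 128.09496,
    "Q": 128.05858, "P": 97.05276, "R": 156.10111, "C": 103.00919,
    "H": 137.05891,
}

# Monoisotopic monosaccharide residue masses, Da.
SUGAR = {
    "Hex": 162.05282, "HexNAc": 203.07937, "NeuAc": 291.09542,
    "NeuGc": 307.09033, "Fuc": 146.05791,
}

def peptide_mass(sequence, alkylated_cys=True):
    """Neutral monoisotopic mass of an unmodified or Cys-alkylated peptide."""
    mass = sum(RESIDUE[aa] for aa in sequence) + WATER
    if alkylated_cys:
        mass += CARBAMIDOMETHYL * sequence.count("C")
    return mass

def glycan_mass(composition):
    """Sum of residue masses for a composition such as 'Hex5HexNAc4NeuAc2'."""
    # 'HexNAc' is listed before 'Hex' so the longer name is tried first.
    parts = re.findall(r"(HexNAc|Hex|NeuAc|NeuGc|Fuc)(\d+)", composition)
    return sum(SUGAR[name] * int(count) for name, count in parts)

def glycopeptide_mz(sequence, composition, charge):
    """m/z of a protonated N-glycopeptide (peptide plus glycan residues)."""
    neutral = peptide_mass(sequence) + glycan_mass(composition)
    return (neutral + charge * PROTON) / charge

def oxonium_mz(composition, water_loss=False):
    """m/z of a singly charged oxonium ion, optionally after loss of water."""
    mz = glycan_mass(composition) + PROTON
    return mz - WATER if water_loss else mz

targets = [
    ("LVNEVTEFAK", "Hex5HexNAc4NeuAc1", 3),
    ("LVNEVTEFAK", "Hex5HexNAc4NeuAc2", 3),
    ("QEPERNECFLQHK", "Hex5HexNAc4NeuAc1", 4),
    ("QEPERNECFLQHK", "Hex5HexNAc4NeuAc2", 4),
]
for sequence, composition, charge in targets:
    print(sequence, composition, round(glycopeptide_mz(sequence, composition, charge), 2))
```

Under these assumptions the theoretical values agree with the m/z values reported above to within about 0.1 m/z, which is in line with values quoted to one decimal place.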
Albumin glycosylation in a larger cohort of volunteer donors
To assess if glycosylation of albumin is a general phenomenon and validate our findings, we analyzed serum samples from twenty volunteer donors by targeted MS. Eight albumin-derived glycopeptides identified in the discovery experiment were targeted, i.e., glycopeptides with sequences LVN68EVTEFAK and QEPERN123ECFLQHK, each bearing one of four glycans, Hex5HexNAc4NeuAc2, Hex5HexNAc4NeuAc1, Hex5HexNAc4NeuAc2Fuc1 and Hex4HexNAc3NeuAc1. MAX-enriched N-glycopeptides from serum proteins were analyzed by parallel reaction monitoring-mass spectrometry (PRM-MS). In all the twenty individuals that were tested, we detected glycosylation at both Asn68 and Asn123 of albumin. The heterogeneity in the overall glycopeptide complement detected among the individuals is shown in Table S3 (Additional file 3).
Albumin glycosylation in other species
Because albumin is a highly conserved protein, we asked whether its orthologs in other mammalian species are also glycosylated. Examining the amino acid sequences of albumin orthologs from cow, rabbit, dog and mouse revealed that only albumin from mouse has canonical Asn-X-Ser/Thr motifs, but without annotation for N-linked glycosylation on UniProt [13]. However, these orthologs have multiple non-canonical N-glycosylation motifs. Multiple sequence alignment showed that the non-canonical motif Asn123-Glu-Cys is highly conserved, whereas Asn68 is not evolutionarily conserved, either as a glycosylation site or as an amino acid (Fig. 5A). Therefore, to test whether the conserved Asn123 site is glycosylated in other species, we analyzed bovine serum albumin (BSA) and rabbit serum albumin, which are commonly used in molecular biology and MS applications. Commercially available BSA was digested using trypsin, glycopeptides were enriched using MAX, and the enriched fraction was analyzed by LC–MS/MS for glycopeptide discovery. Database searching for glycopeptides was done using pGlyco3 with the UniProt bovine proteome database for peptide sequences. As bovine N-glycans are similar in composition to human N-glycans except for the presence of an additional sialic acid (N-glycolylneuraminic acid or NeuGc) which is also present in mouse, we used the in-built mouse N-glycan database for this search [27]. We detected BSA-derived glycopeptides with Asn123 glycosylated by three complex sialylated glycans, i.e., Hex5HexNAc4NeuGc1, Hex5HexNAc4NeuAc1 and Hex5HexNAc4NeuAc1NeuGc1 (Fig. 5, Additional file 2: Fig. S3A and S3B respectively). Interestingly, besides glycosylation at the conserved site Asn123, we also detected glycopeptides from BSA with glycosylation at Asn185 with two glycans, Hex5HexNAc4NeuAc2 and Hex5HexNAc4NeuAc1NeuGc1 (Additional file 2: Fig. S3C and S3D respectively). However, this non-canonical glycosylation site, which is in the motif Asn185-Gly-Val, is not conserved across the species listed above. 
Additionally, in a separate experiment performed identically but with commercially available rabbit serum albumin and searched against the rabbit proteome and mouse N-glycan database, the conserved non-canonical N-glycosylation site Asn123 was also detected with two complex sialylated N-glycans, i.e., Hex5HexNAc4NeuAc2 and Hex5HexNAc4NeuAc1 (Fig. 5C; Additional file 2: Fig. S3E respectively). Overall, these data provide evidence for the glycosylation of albumin at the conserved non-canonical N-glycosylation site orthologous to Asn123 of human albumin in two additional mammalian species. Glycopeptides detected in bovine and rabbit serum albumin are listed in Additional file 3: Tables S4 and S5 respectively.
Discussion
Although most abundant serum proteins are glycoproteins, albumin itself has been considered a notable exception until recently [10]. Through discovery analysis and rigorous testing using different enrichment strategies [11, 15] and high-resolution LC–MS/MS methods, we report a novel N-glycosylation site on albumin (Asn123) and expand the glycan heterogeneity on another site (Asn68). Effective enrichment strategies are key to MS-based identification of glycopeptides owing to glycan heterogeneity [15, 28]. In the discovery experiments, SEC, which is based on physical properties and used here as a method for simultaneous enrichment and fractionation, resulted in identification of more than four times as many glycopeptides as the single MS runs after MAX-based enrichment (1933 versus 409 on average). Albumin glycopeptides at sites Asn68 and Asn123 were identified by both methods. Interestingly, though albumin is the most abundant plasma protein, glycopeptides from albumin accounted for < 1% of identified glycopeptide precursor peak areas, indicating low site occupancy (Fig. 1C). This follows our expectation based on previous reports on other proteins that non-canonical N-glycosylation motifs have lower stoichiometry of glycosylation [8, 9]. We also show that Asn123, which occurs within a highly conserved Asn-Glu-Cys motif, is glycosylated in bovine and rabbit serum albumin as well. In the case of BSA, we detected two glycopeptides containing NeuGc, a sialic acid that is not present in humans because the gene encoding an essential synthetic enzyme, cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMAH), is inactive in humans [29]. Though BSA is routinely used as a tool for quality control for MS, we believe that its glycosylation has generally been missed previously because the lack of a consensus N-glycosylation motif gave no reason to suspect it.
Physiologically, albumin is involved in several functions including binding and transportation of molecules such as fatty acids, hormones, drugs, vitamins and metal ions [30, 31]. These ligand-binding and antioxidant functions of albumin are influenced by its various post-translational modifications (PTMs) [30] including cysteinylation, oxidation and nitrosylation [31]. Additionally, glycation is present at 20–30% in circulating albumin in hyperglycemic individuals, and this modification alters its binding properties [2, 32]. Traditional methods of protein analysis, e.g., isoelectric focusing (IEF) and two-dimensional gel electrophoresis (2DE), have not raised any recorded suspicion of albumin glycosylation, even though some such studies report separation of albumin into fractions based on isoelectric point [33]. In light of the current report, we wonder if the smears and unexplained spots annotated for albumin in IEF and 2DE experiments may be explained, at least in part, by albumin N-glycoforms [33, 34]. Additional studies may determine functional effects of glycosylation on the ligand-binding and antioxidant properties of albumin, along with its susceptibility to undergo other PTMs [35]. For example, Cys125, which is the C-terminal amino acid in the motif that Asn123 is part of (Asn123-Glu124-Cys125), participates in the formation of a disulfide bridge in the secondary structure of albumin [36]. It has been previously shown that the degree of glycosylation at sites in Asn-X-Cys motifs is likely related to the rate of translation as well as the rate of disulfide bond formation [7]. Hence, the rate of glycosylation at Asn123 may be altered in states such as liver disease and metabolic syndrome where liver function is affected [37].
Conclusions
To conclude, we report that albumin is a glycoprotein with multiple N-linked glycoforms at two non-canonical sites. As these findings are discordant with the long-held notion that albumin is a non-glycosylated protein, we confirmed them by multiple additional lines of investigation. Serum albumin level is used as a marker for several diseases including renal, hepatic and cardiovascular disorders [38]. Pathological modifications of albumin including glycation and cysteinylation are also associated with diabetes and liver disease [39]. In fact, glycated albumin has been shown to complement glycated hemoglobin as a marker of prediabetes [40]. Given this importance of albumin in clinical practice, glycosylated albumin could also have clinical significance. Indeed, we have recently found reduced levels of the glycopeptide bearing Hex5HexNAc4NeuAc1 at Asn123 in patients with a congenital disorder of glycosylation (CDG) [41]. This indicates that glycosylation events on albumin could potentially be of diagnostic or other clinical use. Future studies may determine the exact role of glycosylation of albumin and how it is altered in other diseases associated with altered protein glycosylation. Our findings challenge the prevailing paradigm by showing that albumin is a glycosylated protein, and they may expand our understanding of its structure and function, and its clinical and biochemical applications.
Availability of data and materials
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [43] partner repository with the dataset identifier PXD047863.
Abbreviations
- MARS 14: Multiple affinity removal spin cartridge human-14
- SEC: Size-exclusion chromatography
- MAX: Mixed-mode anion exchange
- LC–MS/MS: Liquid chromatography–tandem mass spectrometry
- PRM: Parallel reaction monitoring
- Asnx: Asparagine at amino acid site x in a polypeptide sequence
- PTM: Post-translational modification
- MS: Mass spectrometry
- TEABC: Triethylammonium bicarbonate
- DTT: Dithiothreitol
- IAA: Iodoacetamide
- TFA: Trifluoroacetic acid
- FA: Formic acid
- ACN: Acetonitrile
- DDA: Data-dependent acquisition
- AGC: Automatic gain control
- HCD: Higher-energy collisional dissociation
- FDR: False-discovery rate
- MS/MS: Tandem mass spectrometry
- BSA: Bovine serum albumin
- PDB: Protein Data Bank
- PBS: Phosphate-buffered saline
- CID: Collision-induced dissociation
- Hex: Hexose
- HexNAc: N-Acetylhexosamine
- NeuAc: N-Acetylneuraminic acid
- Fuc: Fucose
- NeuGc: N-Glycolylneuraminic acid
- SNFG: Symbol Nomenclature for Glycans, used for graphical representations of glycans [42]
References
Spiro RG. Protein glycosylation: nature, distribution, enzymatic formation, and disease implications of glycopeptide bonds. Glycobiology. 2002;12(4):43R-56R.
Rondeau P, Bourdon E. The glycation of albumin: structural and functional impacts. Biochimie. 2011;93(4):645–58.
Schjoldager KT, Narimatsu Y, Joshi HJ, Clausen H. Global view of human protein glycosylation pathways and functions. Nat Rev Mol Cell Biol. 2020;21(12):729–49.
Stanley P, Moremen KW, Lewis NE, Taniguchi N, Aebi M. N-Glycans. In: Varki A, Cummings RD, Esko JD, Stanley P, Hart GW, Aebi M, et al., editors. Essentials of Glycobiology. 4th ed. Cold Spring Harbor (NY) 2022, 103–16.
Bause E, Legler G. The role of the hydroxy amino acid in the triplet sequence Asn-Xaa-Thr(Ser) for the N-glycosylation step during glycoprotein biosynthesis. Biochem J. 1981;195(3):639–44.
Hulsmeier AJ, Tobler M, Burda P, Hennet T. Glycosylation site occupancy in health, congenital disorder of glycosylation and fatty liver disease. Sci Rep. 2016;6:33927.
Lowenthal MS, Davis KS, Formolo T, Kilpatrick LE, Phinney KW. Identification of novel N-glycosylation sites at noncanonical protein consensus motifs. J Proteome Res. 2016;15(7):2087–101.
Canis K, McKinnon TA, Nowak A, Haslam SM, Panico M, Morris HR, et al. Mapping the N-glycome of human von Willebrand factor. Biochem J. 2012;447(2):217–28.
Satomi Y, Shimonishi Y, Takao T. N-glycosylation at Asn(491) in the Asn-Xaa-Cys motif of human transferrin. FEBS Lett. 2004;576(1–2):51–6.
Sun S, Hu Y, Jia L, Eshghi ST, Liu Y, Shah P, et al. Site-specific profiling of serum glycoproteins using N-linked glycan and glycosite analysis revealing atypical N-glycosylation sites on albumin and alpha-1B-glycoprotein. Anal Chem. 2018;90(10):6292–9.
Saraswat M, Garapati K, Mun DG, Pandey A. Extensive heterogeneity of glycopeptides in plasma revealed by deep glycoproteomic analysis using size-exclusion chromatography. Mol Omics. 2021;17(6):939–47.
Zielinska DF, Gnad F, Wisniewski JR, Mann M. Precision mapping of an in vivo N-glycoproteome reveals rigid topological and sequence constraints. Cell. 2010;141(5):897–907.
UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 2021;49(D1):D480–9.
Quinlan GJ, Martin GS, Evans TW. Albumin: biochemical properties and therapeutic potential. Hepatology. 2005;41(6):1211–9.
Riley NM, Bertozzi CR, Pitteri SJ. A pragmatic guide to enrichment strategies for mass spectrometry-based glycoproteomics. Mol Cell Proteomics. 2021;20: 100029.
Budhraja R, Saraswat M, De Graef D, Ranatunga W, Ramarajan MG, Mousa J, et al. N-glycoproteomics reveals distinct glycosylation alterations in NGLY1-deficient patient-derived dermal fibroblasts. J Inherit Metab Dis. 2023;46(1):76–91.
Yang W, Shah P, Hu Y, Toghi Eshghi S, Sun S, Liu Y, et al. Comparison of enrichment methods for intact N- and O-linked glycopeptides using strong anion exchange and hydrophilic interaction liquid chromatography. Anal Chem. 2017;89(21):11193–7.
Zeng WF, Cao WQ, Liu MQ, He SM, Yang PY. Precise, fast and comprehensive analysis of intact glycopeptides and modified glycans with pGlyco3. Nat Methods. 2021;18(12):1515–23.
Sugio S, Kashima A, Mochizuki S, Noda M, Kobayashi K. Crystal structure of human serum albumin at 2.5 A resolution. Protein Eng. 1999;12(6):439–46.
Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, et al. The protein data bank. Acta Crystallogr D Biol Crystallogr. 2002;58(Pt 6 No 1):899–907.
Schrodinger, LLC. The PyMOL molecular graphics system, Version 1.8. 2015.
Pino LK, Searle BC, Bollinger JG, Nunn B, MacLean B, MacCoss MJ. The Skyline ecosystem: informatics for quantitative mass spectrometry proteomics. Mass Spectrom Rev. 2020;39(3):229–44.
Geyer PE, Kulak NA, Pichler G, Holdt LM, Teupser D, Mann M. Plasma proteome profiling to assess human health and disease. Cell Syst. 2016;2(3):185–95.
Petrescu AJ, Milac AL, Petrescu SM, Dwek RA, Wormald MR. Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding. Glycobiology. 2004;14(2):103–14.
Kuster B, Mann M. 18O-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal Chem. 1999;71(7):1431–40.
Kaji H, Saito H, Yamauchi Y, Shinkawa T, Taoka M, Hirabayashi J, et al. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat Biotechnol. 2003;21(6):667–72.
Nwosu CC, Aldredge DL, Lee H, Lerno LA, Zivkovic AM, German JB, et al. Comparison of the human and bovine milk N-glycome via high-performance microfluidic chip liquid chromatography and tandem mass spectrometry. J Proteome Res. 2012;11(5):2912–24.
Bagdonaite I, Malaker SA, Polasky DA, Riley NM, Schjoldager K, Vakhrushev SY, et al. Glycoproteomics. Nat Rev Methods Primers. 2022;2(1):48.
Chou HH, Takematsu H, Diaz S, Iber J, Nickerson E, Wright KL, et al. A mutation in human CMP-sialic acid hydroxylase occurred after the Homo-Pan divergence. Proc Natl Acad Sci USA. 1998;95(20):11751–6.
Fasano M, Curry S, Terreno E, Galliano M, Fanali G, Narciso P, et al. The extraordinary ligand binding properties of human serum albumin. IUBMB Life. 2005;57(12):787–96.
Rahali MA, Lakis R, Sauvage FL, Pinault E, Marquet P, Saint-Marcoux F, et al. Posttranslational-modifications of human-serum-albumin analysis by a top-down approach validated by a comprehensive bottom-up analysis. J Chromatogr B Analyt Technol Biomed Life Sci. 2023;1224: 123740.
Fanali G, di Masi A, Trezza V, Marino M, Fasano M, Ascenzi P. Human serum albumin: from bench to bedside. Mol Aspects Med. 2012;33(3):209–90.
Chromy BA, Gonzales AD, Perkins J, Choi MW, Corzett MH, Chang BC, et al. Proteomic analysis of human serum by two-dimensional differential gel electrophoresis after depletion of high-abundant proteins. J Proteome Res. 2004;3(6):1120–7.
Ong SE, Pandey A. An evaluation of the use of two-dimensional gel electrophoresis in proteomics. Biomol Eng. 2001;18(5):195–205.
Zacchi LF, Schulz BL. N-glycoprotein macroheterogeneity: biological implications and proteomic characterization. Glycoconj J. 2016;33(3):359–76.
Bocedi A, Cattani G, Stella L, Massoud R, Ricci G. Thiol disulfide exchange reactions in human serum albumin: the apparent paradox of the redox transitions of Cys(34). FEBS J. 2018;285(17):3225–37.
Levitt DG, Levitt MD. Human serum albumin homeostasis: a new look at the roles of synthesis, catabolism, renal and gastrointestinal excretion, and the clinical value of serum albumin measurements. Int J Gen Med. 2016;9:229–55.
Ballmer PE. Causes and mechanisms of hypoalbuminaemia. Clin Nutr. 2001;20(3):271–3.
Domenicali M, Baldassarre M, Giannone FA, Naldi M, Mastroroberto M, Biselli M, et al. Posttranscriptional changes of serum albumin: clinical and prognostic significance in hospitalized patients with cirrhosis. Hepatology. 2014;60(6):1851–60.
Sumner AE, Duong MT, Bingham BA, Aldana PC, Ricks M, Mabundo LS, et al. Glycated albumin identifies prediabetes not detected by hemoglobin A1c: the Africans in America Study. Clin Chem. 2016;62(11):1524–32.
Garapati K, Budhraja R, Saraswat M, Kim J, Joshi N, Sachdeva GS, et al. A complement C4-derived glycopeptide as a biomarker for PMM2-CDG. JCI Insight. 2024;In Press.
Neelamegham S, Aoki-Kinoshita K, Bolton E, Frank M, Lisacek F, Lutteke T, et al. Updates to the symbol nomenclature for Glycans guidelines. Glycobiology. 2019;29(9):620–4.
Perez-Riverol Y, Bai J, Bandla C, Garcia-Seisdedos D, Hewapathirana S, Kamatchinathan S, et al. The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res. 2022;50(D1):D543–52.
Acknowledgements
We thank Kiran B. Gaikwad for help with structural analysis and Richard K. Kandasamy for helpful discussions.
Funding
This work was supported by the Mayo Clinic DERIVE Office, the Mayo Clinic Center for Biomedical Discovery, and a grant from the DBT/Wellcome Trust India Alliance entitled “Center for Rare Disease Diagnosis, Research, and Training” (IA/CRC/20/1/600002) to AP.
Author information
Contributions
KG, AJ and AP conceived and designed the study. KG and AJ performed experiments. KG, AJ, BJM, DGM, and RB analyzed the data. KG, AJ, JS and AP wrote the manuscript. KG, AJ and JS made the figures. All authors read and reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
Serum samples used in this study were deidentified residual samples from volunteer donors collected with consent (approved by Mayo Clinic IRB: 21-012890).
Consent for publication
Not applicable.
Competing interests
All authors declare no conflicts of interest.
Supplementary Information
Additional file 1: Supplemental Methods.
Additional experimental details, materials and methods.
Additional file 2: Additional figures.
Supporting figures with further detail on the glycopeptides identified by discovery analysis of human serum, bovine serum albumin and rabbit serum albumin samples.
Additional file 3: Additional Tables.
Additional information on glycopeptides identified by SEC- and MAX-based enrichment; albumin-derived glycopeptides identified in additional volunteer donor samples; glycopeptides identified from bovine and rabbit serum albumin samples.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Garapati, K., Jain, A., Madden, B.J. et al. Defining albumin as a glycoprotein with multiple N-linked glycosylation sites. J Transl Med 22, 454 (2024). https://doi.org/10.1186/s12967-024-05000-5
DOI: https://doi.org/10.1186/s12967-024-05000-5