Introduction

Glycosylation is a very important post-translational modification, which occurs in approximately half of all mammalian proteins [1]. Glycoproteins is involved in a number of physiological processes, including protein folding and trafficking [2], immune response [3] and cell-cell and cell-matrix interaction [4, 5]. In cancer, there is increasing evidence pertaining to the role of glycosylation in tumour formation and metastasis [68]. It was demonstrated that glycoproteins have aberrant glycosylation patterns in malignancy, producing differential occupancy of glycosylation sites or variability in attached glycan structures [911].

Recent developments in MS-based technologies have enabled large-scale characterization of glycoproteins [12, 13]. Generally, glycoproteomic strategies rely on the analysis of released glycans or deglycosylated peptides [1416]. Although the removal of glycans simplifies the identification of glycosylation sites by LC–MS/MS, limited information correlating glycan structural heterogeneity to the protein attachment site are obtained.

Recently, novel strategies were developed to enrich and analyze intact N and O-linked glycopeptides, for example, using metabolic labeling of the glycoproteome combined with chemical enrichment using an isotopic recoding affinity probe [17], or chemoenzymatic method for glycopeptide enrichment on solid phase [18]. Alternatively, several MS dissociation methods have been applied to study intact glycopeptides such as collision induced dissociation (CID), high-energy collision dissociation (HCD) and electron capture/transfer dissociation (ECD/ETD) [1921]. Despite the recent advances, the detection of intact glycopeptides often requires increased sample quantities and an enrichment step, especially because of the glycan heterogeneity and the low ionization efficiency of glycoconjugates [22]. Besides, the computational identification via database searching are still challenging due to the heterogeneity and nontemplated nature of glycopeptides. Collectively, these challenges have hindered widespread use of intact glycopeptides analysis in the glycoproteome field.

Hydrophilic interaction chromatography (HILIC) has been used increasingly to separate and enrich glycopeptides and shows some advantages over other enrichment techniques such as broad specificity towards different glycan structures, good solubility of polar samples and greater compatibility with electrospray MS [12]. Hagglund et al. explored the use of HILIC followed by partial deglycosylation to identify N-glycosylation sites in complex biological samples [23] and Mysling improved the HILIC-glycopeptides enrichment efficiency by modifying the TFA-containing mobile phases [24]. This approach was also applied in the study of intact glycopeptides and glycoproteins in different biological matrices [21, 2530].

Urine is a rich source for glycoproteins derived from renal- [31] and distal organs [32], and has been considered as an attractive biologic fluid for disease biomarkers discovery, since it can easily be collected continuously and noninvasively [33, 34]. Although urinary proteomics approaches have been applied in biomarker discovery studies of several disorders [32, 3537], only a few studies aimed at identifying urinary glycoproteins [38, 39]. Moreover, information about site-specific glycan-peptide micro-heterogeneity is still not well explored.

Considering the urinary glycoproteomic studies that performed site-specific glycosylation analyses, Saraswat et al. [40] reported 51 N-glycosylation sites belonging to 37 glycoproteins in urinary exosomes and Halim et al. [41] showed 58 N- and 63 O-linked glycopeptides from 53 glycoproteins in the urine from one healthy person.

This study presents a simple and straightforward methodology without extensive sample preparation processes and pre-fractionation, to comprehensively characterize N-linked glycosylation sites of urinary glycoproteins as well as low molecular weight (<10 kDa) endogenous glycopeptides. Besides, by analyzing intact glycopeptides, high confident site-specific characterization of the N-linked glycans microheterogeneity was reported.

Material and methods

Urine sample collection and protein and peptide extraction

Twenty ml of urine (first morning void) was collected from a healthy male consenting individual in two different days and stored on ice prior to processing. The healthy volunteer was 35 years old, had no medical history, and did not take any prescription medicines. The samples were centrifuged at 3000 g for 10 min at 4 °C. The cleared supernatant was further filtered using 0.22 μm filter (Millipore, Billerica, MA). Twenty milliliters of filtered urine was concentrated using 10 kDa cutoff filters (Millipore, Billerica, MA). The concentrated urine, that is, the retentate, was stored immediately at −20 °C until further use. The fraction which passed through the 10 kDa cutoff filter was also collected and the peptides (<10 kDa) were extracted and concentrated using hydrophilic − lipophilic-balanced (HLB) solid phase (SPE; Waters).

Protein digestion and desalting

The proteins were reduced by addition of DTT to a final concentration of 10 mM and incubation for 30 min. The proteins were alkylated prior to digestion by the addition of IAA to a final concentration of 40 mM and incubation for 30 min in the dark at room temperature. To quench the reaction, DTT was added to a final concentration of 5 mM. Trypsin (1:50, w/w) was added, and the mixture incubated overnight at 37 °C. The reaction was stopped with 0.4 % TFA and desalted with hydrophilic − lipophilic-balanced (HLB) solid phase extraction (SPE; Waters).

Glycopeptide enrichment using HILIC

The samples were reconstituted in 100 μl of 80 % (v/v) acetonitrile, 10 % dimethyl sulfoxide (DMSO) and 1 % (v/v) trifluoroacetic acid. Peptides were loaded onto an in-house PolyHYDROXYETHYL A™ HILIC resin (PolyLC Inc) packed onto a C8 disk (Empore) in a p200 pipette tip. The flow-through and wash fraction (80 % (v/v) acetonitrile and 1 % (v/v) trifluoroacetic acid) were collected and analyzed by LC-MS/MS. The enriched glycopeptides were eluted with 50 μL 0.1 % (v/v) TFA followed by 50 μL of 25 mM NH4HCO3 and finally 50 μL of 50 % (v/v) acetonitrile. The three eluted fractions were combined and dried by vacuum centrifugation.

PNGase F deglycosylation

An aliquot of the HILIC-enriched glycopeptides (from protein trypsin digestion and endogenous glycopeptides) was resuspended in 50 mM ammonium bicarbonate, pH 8.0 and deglycosylated with 500 U of PNGase F (New England Biolabs, Ipswich, MA) for 12 h at 37 °C. After incubation, the peptides were dried by vacuum centrifugation and reconstituted in 50 μl of 0.1 % formic acid prior to LC-MS/MS analysis.

Mass spectrometry analysis

The resulting peptide mixture was analyzed on a LTQ Velos Orbitrap mass spectrometer (Thermo Fisher Scientific) coupled with LC-MS/MS by an EASY-nLC system (Thermo Fisher Scientific) through a nanoelectrospray ion source. Sample concentration and desalting were performed online using a pre-column (2 cm; 100 μm ID; 5 μm C18-A1; Thermo). Separation was accomplished on Acclaim™ PepMap™ 100 C18 column (10 cm; 75um ID; 3um C18-A2; Thermo) using a linear gradient of A and B buffers (buffer A: A = 0.1 % formic acid; Buffer B = 95 % ACN, 0.1 % formic acid) from 1 % to 50 % buffer B over 60 min at a flow rate of 0.3 μL/min to elute peptides into the mass spectrometer. Columns were washed and re-equilibrated between LC–MS/MS experiments. Mass spectra were acquired in the positive-ion mode over the range m/z 400–1200 at a resolution of 30,000 (full width at half-maximum at m/z 400) and AGC target >1 × e 6. The 20 most intense peptide ions with charge states ≥2 were sequentially isolated to a target value of 15,000 and isolation width of 2 and fragmented in the linear ion trap using low-energy CID (normalized collision energy of 35 %) with activation time of 10 ms. For intact glycopeptides analysis, the 7 most intense peptide ions with charge states ≥2 were sequentially isolated to a target value of 15,000 and isolation width of 2 and fragmented in the linear ion trap using high-energy HCD (normalized collision energy of 35 %) with activation time of 0.1 ms. Dynamic exclusion was enabled with an exclusion size list of 500, exclusion duration of 15 s, and a repeat count of 1.

Analysis of intact N-glycopeptides

Intact glycopeptide spectra from each sample were searched against database using Byonic software v2.6.46 (Protein Metrics, http://www.proteinmetrics.com/) [42]. Searches were performed with the following fixed modifications: precursor mass tolerance of 10 ppm, product ion mass tolerance of 0.02 Da, carbamidomethylation of Cys, and fully trypsin specific cleavage with a maximum of two missed cleavages. Searches were also conducted with the following variable modifications: oxidation of methionine (15.994 Da), deamidation NQ (+0.984) and glycosylation of Asn with N-glycan database (309 mammalian no sodium) available within Byonic. The list of glycoproteins identified in the formerly N-linked peptides generated after PNGase F treatment fraction were used as protein database. All searches were filtered to <1 % false discovery rate (FDR) at protein level and 0 % at peptide level [43]. Moreover Byonic software calculates two-dimensional posterior error probability (PEP 2D), meaning that the protein identity (dimension one) and quality of peptide-spectrum match are taken into account for a correct sequence assignment. The probabilities are computed by the usual target/decoy method with the decoys being reversed protein sequences [44]. Extracted ion chromatograms using the HexNAc and NeuAc oxonium ions were performed using the Xcalibur software (Thermo). Manual inspection was performed in the MS/MS spectra of all intact glycopeptides identified.

Protein identification and glycosylation site analysis

For protein identification and glycosylation site analysis, raw files were imported into MaxQuant version 1.5.2.8 [45] and the database search engine Andromeda [46] was used to search MS/MS spectra against a database composed of the Uniprot Human Protein Database (release April 15, 2015; 45,185 entries) with a tolerance level of 4.5 ppm for MS and 20 ppm and 0.5 Da for MS/MS in HCD and CID mode, respectively. Enzyme specificity was set to trypsin with a maximum of two missed cleavages. For the database search of N-formerly endogenous glycopeptides no enzyme was selected and for the N-formerly endogenous glycopeptides treated with trypsin, semi-specific free N-terminus digestion with a maximum of two missed cleavages was considered. Carbamidomethylation of cysteine (57.021 Da) was set as a fixed modification, and oxidation of methionine (15.994 Da), deamidation NQ (+0.984 Da) and protein N-terminal acetylation (42.010 Da) were selected as variable modifications. All identifications were filtered in order to achieve a FDR of 1 %.

After excluding peptides identified as potential contaminants or in reverse database, we manually filtered the peptides containing the deamidation sites within the glycosylation motif NxS/T/C. Motif-X algorithm was used to identify sequence motifs from the endogenous N-glycopeptides, using the ipi.HUMAN.fasta database as background. Only motifs with p < 10E6 were considered.

In order to evaluate the chemical (spontaneous) deamidation rate, we also performed database search using MaxQuant in the HILIC-enriched glycopeptide fraction without treatment with PNGase F (intact glycopeptides fraction) [47].

To determine biological processes and tissue protein localization statistically over-represented in a given protein list InnateDB [48] and Enrichr [49] software were used.

MotifX (http://motif-x.med.harvard.edu/) and Sequence2Logo (http://www.cbs.dtu.dk/biotools/Seq2Logo/) were used to create the sequence motifs for endogenous glycopeptides. For MotifX analysis, peptide sequences were centered at the glycosylation site. For Sequence2Logo analysis, peptide sequences with six aminoacids were loaded.

Results and Discussion

Herein, we employed a simple and straightforward method to characterize the N-linked glycosylation sites of glycoproteins and endogenous glycopeptides in human urine. For that, human urine was collected and concentrated using ultrafiltration. After protein digestion, glycopeptides were enriched using HILIC and N-glycans were removed by PNGase F treatment. The filtrate (peptides <10 kDa) was subjected to solid phase extraction, followed by HILIC enrichment and deglycosylation with PNGase F. Intact glycopeptides were also analyzed in order to site-specifically characterize the N-linked glycosylation heterogeneity. The glycoproteomic workflow is summarized in Fig. 1.

Fig. 1
figure 1

Schematic workflow for preparation of urinary proteins and endogenous peptides for site-specific N-linked glycosylation analysis. The glycopeptides were enriched using polyLC-HILIC affinity chromatography. The glycans were released through treatment with PNGase F and the retained glycopeptide fractions as well as the non-modified peptides (flow through) were analyzed by LC-MS/MS employing CID fragmentation. The retained glycopeptides fraction was also directly analyzed by LC-MS/MS employing HCD fragmentation to site-specific characterization of the glycan microheterogeneity

Overview of the urinary glycoproteome

Considering the formerly N-linked peptides generated after PNGase F treatment, 1092 redundant and 472 unique peptides were identified within the NxS/T/C motif (where X is not proline), covering 532 unique glycosylation sites from 256 glycoproteins (Fig. 2a and Supplemental Table S1 and S2). Considering the unmodified peptides or the peptides with deamidation modification in N or Q outside the NxS/T/C motif, we identified 422 redundant peptides. Therefore, using this strategy we observed a percentage of glycopeptides enrichment of about 72 % (1092/1514).Using the fraction of HILIC-enriched peptides without PNGase F treatment, we searched for peptides bearing a spontaneously deamidated asparagine within the N-linked glycosylation motif, however no peptides were found following this criteria.

Fig. 2
figure 2

Overview of the urinary proteome and glycoproteome. a Number of glycoproteins identified considering the formerly N-linked glycosylated peptides generated after PNGase F treatment (glycoproteome); number of proteins identified in the non-modified peptide fraction that did not bind to HILIC (proteome); number of total proteins identified the human urine considering the glycoproteome and proteome. The two biological replicates (urine collected at separate points over two different day) were considered to the total number of proteins identified in each fraction. b Distribution of N-glycoproteins (glycoproteome) identified in the urine collected at separate points over two different days. c Distribution of proteins (proteome) identified in the urine collected at separate points over two different days. d Overlap between the proteins identified in the glycoproteome and proteome fraction. e Percentage of proteins with signal peptides and transmembrane domains, considering the proteins identified in the glycoproteome or proteome fraction. f Venn diagram of the comparisons of the observed glycoproteome with those published by Saraswat et al. [40] and Halim et al. [41]

The overlap of glycoproteins identified in two different days of urine collection was about 49 % (126 glycoproteins) (Fig. 2b). The non-modified peptides that did not bind to HILIC (considered here as the urine proteome) were also analyzed by LC-MS/MS, resulting in 391 proteins identified with 57 % overlap (223 proteins) between the two different days (Fig. 2c and Supplemental Table S3). In total, 562 proteins were identified (Fig. 2a); however, the glycoproteome fraction brought out 171 exclusive proteins (Fig. 2d). A higher proportion of proteins displaying transmembrane regions (36 %) and signal peptides (70 %) were observed in the glycoproteome-enriched fraction compared to the proteome fraction (24 % and 50 %, respectively) (Fig. 2e and Supplemental Table S2 and S3); thus secreted and cell surface proteins can be more efficiently identified using glycopeptide enrichment strategies. The top ten most significant biological processes overrepresented in the glycoproteome database were: cell adhesion, negative regulation of endopeptidase activity, extracellular matrix organization, angiogenesis, platelet degranulation, regulation of proteolysis, receptor-mediated endocytosis, platelet activation, blood coagulation and heterophilic cell-cell adhesion (Supplemental Table S4). To assess the localization and distribution of the glycoproteins in human tissues we performed enrichment analysis using Human Gene Atlas [50] and Human Proteome Map [51] data-set libraries available within Enrichr software. The glycoproteins identified in the urine of a male individual had a significant enrichment localization relative to the total number of genes (Human Gene Atlas) or proteins (Human Proteome map) identified in each tissue, for liver (p = 0.004, gene names: TF, BTD, AMBP, APOH, LCAT, HRG, EPHA1, KNG1, CPN2), adipocyte (p = 0.01, gene names: LRP1, FZD4, MXRA8, TIMP1, FSTL1), smooth muscle (p = 0.04, gene names: MRC2, HEG1, QSOX1, TIMP1, FSTL1, FBN1), kidney (p = 0.03, gene names: CUBN, EGF, DPEP1, FOLR1) (Humana Gene Atlas, Supplemental Fig. 1a) and adult prostate tissue (p = 0.0004, gene names: CDH2, KLK1, CADM2, AXL, QPCT, DPEP1, SERPINI1, CPZ, NEO1, PVRL1, KLK11) (Human Proteome Map, Supplemental Fig. 1b). These results suggest that the urinary glycoproteome may be a good biofluid to study disorders related not only to urological disease but also other tissues.

The list of glycoproteins identified in our study covered 17 out of 37 (46 %) glycoproteins identified in human urinary exosomes [40] and 16 out of 53 (25 %) glycoproteins identified in the previous healthy urinary glycoproteomic study [41]. However our study showed 231 exclusive glycoproteins that were not identified in the previous urinary glycoproteome studies (Fig. 2f). The modest overlap of glycoproteins identified in our study comparing with these two other studies were expected since biological as well methodological variation may produce different sets of enriched glycoproteins/glycopeptides. For example, the study by Halim et al., used hydrazine capture to enrich sialylated glycoproteins in human urine and the intact glycopeptides were analysis by CID and ECD fragmentation method. In the study by Saraswat et al., they focused on the urinary exosomes and the glycopeptides were enriched using SNA affinity and analyzed by CID-tandem MS. However, despite all these variation we could still observe common glycoproteins with these studies and demonstrate that our approach brings exclusive identifications.

Intact glycopeptides analysis

Site-specific glycan micro-heterogeneity of the urinary glycoproteins was investigated by analyzing an aliquot of the HILIC-enriched glycopeptides without prior PNGase F treatment, using high resolution LC-MS/MS and higher energy collision dissociation (HCD) (Fig. 1). Using the intense diagnostic saccharide oxonium ions such as m/z 138.06/204.09 for N-acetylhexosamine (HexNAc) and/or m/z 274.09/292.10 for N-acetylneuraminic acid (NeuAc), we compared the glycosylation profile of intact glycopeptides expressed in the human urine collected in two different days (Fig. 3a).

Fig. 3
figure 3

Investigation of the site-specific N-linked glycans of the human urinary glycoproteins. a Extracted ion chromatogram of HexNAc (m/z 138.06/204.09) and NeuAc (m/z 274.09 and 292.10) oxonium ions with a mass tolerance of 20 ppm. b Distribution of unique N-linked intact glycopeptides in the urine collected in two different days. c Distribution of glycoproteins covered by the identified intact glycopeptides in the urine collected in two different days. d Distribution of the number of intact glycopeptides identified according to the class (high mannose or complex/hybrid) and features

The extracted ion chromatogram for the HexNAc and NeuAc oxonium ions showed similar profile between the two different days (Fig. 3a). Interestingly, the diagnostic ion at m/z 274.091/292.10 pointed out to large amount of glycopeptides containing NeuAc in human urine.

To assess site-specific glycan composition, intact glycopeptides were identified using Byonic software. 1437 and 1038 spectra were assigned to a glycopeptide, in the first and second day, respectively (data not shown). Considering the non-glycosylated peptides, 172 and 91 were identified in the first and second day, respectively, corresponding to a percentage of glycopeptide enrichment in almost 90 %. However, for high confident identification, we filtered only the glycopeptides identified with 0 % FDR, resulting in 156 and 402 identifications in the first and second day, respectively (Supplemental Table S5). From the 156 redundant intact glycopeptides identified with 0 % FDR in the first day, all of them had PEP 2D <0.001 (less than 0.1 % chance that the match is wrong). In the second day, from 402 redundant intact glycopeptides identified with 0 % FDR, 257 had PEP 2D <0.001 and 142 had PEP 2D <0.01 (less than 1 % chance of a wrong match) and only 3 had PEP 2D <0.1 (less than 10 % chance of a wrong match). The glycopeptides identified with the lowest PEP score had 14 fragments assigned (not shown). These parameters were chosen for stringent and confident assignments [17, 52]. However, more efforts are needed to establish a method for intact glycopeptide FDR calculation.

In total, 303 unique intact glycopeptides were identified, with an overlap of 20 % between the first and second day (Fig. 3b). These glycopeptides covered a total of 60 glycoproteins, of which 30 (50 %) were identified in both days (Fig. 3c). Unfortunately, the FDR and scores only assesses the correctness of the peptide sequence [44], which means that the glycopeptide identification can be partially correct, e.g. correct peptide sequence and glycosylation site but still remain incorrect at the identified monosaccharide composition. For example, we observed that from the 303 unique intact glycopeptides, 61 showed the glycan NeuGc, which is uncommon in human. We manually inspect all MS/MS spectra to search for its oxonium ions at m/z 308/290 however we did not find these ions but the NeuAc oxonium ion instead (some examples are provided in Supplemental Fig. 3). Since the mass difference between NeuAc and NeuGc is 16 Da, we checked if the Y1 ion of these peptides also contained the addition of +16 Da. Interestingly, from the 43 peptide sequences assigned with NeuGc, we were able to find in 22 of them the Y1 + 16 in the MS/MS spectra. Therefore, the peptides identified with NeuGc have a modification of +16 Da in the peptide sequence (Supplemental Fig. 2 A-U).

Most of the glycan structures were classified as hybrid/complex; amongst them 110 glycopeptides were identified with fucose and sialic acid (36 %), 78 with only fucose (25 %), 49 with only sialic acid (16 %) and 30 (10 %) composed by only HexNAc and Hex. High mannose was observed in 35 glycopeptides (11 %) (Fig. 3d and Supplemental Table S5). Interestingly, Gao et al. showed very similar distribution of N-glycans composition in human serum, where complex/hybrid glycans containing fucose and sialic acid were the main composition, followed by complex/hybrid with only sialic acid or only fucose, undecorated complex/hybrid and in less proportion the high mannose [53]. We also performed charge state distribution analysis of the intact glycopeptides identified in each class. We observed that glycopeptides containing NeuAc presented the highest percentage of >4 charges, while high mannose glycopeptides were identified mainly with +2 and +3 charges (Supplemental Fig. 3).

The 303 unique intact glycopeptides covered to 104 unique peptide sequences. These sequences had an overlap of 58 % (60 peptides) with the peptides identified in the deglycosylated fraction, in which the peptide harbored the deamidation site in the consensus NXS/T/C motif (where X is not proline) (Supplemental Fig. 4). Using UnicarbKB database, we manually checked the possible structures for each glycan assigned to the 186 intact glycopeptides, whose peptide sequences were also identified in the formerly N-linked peptide fraction (Supplemental Table S6). From 57 glycan unique structures, 48 were found in the UnicarbKB database [54] (http://www.unicarbkb.org/) (Table 1). The glycopeptides showed in Table 1 were considered here as high confident identified glycopeptides in human urine, since they were evidenced by two different MS-based approaches and had the glycan structure annotated in a public database. Examples of annotated spectra of N-linked glycopeptides, directly from the Xcalibur program, with typical oxonium ions, glycosidic fragments and b/y-ions, are provided in Supplemental Fig. 5a-d.

Table 1 N-linked intact glycopeptides identified in human urinary glycoproteins with the possible glycan structure selected from UnicarbKB database

N-linked glycosylation sites analysis of urinary endogenous peptides

Low molecular weight glycopeptides that were not retained by the 10 kDa-cutoff membrane was extracted and enriched using HILIC. The formerly N-linked peptides generated after PNGase F treatment were analyzed by LC-MS/MS. We identified 125 glycosylated peptides covering 70 unique glycosylation sites within NxS/T/C motifs, from 40 glycoproteins (Fig. 4a and Supplemental Table S7).

Fig. 4
figure 4

N-glycosylation site analysis of endogenous glycopeptides (<10 kDa). a Distribution of identified formerly N-linked glycosylated peptides generated after PNGase F treatment in intact or tryptic endogenous glycopeptides. Only deamidation sites within NxS/T/C motif where x is not P were considered. b Distribution of proteins identified in the glycoproteome, proteome and glycopeptidome fraction

We further performed amino acid frequency analysis in order to search for specific protease preference cleavage sites that may be generating the endogenous glycopeptides. We found that leucine is the main amino acid occurring in the the NxT motif from the glycosylation site to the N-terminal of the peptide. On the other hand, in the NxS motif we did not observe the same preference and leucine was observed in the vicinity of the S to the C-terminal direction (Supplemental Fig. 6a and b). It was demonstrated that some metalloproteases (MMPs 1, 2, 3, 7, 9, 12, 13, 14) and ADAMs (ADAM17, ADAM10) have preferences for hydrophobic residues, such as leucine in the cleavage site [5558]. Moreover, we found that leucine and isoleucine were the main aminoacids present at the N-terminus of the endogeneous glycopeptides (Supplemental Fig. 6c).

Since the peptide dabatabase search is compromised by large peptide sequences and no digestion mode, we performed tryptic digestion of the HILIC-enriched N-glycopeptides in order to increase the coverage of N-linked glycosylation sites and reduce false positives. Using this strategy we were able to increase the number of unique glycosylation sites to 134, which covered to 69 glycoproteins (Fig 4a and Supplemental Table S8).

In total, 204 unique N-linked glycosylation sites were identified in the human endogenous glycopeptide, covering90 glycoproteins. Interestingly, when we compared the overlap of proteins identified in the glycoproteome, proteome and glycopeptidome fraction, we observed that the glycopeptidome were able to retrieve exclusive proteins (57 out of 90) (Fig. 4b).

Chemical artifacts were searched using the intact glycopeptides fraction and the peptides bearing a spontaneously deamidated asparagine within the N-linked glycosylation motif were reported in Supplemental Table S1 and S2.

Conclusion

To our knowledge, the present dataset provides a highly confident characterization of N-linked glycosylation sites in human urinary proteins, including site-specific glycan microheterogeneity and for the first time the N-glycosylation sites from endogenous glycopeptides. This dataset, as well the methodology showed in this study, consist in a valuable resource for future MS-based glycoproteomic studies concerned with disorders in which N-linked protein glycosylation may be aberrant and an alternative approach for searching for novel candidate biomarkers in human urine.