Introduction

Milk contains a variety of components, including proteins, endogenous peptides, lipids, carbohydrates and minerals, all of which contribute to the growth and development of newborns. In the human diet, beyond the simple nutritional value of milk compounds, other components, including glycoproteins, antibodies and oligosaccharides, also protect infants by reducing the number of pathogen infections and promoting the development of the intestinal epithelium (de Wit et al. 1998; Coppa et al. 2006; Yekta et al. 2010; Zivkovic et al. 2011). Milk is, therefore, more than a simple source of essential nutrients. On the other hand, among allergenic foods, milk represents the main source of allergens in early childhood: approximately 2–3 % of infants younger than 1 year of age are affected by cow’s milk protein allergy (CMPA). Considering that breast feeding is not always possible, indicated or sufficient, an alternative supply then becomes indispensable for infants allergic to cow’s milk proteins.

Many clinical investigations have demonstrated that donkey milk (DM) is often a good natural breast milk substitute during early infancy. In vivo, donkey milk has been shown to be well tolerated by a large proportion (83–88 %) of children affected by cow’s milk protein allergy (CMPA) (Restani et al. 2002; Monti et al. 2007; Swar 2011; Monti et al. 2012). In vitro, monoclonal and polyclonal antibodies produced against cow milk proteins showed a very mild cross-reaction with donkey milk (Restani et al. 2009; El-Agamy et al. 2009). Furthermore, donkey milk is known to possess natural protective antimicrobial factors and a specific epidermal growth factor (EGF), which suggest a beneficial impact on gastrointestinal mucosa health and integrity, a property particularly valuable for children, the elderly, and convalescents, who have a reduced immune defense system (Scafizzari et al. 2009). Consequently, DM has recently aroused scientific and clinical interest, and over the last decade many investigations, carried out by different proteomic approaches (Cunsolo et al. 2014), have focused on the characterization of both major (Cunsolo et al. 2007a, b, 2009a; Bertino et al. 2010; Chianese et al. 2010) and low-abundance proteins (Barello et al. 2008; Cunsolo et al. 2011b). Although the mechanism of this tolerance has not yet been fully clarified, it is reasonable to hypothesize that the reduced allergenic properties of DM are related to the low sequence similarity of its protein components to those of bovine milk, as recently reported (Cunsolo et al. 2009b; Saletti et al. 2012).

Milk proteins can be grouped into three classes, according to their different solubilities: caseins, whey proteins and milk fat globule membrane (MFGM) proteins. Lactoferrin (LF), a member of the transferrin family, is one of the most important glycoproteins (Hutchens et al. 1994) present in the whey protein fraction of milk. Its function is modulated by both the polypeptide chain and its glycosylation (Barboza et al. 2012). The N-glycans of the LFs whose glycosylation pattern has been studied so far have a highly heterogeneous structure, containing N-acetylglucosamine (GlcNAc), galactose (Gal), fucose (Fuc), mannose (Man), and N-acetylneuraminic acid (Neu5Ac). N-glycolylneuraminic acid (Neu5Gc) has been found in both goat milk LF and bovine milk (Le Parc et al. 2014; Nwosu et al. 2012). N-Glycans vary widely in composition and structure, even within a single site of glycosylation, and are divided into three main classes: high mannose, complex and hybrid (Stanley et al. 2009). Identifying the structure and glycosylation site represents a significant analytical challenge because, unlike the expression of nucleic acid and peptide polymers, glycan biosynthesis is not a transcription of a coded structure. The non-templated nature of glycan biosynthesis results in the presence of heterogeneous elongated branches and isomers. Glycosylation can modify the structural conformation of the protein and consequently its biological activity (Marth and Grewal 2008; Shental-Bechor and Levy 2009). Milk glycans can interfere with pathogen adhesion to intestinal epithelial cells (Bode 2012), which reinforces the idea that glycosylation can be involved in protecting the host against microbial and viral attacks. LF is also known to exert bifidogenic effects on some bifidobacteria. One of the possible mechanisms underlying the bifidogenic activity of LF is the provision of sugar chains bound to LF (Oda et al. 2014). The addition of human LF or bovine LF to a sugar-restricted medium induced gene expression associated with sugar metabolism in Bifidobacterium infantis, and deglycosylated LF was detected in the medium (Garrido et al. 2012). These findings suggested that sugar chains bound to LF could be utilized as a carbon source by bifidobacteria.

Within the framework of our research on donkey milk proteins (Cunsolo et al. 2011a), we have previously reported the sequence characterization and the identification of the glycosylation sites of donkey milk LF (Gallina et al. 2016). The aim of this study was to determine the site-specific N-glycan structures of donkey milk LF, isolated by ion exchange chromatography from an individual milk sample. The combined use of chymotryptic digestion, TiO2 and HILIC enrichment, reversed-phase high-performance liquid chromatography, electrospray mass spectrometry and higher-energy collisional dissociation (HCD) fragmentation resulted in the identification of 26 different glycan structures.

Materials and methods

Materials

Acetic acid (AA), formic acid (FA), glycolic acid (GA), α-chymotrypsin from bovine pancreas, albumin from chicken egg white and fetuin from fetal bovine serum were obtained from Sigma-Aldrich (Milan, Italy). Ammonium bicarbonate, sodium acetate, HPLC-grade H2O and acetonitrile (ACN) were provided by Carlo Erba (Milan, Italy). Dithiothreitol (DTT), iodoacetamide (IA), ammonium acetate and sodium chloride were purchased from Fisher Scientific (Milan, Italy). Micro-Spin regenerated cellulose filters (0.45 μm pore size) and nylon membranes (0.22 μm pore size), used for filtration of samples and ion exchange chromatography (IEC) solvents, respectively, were from Alltech (Milan, Italy). Spectra-Por Float-A-Lyzer dialysis tubes (cut-off 3.5–5 kDa) were obtained from Sigma-Aldrich (Milan, Italy). TiO2 beads were obtained from GL Sciences (Japan). Polyhydroxyethyl A was provided by PolyLC Inc. (Columbia, MD, USA). POROS Oligo R3 was purchased from Applied Biosystems (Framingham, MA, USA). GELoader tips were obtained from Eppendorf (Hamburg, Germany). All the materials had a high degree of purity and were used without further purification. Donkey milk LF was isolated from an individual milk sample as previously reported (Gallina et al. 2016).

Enzymatic digestion of lactoferrin and glycopeptide enrichment

To obtain low molecular mass glycopeptides, the chromatographic fraction corresponding to donkey LF was reduced, alkylated and digested using α-chymotrypsin. The enzyme was dissolved in 100 mM NH4HCO3, pH 8.3, and added at a molar enzyme/substrate ratio of 1:50, and the solution was incubated at 37 °C overnight. The digestion reaction was stopped by cooling in liquid nitrogen. The digested mixture, containing non-glycosylated peptides and glycopeptides, was dried down, diluted in 100 μL of loading buffer (80 % ACN, 5 % trifluoroacetic acid (TFA) and 1 M glycolic acid) and added to 0.3 mg of TiO2 beads. After stirring, the TiO2 beads, which interact selectively with sialylated glycopeptides (Larsen et al. 2007), were separated by centrifugation. The supernatant, containing only neutral glycopeptides and non-glycosylated peptides, was removed and stored. The beads were washed first with 50 μL of loading buffer, second with 50 μL of washing buffer 1 (80 % ACN, 1 % TFA), and third with 50 μL of washing buffer 2 (10 % ACN, 0.5 % TFA). In each step the supernatant was stored. The beads carrying the sialylated glycopeptides were then dried, and the sialylated glycopeptides were eluted from the beads with 50 μL of elution buffer (60 μL ammonia solution (28 %) in 940 μL H2O, pH 11.3). To enrich for neutral glycopeptides, a home-made column packed with polyhydroxyethyl A (3 µm) as stationary phase was used (Hagglund et al. 2004). In brief, a D10 pipette tip (Gilson) was plugged at the bottom with a C18 Empore disc and the HILIC material in ACN was then applied on top. The supernatant stored from the TiO2 enrichment, containing only neutral glycopeptides and non-glycosylated peptides, was dried down, re-suspended in 20 µL of 80 % ACN and 2 % aqueous FA, and loaded onto the home-made column. The flow-through contained only non-glycosylated peptides, whereas the neutral glycopeptides, trapped in the column, were eluted with 2 % aqueous FA.

Nano-LC MS/MS

Mass spectrometry data were acquired on a hybrid quadrupole-Orbitrap mass spectrometer (Q-Exactive Plus, Thermo Fisher Scientific, Bremen, Germany). Liquid chromatography was carried out using a Thermo Scientific Dionex UltiMate 3000 nano HPLC (Thermo Fisher Scientific). Five µL of the reconstituted samples in 0.1 % FA were loaded onto a home-made μ-pre-column (100 µm × 2 cm, 5 µm ReproSil-Pur C18). After flushing the trapping column with solvent A (0.1 % FA) at a flow rate of 5 µL/min for 5 min, the eluent was switched from the trapping column onto a home-made reversed-phase C18 column (75 µm × 17 cm, 3 µm ReproSil-Pur C18). Peptides were separated by elution at a flow rate of 250 nL/min at room temperature with a linear gradient of solvent B (ACN + 0.1 % FA) in A from 1 % to 35 % in 35 min. Eluting peptide cations were converted to gas-phase ions by the Thermo Scientific Nanospray Ion Source using a source voltage of 2.5 kV and introduced into the mass spectrometer through a heated ion transfer tube (275 °C). The mass spectrometer was operated in data-dependent acquisition mode as follows: (1) survey scans of peptide precursors from 700 to 2000 m/z, performed at 70 K resolution (@ 200 m/z); (2) MS/MS analysis, in the m/z range of 200–2000, of the eight most intense ions. The automatic gain control (AGC) target for full MS acquisitions was set to 1 × 10^6 with a maximum ion injection time of 120 ms. Microscans were set to 1 for both MS and MS/MS. Dynamic exclusion was set to 15 s. Peptides with a charge of two or higher were fragmented in the HCD collision cell using a normalized collision energy (NCE) of 20. Subsequent MS/MS spectra were acquired using an AGC target value of 2 × 10^4, a maximum injection time of 100 ms and a resolution of 17.5 K. MS data acquisition was performed using the Xcalibur v. 3.0 software (Thermo Fisher Scientific).
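The data-dependent selection logic described above can be illustrated with a minimal sketch. This is not the instrument firmware or the Xcalibur method: the function name `select_precursors`, the peak representation as (m/z, intensity, charge) tuples and the 0.01 m/z matching window used for dynamic exclusion are assumptions made for illustration. Only the top-eight rule, the 700–2000 m/z survey range, the minimum charge of two and the 15 s exclusion time come from the text.

```python
def select_precursors(peaks, now, history, top_n=8, min_charge=2,
                      mz_range=(700.0, 2000.0), exclusion_s=15.0, mz_tol=0.01):
    """Pick up to top_n precursors from one survey scan.

    peaks   : list of (mz, intensity, charge) tuples from the survey scan
    now     : acquisition time of the scan in seconds
    history : list of (mz, time) pairs for previously selected precursors;
              it is updated in place (hypothetical bookkeeping)
    mz_tol  : matching window for dynamic exclusion (assumed value)
    """
    candidates = []
    for mz, intensity, charge in peaks:
        # Keep only ions inside the survey range with charge >= 2.
        if charge < min_charge or not (mz_range[0] <= mz <= mz_range[1]):
            continue
        # Dynamic exclusion: skip ions fragmented within the last 15 s.
        recently_selected = any(
            abs(mz - old_mz) <= mz_tol and now - t < exclusion_s
            for old_mz, t in history
        )
        if not recently_selected:
            candidates.append((mz, intensity, charge))
    # The most intense candidates are fragmented first.
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    chosen = candidates[:top_n]
    history.extend((mz, now) for mz, _, _ in chosen)
    return chosen


# Illustrative survey scan (intensities are invented).
survey = [(932.711, 5.0e6, 3), (927.379, 8.0e6, 3),
          (650.300, 9.0e6, 2), (1200.500, 1.0e6, 1)]
selected_history = []
print(select_precursors(survey, now=0.0, history=selected_history))
```

In this toy scan the ion at m/z 650.3 is rejected because it lies outside the survey range and the singly charged ion is rejected by the charge filter, so only the two triply charged ions are selected.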

Glycopeptides were identified with MassAI, a freely available, in-house-developed software tool (http://www.massai.dk). The software compares the fragmentation patterns of MS/MS scans (mgf format) against a protein database and a flexible library of glycans (Li et al. 2015). The N-glycans in the library comprised hexoses (Hex), N-acetylhexosamine (HexNAc), Fuc, Neu5Ac and Neu5Gc. The search was performed with the August 2015 version of MassAI, using default settings. These include: chymotryptic digest with three missed cleavages, carbamidomethylation of Cys as fixed modification, oxidation of Met and deamidation of Gln as variable modifications, a mass tolerance of 10 ppm at the precursor level and a tolerance of 0.1 Da at the MS/MS peak level.
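To make the "chymotryptic digest with three missed cleavages" setting concrete, the following sketch enumerates the peptides that such a search would consider. It is not MassAI's implementation: the function name `chymotryptic_peptides`, the restriction to cleavage C-terminal to Phe, Trp and Tyr (not before Pro) and the toy sequence are assumptions made for illustration, and the rules actually applied by the software may differ.

```python
def chymotryptic_peptides(sequence, missed_cleavages=3, residues="FWY"):
    """Enumerate peptides produced by cleavage C-terminal to F, W or Y.

    Cleavage is assumed not to occur when the next residue is Pro.
    Up to `missed_cleavages` internal cleavage sites may remain uncut.
    """
    # Cut positions are the indices just after each cleavable residue.
    cuts = [0]
    for i, aa in enumerate(sequence[:-1]):
        if aa in residues and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))

    peptides = []
    for start in range(len(cuts) - 1):
        # Joining k + 1 consecutive fragments corresponds to k missed cleavages.
        last = min(start + 2 + missed_cleavages, len(cuts))
        for end in range(start + 1, last):
            peptides.append(sequence[cuts[start]:cuts[end]])
    return peptides


# Toy sequence (hypothetical) containing the peptide GRNKSSAF.
print(chymotryptic_peptides("AAFGRNKSSAFKLY", missed_cleavages=0))
```

With no missed cleavages the toy sequence yields AAF, GRNKSSAF and KLY; allowing missed cleavages adds the longer peptides that span one or more uncut sites.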

Results and discussion

Donkey milk LF was isolated from the whey fraction of an individual donkey milk sample as previously described (Gallina et al. 2016). To obtain a comprehensive site-specific monosaccharide composition of the N-glycans of donkey milk LF, the isolated protein was dialyzed, reduced, alkylated and digested with α-chymotrypsin, which produces shorter glycopeptides than trypsin. RP-HPLC/nESI–MS/MS was selected for glycopeptide analysis. Despite recent advances in mass spectrometry, analysis of protein glycosylation is still very challenging, because the signals of glycopeptides are often suppressed in the presence of other peptides (Geyer and Geyer 2006; Wohlgemuth et al. 2009). For this reason, glycopeptide enrichment was performed prior to RP-HPLC/nESI–MS/MS analysis by TiO2 and HILIC enrichments as described in the experimental section.

An additional complication in the full characterization of the N-glycans of animal milk proteins is the presence of Neu5Gc, which is generally not found in human milk. Neu5Gc prevents determination of the composition based exclusively on accurate mass because combinations of Fuc and Neu5Gc yield masses isobaric with oligosaccharides containing Neu5Ac and Hex. For example, the neutral mass 1931.69 Da may not only correspond to GlcNAc4Hex5Neu5Ac1 but can also correspond to GlcNAc4Hex4Fuc1Neu5Gc1. Tandem MS is required to discriminate between the two compositions, because MS/MS spectra of glycans or glycopeptides are characterized by the presence of carbohydrate-specific oxonium fragment ions.
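The isobaric pair mentioned above can be checked with a short calculation. The sketch below uses standard monoisotopic residue masses, which are not taken from the paper, and a hypothetical helper `glycan_mass`; it shows that swapping Hex + Neu5Ac for Fuc + Neu5Gc leaves the neutral mass unchanged.

```python
# Monoisotopic residue masses in Da (standard values, not from the paper).
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "Neu5Ac": 291.09542,
    "Neu5Gc": 307.09033,
}
WATER = 18.01056  # added once for the free reducing-end glycan


def glycan_mass(composition):
    """Neutral monoisotopic mass of a free glycan from its residue counts."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER


with_neu5ac = {"HexNAc": 4, "Hex": 5, "Neu5Ac": 1}
with_neu5gc = {"HexNAc": 4, "Hex": 4, "Fuc": 1, "Neu5Gc": 1}

mass_ac = glycan_mass(with_neu5ac)
mass_gc = glycan_mass(with_neu5gc)
print(f"{mass_ac:.2f} Da vs {mass_gc:.2f} Da")
```

Both compositions share the same elemental formula, because Hex + Neu5Ac and Fuc + Neu5Gc differ only in which residue carries one oxygen atom. Accurate mass alone therefore cannot separate them, whatever the resolution.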

Therefore, the enriched glycopeptide mixtures were analyzed by RP-HPLC/nESI–MS/MS on a hybrid quadrupole-Orbitrap mass spectrometer (Q-Exactive Plus) using HCD fragmentation, and the MS/MS data were used for database searching.

In the spectra obtained, diagnostic ions at m/z 308.10 [Neu5Gc+H]+ and 290.09 [Neu5Gc–H2O+H]+ indicate the presence of Neu5Gc, while ions at m/z 292.10 [Neu5Ac+H]+ and 274.09 [Neu5Ac–H2O+H]+ indicate the presence of Neu5Ac. In addition, ions at m/z 163.06 [Hex+H]+, 204.09 [HexNAc+H]+, 325.11 [2Hex+H]+, 366.14 [HexNAc+Hex+H]+, 657.23 [HexNAc+Hex+Neu5Ac+H]+ and 407.17 [2HexNAc+H]+ are indicative of the presence of glycopeptides. Figure 1a shows the ESI–MS/MS spectrum of the triply charged molecular ion of the glycopeptide HexNAc4Hex5Neu5Gc1 + GRNKSSAF. The experimentally determined molecular mass of the glycopeptide is 2795.113 Da, which corresponds to the theoretical one of 2795.111 Da, with an error of 0.002 Da (1 ppm). The carbohydrate-specific oxonium fragment ions at m/z 308.10 [Neu5Gc+H]+ and 290.09 [Neu5Gc–H2O+H]+ unequivocally indicate the presence of Neu5Gc. Figure 1b shows the ESI–MS/MS spectrum of the triply charged molecular ion of the glycopeptide HexNAc4Hex5Neu5Ac1 + GRNKSSAF. The experimentally determined molecular mass of the glycopeptide is 2779.116 Da, which corresponds to the theoretical one of 2779.118 Da, with an error of 0.002 Da (1 ppm). Based on the determined molecular mass, two possible glycan compositions can be suggested for the GRNKSSAF glycosylated peptide: GlcNAc4Hex5Neu5Ac1 or GlcNAc4Hex4Fuc1Neu5Gc1. However, the MS/MS spectrum confirms the GlcNAc4Hex5Neu5Ac1 composition due to the presence of the m/z 292.10 [Neu5Ac+H]+ and m/z 274.09 [Neu5Ac–H2O+H]+ ions.
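The diagnostic m/z values and the mass errors quoted above can be reproduced with a small calculation. The sketch below assumes standard monoisotopic residue masses (not taken from the paper); the helper names `oxonium_mz`, `ppm_error` and `sialic_acids_supported`, and the rule that both diagnostic ions must be present, are illustrative choices and not part of the MassAI workflow.

```python
PROTON = 1.00728   # Da
WATER = 18.01056   # Da

# Monoisotopic residue masses in Da (standard values, not from the paper).
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Neu5Ac": 291.09542,
    "Neu5Gc": 307.09033,
}


def oxonium_mz(residues, water_loss=False):
    """m/z of a singly protonated oxonium ion built from the given residues."""
    mass = sum(RESIDUE_MASS[name] for name in residues)
    if water_loss:
        mass -= WATER
    return mass + PROTON


def ppm_error(measured, theoretical):
    """Signed relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def sialic_acids_supported(fragment_mzs, tol_da=0.1):
    """Sialic acids for which both diagnostic ions appear in the peak list."""
    supported = []
    for name in ("Neu5Ac", "Neu5Gc"):
        targets = (oxonium_mz([name]), oxonium_mz([name], water_loss=True))
        if all(any(abs(mz - t) <= tol_da for mz in fragment_mzs) for t in targets):
            supported.append(name)
    return supported


print(f"[Neu5Gc+H]+      {oxonium_mz(['Neu5Gc']):.2f}")
print(f"[Neu5Gc-H2O+H]+  {oxonium_mz(['Neu5Gc'], water_loss=True):.2f}")
print(f"[Neu5Ac+H]+      {oxonium_mz(['Neu5Ac']):.2f}")
print(f"[Neu5Ac-H2O+H]+  {oxonium_mz(['Neu5Ac'], water_loss=True):.2f}")
print(f"error: {ppm_error(2795.113, 2795.111):.2f} ppm")
```

A difference of 0.002 Da at about 2795 Da corresponds to roughly 0.7 ppm, which rounds to the 1 ppm reported in the text and lies well inside the 10 ppm precursor tolerance used for the search.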

Fig. 1 a MS/MS spectrum of the triply charged molecular ion of the glycopeptide HexNAc4Hex5Neu5Gc1 + GRNKSSAF in the RP-HPLC/ESI–MS/MS analysis of the α-chymotryptic digest of donkey LF. The experimentally determined molecular mass of the glycopeptide is 2795.113 Da, which corresponds to the theoretical one of 2795.111 Da, with an error of 0.002 Da (1 ppm); b MS/MS spectrum of the triply charged molecular ion of the glycopeptide HexNAc4Hex5Neu5Ac1 + GRNKSSAF. The experimentally determined molecular mass of the glycopeptide is 2779.116 Da, which corresponds to the theoretical one of 2779.118 Da, with an error of 0.002 Da (1 ppm); c MassAI identification of the glycopeptide HexNAc4Hex5Neu5Ac1 + GRNKSSAF. The freely available MassAI software compares the fragmentation patterns of MS/MS scans (mgf format) against a protein database and a flexible library of glycans

The glycopeptides were determined by accurate mass measurements of the parent ions and from the MS/MS spectra, using MassAI. The MassAI identification of the glycopeptide HexNAc4Hex5Neu5Ac1+GRNKSSAF is shown in Fig. 1c as a typical example. All results were manually checked to validate the accurate mass of the potential glycopeptides and the compatibility of the fragment ions observed in the MS/MS spectra.

N-linked glycosylation in mammals occurs via the amide group of asparagine in the consensus tripeptide sequence Asn-X-Ser/Thr, where X can be any amino acid except proline (Bause and Hettkamp 1979). Occasionally, N-glycans are found at Asn-X-Cys, provided that the cysteine residue is in the reduced form (Stanley et al. 2009). The primary structure of donkey LF presents three Asn residues, at positions 137, 281 and 476, that satisfy the consensus tripeptide sequence Asn-X-Ser/Thr and two Asn residues, at positions 168 and 513, that satisfy the consensus tripeptide Asn-X-Cys. Among the five potential sites of glycosylation, only the Asn residues at positions 137, 281 and 476, which belong to the consensus tripeptide sequence Asn-X-Ser/Thr, were found to be glycosylated. In this study, the distribution of N-glycans among the three glycosylation sites was identified, as shown by the representative glycopeptides reported in Table 1, thus providing the most comprehensive determination to date of the monosaccharide composition of the N-glycans of donkey milk LF. The results summarized in Table 2 show that Asn 281 and Asn 476 are the sites bearing the largest number of different glycan structures in donkey LF, with 21 and 17 different N-glycan structures, respectively. It should be noted that the N-glycan compositions identified can be related to a typical glycan structure, although from our data it is not possible to differentiate between isomers.
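The sequon rule described above lends itself to a simple scan of the protein sequence. The sketch below is illustrative only: the function name `find_sequons` and the short test sequence are hypothetical, and the positions it reports refer to that toy sequence, not to the numbering of donkey LF.

```python
import re

# Zero-width lookahead so that overlapping sequons are all reported.
# N, then any residue except P, then S, T or C.
SEQUON = re.compile(r"(?=N[^P]([STC]))")


def find_sequons(sequence):
    """Return (1-based Asn position, tripeptide, sequon type) for each match."""
    hits = []
    for match in SEQUON.finditer(sequence):
        pos = match.start()
        kind = "Asn-X-Cys" if match.group(1) == "C" else "Asn-X-Ser/Thr"
        hits.append((pos + 1, sequence[pos:pos + 3], kind))
    return hits


# Hypothetical sequence: NKS is a canonical sequon, NPT is excluded
# because X is proline, and NGC is an Asn-X-Cys candidate.
for hit in find_sequons("GRNKSSAFNPTANGCW"):
    print(hit)
```

Such a scan only lists potential sites. As the results above show, occupancy must be established experimentally, since the two Asn-X-Cys candidates in donkey LF were not found to be glycosylated.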

Table 1 Identified donkey milk lactoferrin N-glycopeptides
Table 2 Identified donkey milk lactoferrin N-glycans at Asparagine 137, 281 and 476

Conversely, Asn 137 carries a smaller number of N-glycan compositions (six glycopeptides). Moreover, five of these six N-glycan compositions were also found linked to Asn 281 and Asn 476. Thirteen N-glycan compositions were common to Asn 281 and Asn 476, while 12 N-glycan compositions were found at only one of these two sites.
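The site counts reported here and in the preceding paragraph can be cross-checked with inclusion-exclusion. The sketch below relies on one assumption that is not stated explicitly in the text: that the single Asn 137 composition not shared with the other sites is absent from both Asn 281 and Asn 476. Under that assumption the counts add up to the 26 structures reported.

```python
# Counts taken from the text (Table 2 summary).
n_281 = 21            # compositions at Asn 281
n_476 = 17            # compositions at Asn 476
shared_281_476 = 13   # compositions common to Asn 281 and Asn 476
n_137 = 6             # compositions at Asn 137
n_137_shared = 5      # Asn 137 compositions also seen at the other sites

# Inclusion-exclusion for the two most heterogeneous sites.
union_281_476 = n_281 + n_476 - shared_281_476
only_one_site = union_281_476 - shared_281_476

# Assumption: the remaining Asn 137 composition is unique to that site.
total_compositions = union_281_476 + (n_137 - n_137_shared)

print(union_281_476, only_one_site, total_compositions)
```

The 12 compositions found at only one of the two sites split into eight specific to Asn 281 (21 − 13) and four specific to Asn 476 (17 − 13).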

Altogether, the N-glycan compositions determined in donkey milk LF revealed that most of the N-glycans identified are neutral complex/hybrid. Indeed, ten neutral non-fucosylated complex/hybrid N-glycans and four neutral fucosylated complex/hybrid N-glycans were found. In addition, two high mannose N-glycans, four sialylated fucosylated complex/hybrid N-glycans and six sialylated non-fucosylated N-glycans, one of which contains Neu5Gc, are present (Table 2).

The glycosylation profile of LF in human, bovine and goat has been described (Yu et al. 2011; Barboza et al. 2012; Nwosu et al. 2012; Le Parc et al. 2014). The human LF (hLF) sequence contains three potential glycosylation sites that correspond to the consensus tripeptide sequence Asn-X-Ser/Thr, at positions Asn 138, Asn 479 and Asn 624. All three sites were found to be glycosylated, although the third site (Asn 624) is mostly not glycosylated (van Berkel et al. 1996). In addition, Yu et al. (2011) showed that the glycan profile at Asn 138 is more complex than that at Asn 479. In bovine lactoferrin (bLF), all five potential glycosylation sites, which correspond to the consensus tripeptide sequence Asn-X-Ser/Thr, at Asn 252, Asn 300, Asn 387, Asn 495 and Asn 564, were found to be glycosylated, with Asn 252, Asn 387 and Asn 495 the most heterogeneously glycosylated. Conversely, Asn 564 and Asn 300 were less heterogeneously glycosylated, with significantly fewer glycopeptide products. To the best of our knowledge, there are no published reports about the occupancy of goat lactoferrin glycosylation sites. Sequence analysis reveals the presence of five potential sites (Asn 233, Asn 281, Asn 368, Asn 476 and Asn 545) involved in the consensus tripeptide sequence Asn-X-Ser/Thr, just as in bovine lactoferrin. From the comparison of the N-glycan types (Table 3, Supplementary Table S1), it appears that high mannose N-glycans were found only in bovine, goat and donkey milk LF, whereas this structure is not present in human milk LF. The majority of the N-glycans in donkey milk LF are neutral N-glycans (Table 3). A similar ratio is found in bovine milk LF and in goat milk LF, while in human milk LF a higher number of sialylated N-glycans is observed. In addition, the ratio of fucosylated types in donkey milk LF is similar to that of bovine milk LF and goat milk LF, while it is lower than in human milk LF. The number of sialylated types is similar to that in goat milk LF and higher than in bovine milk LF. Furthermore, among the sialylated types in goat milk LF, eight contain Neu5Gc, which is generally not present in humans. In contrast, only one type containing Neu5Gc was found in donkey milk. Prior to this study, Neu5Gc was only found in goat milk LF (Table 3), whereas a recent work (Nwosu et al. 2012) suggested its presence in bovine milk (Supplementary Table S1). As a whole, the comparison shows that the glycosylation composition of donkey LF is markedly different from that of human and bovine LF, while it is closer to that of goat LF, the most relevant difference being the higher presence of Neu5Gc in goat LF. It has been demonstrated that different glycan profiles result in marked differences in immunogenicity and allergenic properties (Almond et al. 2013) and, therefore, knowledge of the monosaccharide composition of the N-glycans is a necessary prerequisite for investigating the allergenic properties of LFs.

Table 3 Comparison of N-glycan structures among human, bovine, goat and donkey milk lactoferrin

Conclusions

Current interest in LF is related to its biological properties, including antibacterial, antiviral and antioxidant activities, iron binding and immunomodulation. However, the mechanisms of the antibacterial activity of lactoferrin, including the role of glycosylation in the protection against pathogen infection, have not yet been completely elucidated.

The present paper reports the most comprehensive monosaccharide composition of the N-glycans of donkey LF isolated from an individual donkey. The data obtained allowed the identification of 26 different N-glycan structures linked at residues 137, 281 and 476. Most of the N-glycans identified are neutral complex/hybrid. In fact, ten neutral non-fucosylated complex/hybrid N-glycans and four neutral fucosylated complex/hybrid N-glycans were found. In addition, two high mannose N-glycans, four sialylated fucosylated complex N-glycans and six sialylated non-fucosylated complex N-glycans, one of which contains Neu5Gc, are present. The site-specific monosaccharide composition of the N-glycans elucidated in this study could enable future investigations on the relationship between glycosylation pattern and protein function.