Introduction

Research in food science and nutrition has exponentially grown recently, changing the way food is considered. In fact, food is not just a simple source of energy for the body, but it provides components with specific functions and nutritional properties, which also include potential benefits as well as possible detrimental effects on health. Functional compounds include flavonoids, phenolic acids, vitamins, ω3-fatty acids, glucosinolates, but also proteins and peptides [1]; the last two of these make up one of the main groups of food bioactive components, and the study of their nutritional value is part of a new emerging field, namely nutritional proteomics (or nutriproteomics) [2]. Recent substantial improvements and innovations in analytical chemistry have led to the development of a large variety of analytical techniques for the isolation and purification of low-concentrated compounds and to the development of high resolution mass spectrometry (MS). Such a technique is able to carry out the analysis of extremely complex mixtures, detecting different types of analytes over a wide range of concentrations [3].

The advent of shotgun proteomic technologies provided tools suitable for the discovery and identification of bioactive peptides (BPs). Shotgun proteomics is a bottom-up proteomics approach in which a complex protein mixture is digested into peptides with a specific enzyme and then analyzed by a combination of high-performance liquid chromatography and MS; the parent proteins are then identified by bioinformatic analysis of the experimental spectra. However, despite the improvements in shotgun proteomics analysis, challenges remain and are mainly connected to the analytical complexity of nutritional proteomics studies. In fact, the diverse selectivity and specificity at the food protein processing and digestion level, conferred by the multiple specific and unspecific proteases acting in vitro and in vivo, produce very complex peptide mixtures that also contain short peptide sequences [4].

Food-derived BPs are short amino acid sequences that are inactive within the parent protein but can be released by endogenous proteases, during gastrointestinal digestion, food processing, and storage, or by in vitro hydrolysis with specific proteolytic enzymes [1]. They usually contain 2–20 amino acid (AA) residues per molecule, but in some cases they may consist of more than 20 AAs. BPs are usually classified into small peptides (fewer than 7 AAs, the most active but difficult to analyze by conventional proteomics approaches), medium peptides (7–25 AAs), and large peptides (more than 25 AAs) [5].
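
The size classes above can be expressed as a short helper. This is only an illustrative sketch: the function name `classify_peptide` is hypothetical, the thresholds are the ones quoted in the text, and the 26-residue poly-alanine string is a placeholder rather than a real peptide.

```python
def classify_peptide(sequence: str) -> str:
    """Return the size class of a peptide given its one-letter sequence.

    Thresholds follow the classification quoted in the text:
    small (< 7 AAs), medium (7-25 AAs), large (> 25 AAs).
    """
    n = len(sequence)
    if n < 2:
        raise ValueError("a peptide needs at least two residues")
    if n < 7:
        return "small"
    if n <= 25:
        return "medium"
    return "large"


if __name__ == "__main__":
    # IPP and YPFPGPI (beta-casomorphin-7) are milk-derived peptides;
    # the third entry is an artificial placeholder sequence.
    for seq in ["IPP", "YPFPGPI", "A" * 26]:
        print(f"{len(seq):>2} AAs -> {classify_peptide(seq)}")
```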

BPs possess a wide range of biological activities, which are reported in the BIOPEP database (http://www.uwm.edu.pl/biochemia/index.php/pl/biopep) and comprise antimicrobial, anti-hypertensive, cholesterol-lowering, anti-inflammatory, antithrombotic, and antioxidant activities. Their functional properties make BPs potentially useful as components in functional foods, nutraceuticals, food-grade biopreservatives, cosmetics, and pharmaceuticals [5].

The most investigated matrices for peptidomic studies are milk and its derivative products. The precursors of milk and derivative product BPs are proteins, such as casein (CN). In particular human milk contains 50 % CN and 50 % whey, whereas bovine milk is richer in CN (80 %). CNs are divided into α-, β-, and κ-CNs. The primary CNs in milk are αs1-CN, αs2-CN, β-CN, and κ-CN. Whey proteins are β-lactoglobulin, α-lactalbumin, immunoglobulins, glycomacropeptides, bovine serum albumin, and other minor proteins [6, 7].

Numerous milk-derived peptides are known to exhibit biological activities, such as opioid-like, antithrombotic, and antimicrobial activities. The largest group of opioid peptides, i.e., β-casomorphins, are mainly released from β-CN; bovine κ-CN can release casoplatelins, a group of antithrombotic peptides that can inhibit platelet aggregation or fibrinogen binding; finally, antimicrobial activity has been ascribed to lactoferricin B, which is released from lactoferrin [8]. Table 1 lists some examples of BPs derived from milk and its most important derivative products.

Table 1 Examples of bioactive peptides derived from milk and dairy product proteins for which the biological activity has been proved (adapted from refs. [4, 9, 10])

Human milk is the most studied source of BPs; however, other mammalian milk types, such as cow, sheep, goat, and donkey milk, are also of interest in BP research. The main interest in animal milk lies in its resemblance to human milk, which makes it a possible substitute for infant consumption [7, 11].

Given the interest in BP research and the pivotal role of milk and derivative products in this field, this trends article describes applications of peptidomics and other modern approaches for peptide analysis of milk and dairy products, particularly focusing on fractionation, detection, and quantification of BPs.

Different approaches for bioactive peptide discovery

Considering the current literature [2, 12, 13], the most employed approaches in peptide discovery can be classified as illustrated in Fig. 1.

Fig. 1 Commonly employed approaches in the discovery of bioactive peptides from food proteins

The classical or empirical approach, also referred to as the in vitro method [14], consists of a series of steps: (1) selection of an appropriate food protein source; (2) isolation of proteins (or of peptides in the case of endogenous peptide screening); (3) for isolated proteins, release of peptide fragments by the proteolytic action of endogenous enzymes, exogenous enzymes (e.g., hydrolysis by digestive enzymes), or by food technological processes (such as ripening and fermentation); (4) preliminary bioactivity screening; (5) purification and fractionation; (6) further determination of the biological activity of isolated peptides; (7) peptide identification by MS analysis; (8) in vivo or in vitro validation of biological activity.

The empirical approach is still the most used; however, it requires intensive sample preparation (sample pretreatment, fractionation, and purification) and does not always enable unambiguous identification of single BPs. To overcome these drawbacks, bioinformatics-driven (in silico) approaches have recently been introduced. The bioinformatic approach enables the construction of profiles of the potential biological activity of protein fragments, the calculation of quantitative descriptors to estimate potential precursor proteins of BPs, and the prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. In this way, after sample fractionation, the complexity of the initial sample is reduced and the assessment of biologically active peptides becomes less laborious and time-consuming than with traditional bioactivity assays [1, 2, 11, 15–17]. This approach provides a more restricted list of candidate BPs for further validation, which requires the synthesis of these peptides and individual bioactivity testing.

The bioinformatic approach exploits the information provided by various databases, such as BIOPEP [18], PepBank, PeptideDB, the Antimicrobial Peptide Database (APD2), and the Collection of Anti-Microbial Peptides (CAMP) [1, 5], to assign a biological activity to identified peptides. The peptide activity classes found in PeptideDB and the BIOPEP database include antimicrobial peptides, cytokines and growth factors, peptide hormones, and toxin/venom peptides, whereas APD2 and CAMP are limited to antimicrobial peptides, such as antiviral, antifungal, antibacterial, and antiparasitic peptides. These databases include AAs along with peptides that contain 2–14 AAs or more. The most employed database in food analysis is BIOPEP. However, given the interest in milk and dairy product BPs, Théolier and co-workers recently established a new, dedicated database, MilkAMP, which contains 371 entries (9 hydrolysates, 299 antimicrobial peptides, 23 peptides predicted as antimicrobial, and 40 non-active peptides) [19].
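
The core idea of a database-driven activity profile, i.e., locating known bioactive sequences inside a precursor protein, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: `KNOWN_PEPTIDES` is a hand-made stand-in for a real database export (it is not the BIOPEP format or content), `toy_protein` is an invented sequence, and `find_encrypted_peptides` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Hand-made lookup table standing in for a real database export (assumption).
# IPP/VPP are widely reported ACE-inhibitory tripeptides; YPFPGPI is
# beta-casomorphin-7, an opioid peptide.
KNOWN_PEPTIDES: Dict[str, str] = {
    "IPP": "ACE inhibitor",
    "VPP": "ACE inhibitor",
    "YPFPGPI": "opioid",
}


def find_encrypted_peptides(
    protein: str, known: Dict[str, str]
) -> List[Tuple[int, str, str]]:
    """Return (1-based start position, peptide, activity) for every
    occurrence of a known peptide inside the protein sequence."""
    hits = []
    for peptide, activity in known.items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((start + 1, peptide, activity))
            # Continue searching one residue further to catch overlaps.
            start = protein.find(peptide, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy_protein = "MKVLYPFPGPIPNSLVPPAAIPPK"  # invented sequence, not a real protein
    for position, peptide, activity in find_encrypted_peptides(toy_protein, KNOWN_PEPTIDES):
        print(position, peptide, activity)
```

A match of this kind only indicates that a sequence is present in the precursor; whether it is actually released, and whether it is active, still has to be verified experimentally.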

For the investigation of the bioactivity of peptides isolated in cheese from raw and pasteurized ovine milk, Pisanu and co-workers employed Enzyme-Predictor (http://bioware.ucd.ie/~enzpred/Enzpred.php) and the BIOPEP database to successfully predict potential BPs. From this analysis 37 of the 187 identified sequences were ascribed with an immunomodulating and ACE inhibitor activity and showed differences in the specific sequences and in their relative amounts between the two investigated cheese samples [20]. The BIOPEP database was also used to find out which enzymes accounted for the release of a series of antimicrobial peptides; the bioinformatic analysis allowed one to theoretically predict the possible proteolytic cleavage sites in CN and identify thermolysin and thermolysin-like enzymes as likely candidates to liberate the antimicrobial peptide caseicin A from αs1-CN [21].

Another interesting example of a bioinformatics-driven approach was provided by Guerrero et al., who mechanistically analyzed and identified 700 endogenous BPs in human milk. Using the computational tool Peptide Extractor, they were also able to detect the site-specificity of proteolysis [22].

The bioinformatic approach provides several advantages over the classical approach for BP discovery; however, peptides are recognized as being bioactive only when the specific sequence is found by the database search. Thus the reliability of bioinformatic data is strictly dependent on the employed database. Validation of attributed BP sequences is important; however, the investigation of the structure–activity relationship of peptides longer than four AAs is hindered by the high cost of chemical synthesis (the average cost currently ranges between 7.5 and 10 US$ per gram per amino acid residue). The principal contributor to the high cost of longer peptide sequences is the starting amino acids rather than solvents and solvent recycling, which also contribute to production costs. In this context, future research should be directed towards the development of peptide array and microarray technologies. These kinds of technologies allow one to obtain peptides by photolithographic peptide synthesis on a glass surface or by SPOT synthesis of peptides on membrane supports. These approaches offer the ability to economically generate a large number of longer peptides in a single experiment, allowing one to screen BPs and probe their interactions with host molecules on a large scale. Thus peptide arrays and microarrays may help select peptide sequences and identify potentially therapeutic or nutraceutical peptides with increased throughput [12].
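
To give a feeling for how the quoted synthesis cost scales with peptide length, the arithmetic can be written out as below. This is a rough sketch only: it assumes the cost grows linearly with the number of residues and the amount synthesized, which is a simplification of the figure quoted above, and the function name `synthesis_cost_usd` is hypothetical.

```python
from typing import Tuple


def synthesis_cost_usd(
    n_residues: int,
    grams: float,
    rate_low: float = 7.5,    # US$ per gram per residue (lower bound quoted in the text)
    rate_high: float = 10.0,  # US$ per gram per residue (upper bound quoted in the text)
) -> Tuple[float, float]:
    """Return the (low, high) cost estimate for one peptide, assuming
    purely linear scaling with length and amount."""
    if n_residues < 1 or grams <= 0:
        raise ValueError("n_residues must be >= 1 and grams must be > 0")
    return (n_residues * grams * rate_low, n_residues * grams * rate_high)


if __name__ == "__main__":
    for length in (4, 10, 20):
        low, high = synthesis_cost_usd(length, grams=1.0)
        print(f"{length:>2} residues, 1 g: {low:.0f}-{high:.0f} US$")
```

Under this simplified model, validating a long list of candidate sequences individually multiplies the cost by the number of candidates, which is the motivation for array-based synthesis.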

The two approaches described so far represent two extreme methods for BP analysis; however, most of the works described in the literature fall between the two of them and employ what can be defined as an integrated approach, which is a combination of the approaches described above. Most of the examples selected for this trends article are of the latter type and for a comprehensive description of the works employing an integrated approach we refer the reader to a dedicated review [13].

Peptide separation strategies

Given that peptide biological activities depend on the molecular weight and the amino acid sequence and that most known BPs are short 2–6-AA sequences, one of the most important factors in in vitro BP studies is the selection of an analytical system able to separate peptides with the desired molecular weight. In this regard, ultrafiltration membrane systems are good, feasible, fast, and economical devices to separate small peptides with the desired molecular weight by choosing the appropriate molecular weight cutoff (MWCO; e.g., 0.5, 1, or 3 kDa) [23, 24]. Despite the advantages of ultrafiltration membranes, it has also been reported that they are poorly reproducible and could remove peptides below the stated MWCO [25]. In fact, Capriotti and co-workers reported that for fairly complex samples the use of ultrafiltration is not advantageous for the purification of smaller peptides and causes the removal of some apolar peptides [26], among which BPs are highly likely to be found. This observation and the developed analytical strategy can potentially be extended to any food matrix that is not extremely complex; thus different methods directly employing a combination of chromatographic techniques coupled with high resolution MS analysis, bioinformatics, and database searches become a viable and attractive alternative for BP screening.

Currently the most employed methods at laboratory scale involve the use of chromatographic techniques, such as size exclusion chromatography (SEC), ion exchange chromatography (IEC), hydrophilic interaction chromatography (HILIC), solid-phase extraction, preparative reversed-phase (RP) high-performance liquid chromatography, and affinity chromatography [4]. The choice of a particular chromatographic purification technique is generally made on the basis of the peptide physicochemical properties. For example, for the retention of hydrophilic (polar) peptides HILIC performs better than RP chromatography (RPC) and thus it represents a versatile, effective alternative. IEC separates the peptides on the basis of charge and thus offers a different selectivity; however, one of its main drawbacks with complex mixtures is that many peptides may carry the same charge, which results in poor separation. SEC has been predominantly used in recent years and is favored for routine and validated analyses because of its speed and reproducibility, although interfacing SEC with MS remains challenging. Moreover, the dramatic improvements in resolution, sensitivity, and throughput due to the use of smaller particle sizes have enhanced the capability of SEC.

Capillary electrophoresis was used as an alternative, versatile, and less time- and sample-consuming method in hypoallergenic infant milk formulas [27, 28]. In this regard, monodimensional (1D) approaches usually cannot provide adequate resolution, but significant improvements can be obtained with two-dimensional liquid chromatography (2D-LC) methods. 2D-LC coupled to MS/MS is currently considered the technique that offers the maximum separation efficiency and represents one of the preferred choices for bottom-up proteomics and peptidomics. Briefly, 2D-LC can be “comprehensive” when the whole sample is subjected to the two distinct separations or “heart-cutting” if only a part of the sample eluting from the first dimension is sent to the second one [29].

A representative example of how 2D-LC workflows could provide a valuable contribution to BP analysis was provided by the work of Sommella and co-workers in the separation of peptides of milk-soluble fractions after their expiration date [30]. They used an online comprehensive LC × UHPLC platform and compared the results to those of a classical highly efficient 1D separation with the same analysis time, showing that peak capacity and resolution can be greatly enhanced in 2D-LC. Other examples using 2D-LC separation are well described in a review by Sanchez-Rivera and co-workers [4].
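
The gain in peak capacity offered by comprehensive 2D-LC is often estimated with the ideal product rule, in which the 2D peak capacity is the product of the peak capacities of the two dimensions. The sketch below illustrates that rule only; it is not the calculation used in ref. [30]. The `coverage` factor is a hypothetical, simplified correction for incomplete use of the separation space, and all numbers are invented for illustration.

```python
def ideal_2d_peak_capacity(n1: float, n2: float, coverage: float = 1.0) -> float:
    """Estimate the peak capacity of a comprehensive 2D separation.

    n1, n2   -- peak capacities of the first and second dimension
    coverage -- fraction (0-1] of the 2D separation space actually used;
                1.0 corresponds to the ideal product rule (fully orthogonal
                dimensions, no undersampling of the first dimension).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("peak capacities must be positive")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the interval (0, 1]")
    return n1 * n2 * coverage


if __name__ == "__main__":
    # Invented values: a first dimension resolving 50 peaks and a fast
    # second dimension resolving 20 peaks per fraction.
    print(ideal_2d_peak_capacity(50, 20))                # ideal upper bound
    print(ideal_2d_peak_capacity(50, 20, coverage=0.5))  # half of the space used
```

Real systems fall below the ideal value because the two retention mechanisms are never fully orthogonal and the first dimension is sampled at a finite rate.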

The development of automated and continuous systems for the separation and purification of bioactive peptides is an important field for food chemists. Much effort has been devoted to developing selective column chromatography methods that can replace batch methods of salting out or solvent extraction for BP isolation and purification. Advances here would improve BP recovery and would enable the production of functional foods containing such peptides or their use in specific nutraceutical applications.

Short peptide sequence analysis

Emerging on-line multidimensional chromatographic systems can significantly improve the separations of peptides, leading to an increase in the number of identified peptide sequences.

High mass resolution and accurate measurements of precursor mass-to-charge ratios provide more specificity and a lower false positive ratio for the same number of true positives during database searches of peptide tandem mass spectra. The use of high resolution MS is a prerequisite for peptide synthesis and further validations. However, short peptide sequences pose additional issues related to transfer and over-fragmentation of low molecular mass ions. Conventional mass spectrometric approaches for short peptide analysis involve the use of chemical derivatization, multiple reaction monitoring (MRM), or sequence tags for analysis of such peptides. Recently Lahrichi et al. developed an LC–MS/MS method based on an MRM strategy to identify 117 peptides with 2–4 AAs. Despite many of them being isobaric and co-eluting, about 60 % of them were uniquely identified [31].
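
Why so many short peptides are isobaric can be seen by computing their monoisotopic masses: sequences that are permutations of each other, or that differ only by leucine/isoleucine, have exactly the same elemental composition. The sketch below is an illustration and not the method of ref. [31]; the helper names are hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, and the peptide list is arbitrary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Standard monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def monoisotopic_mass(sequence: str) -> float:
    """Monoisotopic mass of the neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def group_isobaric(peptides: Iterable[str]) -> Dict[float, List[str]]:
    """Group peptides whose masses agree to four decimal places."""
    groups: Dict[float, List[str]] = defaultdict(list)
    for peptide in peptides:
        groups[round(monoisotopic_mass(peptide), 4)].append(peptide)
    return dict(groups)


if __name__ == "__main__":
    for mass, members in sorted(group_isobaric(["IPP", "LPP", "PIP", "VPP", "GA", "AG"]).items()):
        print(f"{mass:.4f} Da: {', '.join(members)}")
```

Since the precursor mass cannot separate the members of such a group, identification has to rely on fragment ions and, where these are also shared, on chromatographic retention.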

However, these approaches require additional and laborious steps prior to MS/MS analysis [32]; thus, alternative MS-based approaches were proposed. Nanostructure laser desorption/ionization (NALDI) is one of them [33], and it was employed for the identification of low molecular weight peptides derived from bovine milk and colostrum. Unlike MALDI, NALDI is a matrix-free technique and does not suffer from matrix background in the mass range below m/z 700; as a result, mass spectra of small peptides are obtained with high sensitivity and very low chemical background. A synergistic effect can be obtained by coupling information from chromatography with MS analysis. In this regard a significant example was provided by Le Maux and co-workers, who exploited HILIC separation to differentiate peptides with homologous sequences by linking the retention time to the apparent hydrophilicity coefficient and to peptide size with the use of a specific algorithm [34]. Finally, an important contribution can be provided by improvements in the analytical instrumentation; for instance, simple ultrafiltration with a 5-kDa MWCO followed by nanoUPLC analysis coupled to high resolution MS/MS allowed one to directly analyze 17 short peptide sequences without any chemical derivatization [32].

Endogenous BP analysis

Methods and applications described so far focused on the analysis of BPs generated after in vitro simulated digestion, which have been extensively studied. However, the in vitro digestion approaches cannot reveal the endogenously produced peptides that are present in milk and derivative products, which remain poorly investigated. The analysis of endogenous BPs poses additional challenges to the ones discussed above. First, isolation techniques must address the enrichment of naturally occurring peptides in the matrix, and adaptations of the previously discussed protocols can tackle this issue. More challenging is the identification by conventional proteomics approaches, which may not completely fulfill the needs of peptide mixtures of partially unknown origin, for two reasons. First, protein databases are often incomplete for organisms which are not completely sequenced; such cases require the search to be extended from a single organism to a genus or family. Second, BPs may be produced during consumption or processing, namely by events which are more complicated than a simulated digestion and which may not be completely elucidated. Endogenous peptides can be produced in a variety of ways and thus they differ from the tryptic peptides generated in typical proteomics experiments because of the variable mode of action of endogenous proteases (which, in turn, may be present in the considered organism, such as proteases in mammary glands for milk, or derived from other organisms, such as microorganisms in milk-derived products) or because of completely unspecific processes (e.g., thermal treatments). In this latter case identification by matching peptide and fragment masses to sequence databases becomes complicated because enzyme specificity is partially or completely lacking; possible solutions to this issue are de novo sequencing and protease-unspecific database searches.

De novo peptide sequencing derives the sequence directly from fragment ion spectra and does not need to match them to lists of fragment ions obtained in silico from protein sequence databases by predefined rules, such as tryptic cleavage. For this reason, the method allows one to identify atypically digested peptides. Another possibility to identify atypical peptides is the unspecific database search (namely a search without any enzyme specified for predefined digestion); in the latter case all possible cleavage sites are considered for generating fragment ion lists from protein sequence databases. With this second approach the search space is much larger than in a search with a selected digestion rule, and the possibility of false identifications is increased; thus, manual inspection of spectra is needed [2, 4].
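
The difference in search space between an enzyme-specific and an unspecific search can be illustrated by simply counting candidate peptides. The sketch below is a toy comparison under stated assumptions: trypsin is modelled with the commonly used simplified rule (cleavage C-terminal to K or R, but not before P, with no missed cleavages), the unspecific search is modelled as every substring within a length window, the sequence is invented, and the function names are hypothetical rather than taken from any search engine.

```python
from typing import Set


def tryptic_peptides(protein: str, min_len: int, max_len: int) -> Set[str]:
    """Fully tryptic peptides (no missed cleavages) within the length window."""
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        # Cleave after K/R unless the next residue is P; always close at the C-terminus.
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            piece = protein[start:i + 1]
            if min_len <= len(piece) <= max_len:
                peptides.add(piece)
            start = i + 1
    return peptides


def nonspecific_peptides(protein: str, min_len: int, max_len: int) -> Set[str]:
    """Every substring within the length window (no enzyme rule)."""
    n = len(protein)
    return {
        protein[i:j]
        for i in range(n)
        for j in range(i + min_len, min(i + max_len, n) + 1)
    }


if __name__ == "__main__":
    toy_protein = "AKPLRGYKDEFRMNPKSLVTGEAIQK"  # invented sequence
    specific = tryptic_peptides(toy_protein, 2, 20)
    unspecific = nonspecific_peptides(toy_protein, 2, 20)
    print(len(specific), "tryptic candidates")
    print(len(unspecific), "unspecific candidates")
```

Even for a single short sequence the unspecific candidate list is far longer, and it grows roughly with protein length times the width of the length window, which is why false identifications become more likely.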

Despite the intrinsic challenges of endogenous peptide analysis, in recent years some studies have suggested that the peptide fraction of raw milk contains a great variety of BPs which are mainly released during microbial fermentation, thermal treatment, or storage steps. Two similar studies were performed on cow [35] and donkey milk [36]. In both studies complementary analytical workflows were applied to obtain the largest number of identifications. The identified peptides were ascribed to the most abundant proteins (αs1-CN for cow milk, β-CN and αs1-CN for donkey milk). In both studies no enzyme was specified for peptide identification and, in the case of donkey milk, the search was extended to the genus to compensate for the lack of complete sequencing.

Outlook

Research on BPs from milk and its derivative products has been developing rapidly, but many aspects still need further work: improved peptide isolation and separation (robust, efficient, sensitive, and cost-effective techniques), new strategies for the identification of very short peptides (less than 5 AAs), and new algorithms for bioactivity prediction. Moreover, individual variability must be taken into account, especially for the discovery of BPs using simulated gastrointestinal protocols. Human digestion is a complex process wherein ingested food is broken down into nutrients and both mechanical and enzymatic processes take place. In static in vitro models, proteins are sequentially exposed to conditions that simulate the mouth, stomach, and intestine environments (different pH values and enzymes). Static models are an oversimplification of reality, in which many of the physical processes that occur in vivo are not taken into account [37].

Despite the progress, the information available is mainly related to in vitro data and there is limited clinical evidence to justify the production and the use of BPs as nutraceuticals or functional food components. Nevertheless, the development of the study of BPs is of great interest for pharmaceutical and nutraceutical applications. Although the potential of milk proteins and peptides for the formulation of functional foods has been long demonstrated [38], further efforts are needed to increase their commercialization, with the development of economically feasible methods for the large-scale production of bioactive milk components.

The main challenges to commercialization of BPs are due to the little attention paid to their bioavailability and biodistribution after ingestion and also to inadequate clinical evidence of bioefficacy.

In vitro approaches could help in reducing animal studies, but validation against in vivo studies is essential. Moreover, evidence of the mechanism of action as well as knowledge about dose and toxicological studies is fundamental for approval by regulatory authorities (European Food Safety Authority, European Medicines Agency, Food and Drug Administration). Large-scale clinical studies similar to the ones used for drugs should be pursued, taking into consideration how these peptides behave in the gastrointestinal tract, the amount which is absorbed and enters circulation, the distribution and transformation of the original peptide, and excretion. Some of these challenges have been considered. For instance, delivered BPs can sometimes be damaged as a result of proteolytic attack. Chemical modification of the peptide backbone has been used to increase the stability of peptides in biological fluids. This is achieved via techniques such as amidation, polymer conjugation, and the introduction of disulfide bonds [39]. Moreover, new delivery strategies based on encapsulation (i.e., microencapsulation and nanoencapsulation methods) have been shown not only to preserve peptide stability in food and during digestion but also to improve BP delivery to target tissues [38].

For these reasons future research should focus on the in vivo study of stability, availability, and accessibility of identified BPs as well as their biodistribution and absorption.

Once all this knowledge is available, a commercializable product must be devised, considering peptide taste and palatability as well, to provide a formulation which is good (either alone or as an additive in functional food) for the consumer and stable. Taste evaluation can be facilitated by the development of new instrumental sensors for taste or by the use of cell assays, as suggested by recent works in this field [12]. Finally, a large-scale extraction or production is needed for commercialization of the final product. In this regard, the development of BP applications for pharmaceutical and nutraceutical products is already a field of interest. In particular, the development of digestion protocols for BP production can provide a means to enhance the value of food industrial by-products (such as whey). Thus, although the potential of milk proteins and peptides for the formulation of functional food has been long demonstrated [39], further efforts are needed to increase their commercialization, with the development of economically feasible methods for the large-scale production of bioactive milk components.