Introduction

Proteomics, defined as the in-depth analysis of protein repertoires within a given species, organ, or organelle, has considerably extended our knowledge of an ever-expanding number of proteins in the last few years [1]. As a result, more than 550,000 proteins have now been described in the literature, out of a total of more than 63 million protein sequence entries currently referenced in public databases [2]. Whereas proteomic methods are broadly applied to decipher numerous physiological and pathophysiological mechanisms [3–7], we focus in the present review on their current concrete applications to the field of allergy [8••]. Food and respiratory allergies currently represent a major public health burden, with an increasing prevalence [9–12, 13•]. In this context, proteomics is of great interest for the identification and structural characterization of the allergens involved. In addition, immune epitopes derived from such allergens can be defined using technologies such as X-ray diffraction or hydrogen deuterium exchange (HDX) mass spectrometry (MS). Moreover, proteomics can provide semi-quantitative or quantitative information on those molecules, for example, within food or environmental samples, as well as in pharmaceutical-grade natural extracts used for allergen-specific immunotherapy (AIT). Proteomic approaches are also useful for the in vitro diagnosis of allergen sensitization and for the search for biomarkers of AIT efficacy. Herein, we review each of those applications of proteomics to allergen identification, characterization, quantification, and quality testing, as well as the discovery of biomarkers of AIT efficacy.

Proteomics Dedicated to Allergen Identification

Identification is critical for the detection of allergens not only in food and beverages but also within AIT products from the biopharmaceutical industry. Hundreds of known allergens are proteins or glycoproteins [14••]. Those allergens are commonly identified based on their IgE reactivity and subsequently classified as minor or major allergens according to the prevalence of IgE sensitization to them [15]. In the early 1980s, the combination of modern biochemistry with molecular biology led to the identification of a rapidly growing number of new allergens that were further purified, sequenced, and expressed as recombinant proteins. Those allergens, originating from animal dander, foods, mites, pollens, whole insects or derived venoms, and yeasts, were initially partially sequenced by Edman degradation to identify the first 10 to 30 amino acids, which subsequently allowed gene cloning and sequencing [16–19]. Within the last two decades, MS, and most particularly tandem MS (MS/MS), has progressively and advantageously replaced Edman degradation to obtain entire allergen sequences [20–26]. MS has also contributed to the detection and elucidation of a number of post-translational modifications such as N-glycosylation, including some important cross-reactive carbohydrate determinants, for instance the galactose-alpha-1,3-galactose motif [21, 27–34].

Immunological methods have been the gold standard for allergen identification or detection and are routinely used for this purpose in many laboratories performing dot blots or western blots [35, 36]. The selectivity of these methods relies upon the specificity of the antibody used, either polyclonal or preferably monoclonal (mAb). However, the production of suitable mAbs is cumbersome and difficult, since neither the specificity nor the stability of the obtained product is guaranteed a priori. Moreover, matrix interference, cross-reactivity, and other molecules such as antibodies or lectins can interfere with or abolish the detection of the allergen of interest [37]. Aptamers (i.e., oligonucleotides or peptides) selected for their ability to bind to a specific target molecule represent an attractive alternative to mAbs, but this technology is currently not widely available [38]. In this context, MS carries significant advantages by providing virtually reagent-free allergen identification, based on the acquisition of MS/MS data from the allergen of interest followed by an in silico comparison with protein sequence databases. For instance, a single multiplexed MS method can specifically detect egg, milk, and soy allergens, with detection levels ranging from 0.1 to 2 μg/g of food sample [39–41, 42•].
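
The in silico comparison step mentioned above can be illustrated with a deliberately simplified Python sketch. It digests database sequences with trypsin rules and matches observed peptide masses within a ppm tolerance. This is a peptide-mass matching toy, not the fragment-ion scoring performed by real MS/MS search engines, and it ignores missed cleavages and modifications such as cysteine alkylation. The function names, the tolerance, and the toy sequences are assumptions made for illustration only.

```python
# Monoisotopic residue masses in Da (standard values for the 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide (free N- and C-termini)


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_peptides(observed_masses, database, tol_ppm=10.0):
    """Return, for each database entry, the tryptic peptides whose theoretical
    mass matches at least one observed mass within tol_ppm."""
    hits = {}
    for name, sequence in database.items():
        matched = []
        for peptide in tryptic_digest(sequence):
            theoretical = peptide_mass(peptide)
            if any(abs(obs - theoretical) / theoretical * 1e6 <= tol_ppm
                   for obs in observed_masses):
                matched.append(peptide)
        hits[name] = matched
    return hits


# Hypothetical toy entries, not real allergen sequences.
toy_database = {"toy_A": "AKRPGK", "toy_B": "GGGK"}
hits = match_peptides([217.14263], toy_database)
```

In practice, the entry with the most (and best-scoring) matched peptides would be reported as the candidate identification.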

In the last few years, allergen identification has benefited from combined approaches associating proteomics with transcriptomics and IgE immunoblotting. Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) combined with MS is a method of choice to discover new allergens, especially major allergens [43–45]. Briefly, the proteome from a biological sample (e.g., aqueous mite or pollen extract) is first subjected to isoelectric focusing (i.e., first dimension) to fractionate proteins based on their charge, followed by a separation according to molecular mass (i.e., second dimension). As a result, hundreds of proteins are usually resolved as small spots within the polyacrylamide gel, which can then be transferred to a membrane by western blotting. The membrane is then probed with sera from allergic patients, highlighting IgE-reactive spots that are subsequently characterized by MS/MS. When limited or no data are available in protein databases, transcriptomic information (i.e., data from RNA sequencing) is highly valuable to identify new allergens of clinical importance. This combined approach based on omics technologies was successfully implemented by us and others to characterize allergen extracts obtained from house dust mites or from pollens of tropical grasses or ragweed [46–48]. Moreover, we applied MS as a release testing identification method to confirm that a drug product made of a mix of five grass pollens consistently contains grass pollen group 1 major allergens (namely, Ant o 1, Dac g 1, Lol p 1, Phl p 1, and Poa p 1) originating from each of the selected grass species [31].

Unraveling Allergen Epitopes Through Proteomics

Understanding the allergen/antibody binding interaction (as well as potential cross-reactivity) through the mapping of epitopes is an essential component in the development of both immunoassays and AIT drug products [49–51]. To this aim, complex methodologies such as nuclear magnetic resonance (NMR) spectroscopy or X-ray diffraction were successfully implemented to characterize a small number of allergen epitopes [52–55]. Antibodies raised against allergen-derived peptides were also successfully used to compete with the patients’ IgEs for binding to the native allergen, thereby identifying immune epitopes [56]. Broader information can also be obtained through the simple production and testing of synthetic overlapping peptides that cover the entire allergen sequence [57–59]; however, this approach fails to identify most conformational epitopes [60].
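
As an illustration of the overlapping peptide strategy, the following Python sketch tiles a protein sequence into fixed-length peptides shifted by a constant offset, and adds a final peptide when needed so that the C-terminus is covered. The peptide length, the offset, and the function name are assumptions chosen for the example; the cited studies may have used different designs.

```python
def overlapping_peptides(sequence, length=15, offset=5):
    """Return (start_position, peptide) tuples covering the whole sequence.

    start_position is 1-based. length and offset are illustrative defaults.
    """
    if length <= 0 or offset <= 0 or offset > length:
        raise ValueError("need 0 < offset <= length to cover every residue")
    if len(sequence) <= length:
        return [(1, sequence)]
    peptides = []
    for start in range(0, len(sequence) - length + 1, offset):
        peptides.append((start + 1, sequence[start:start + length]))
    # Add a last peptide if the regular tiling stops short of the C-terminus.
    last_start = len(sequence) - length
    if peptides[-1][0] - 1 != last_start:
        peptides.append((last_start + 1, sequence[last_start:]))
    return peptides


# Hypothetical 10-residue toy sequence, tiled as 4-mers shifted by 3 residues.
library = overlapping_peptides("ACDEFGHIKL", length=4, offset=3)
```

Because each synthetic peptide is short and unstructured, such a library mainly reports on linear epitopes, which is consistent with the limitation noted above for conformational epitopes.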

HDX-MS can be used to locate linear and conformational epitopes in a manner that complements classical structural approaches [61–63]. HDX-MS probes the solvent accessibility of proteins in their native state based on the rate of exchange of backbone amide hydrogens (H) for deuterium (D). In this regard, the binding of a mAb to a target allergen reduces the solvent accessibility of the epitope, so that epitope-containing peptides incorporate less deuterium, and therefore show a smaller mass increase, in the complex than in the free allergen. The technology was established several years ago and is now available on fully automated instruments, for instance to develop and compare biologicals such as therapeutic mAbs [64].

We have used HDX-MS to identify an epitope specifically recognized by a Der p 1-specific mAb (namely, 5H8 from Indoor Biotechnologies, Cardiff, UK). Briefly, Der p 1-5H8 complexes were obtained at 30 °C prior to labeling with a deuterated PBS buffer. Over the time course of the experiment (ranging from 1 min to 1 h), aliquots of the complex were recovered, quenched, and dissociated prior to MS analysis. Our HDX-MS data confirmed that the 5H8 mAb recognizes a conformational epitope located in the B domain of Der p 1 (Fig. 1a), composed of segments Thr48-Ala57, Tyr82-Tyr93, and the Ser102-Ile113 loop. An equivalent epitope was identified by X-ray crystallography (Fig. 1b), in agreement with the HDX-MS approach [65–67]. Such data demonstrate that conformational epitopes can be identified by measuring the change in deuterium uptake between the free and bound allergen. Compared to X-ray crystallography, HDX-MS results are of medium resolution (i.e., 8–10 residues), but the technology requires low quantities of biological material (i.e., 5–10 pmol per injection) and provides a rapid (1 to 2 days) and efficient way to map epitopes.
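
The comparison of deuterium uptake between free and bound allergen can be sketched as follows. This Python example is a minimal illustration and not the data processing used for the study described above: the peptide labels, the uptake values, and the 0.5 Da threshold are hypothetical. For each peptide, it averages the difference in uptake (free minus bound) over the labeling time points and reports peptides that are protected in the complex.

```python
def protected_peptides(uptake, threshold_da=0.5):
    """Flag peptides whose mean deuterium uptake is lower in the complex.

    uptake maps a peptide label to {labeling_time_s: (free_da, bound_da)}.
    threshold_da is an illustrative cutoff, not a validated criterion.
    """
    protected = {}
    for peptide, timecourse in uptake.items():
        differences = [free - bound for free, bound in timecourse.values()]
        mean_difference = sum(differences) / len(differences)
        if mean_difference >= threshold_da:
            protected[peptide] = mean_difference
    return protected


# Hypothetical uptake values (Da) at 1 min and 1 h of labeling.
example_uptake = {
    "peptide_A": {60: (2.0, 1.0), 3600: (4.0, 2.0)},  # protected upon binding
    "peptide_B": {60: (1.5, 1.5), 3600: (3.0, 2.9)},  # essentially unchanged
}
candidates = protected_peptides(example_uptake)
```

Peptides returned by such a comparison are only candidate epitope regions; overlapping peptides and replicate measurements would be needed to narrow down the protected residues.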

Fig. 1

Comparison of the Der p 1 epitope targeted by Fab 5H8 as elucidated by (a) HDX-MS and (b) X-ray crystallography. The binding of Fab 5H8 reduces the solvent accessibility of three regions within the B domain of the Der p 1 allergen, covered by peptides Tyr48-Ala57 (red), Tyr82-Tyr93 (green), and Ser102-Ile113 (blue). Equivalent regions were identified by X-ray crystallography. The amino acid residues involved in hydrogen bonding upon complex formation are also displayed. An excellent agreement between the two epitope mapping strategies was observed.

Proteomics for Allergy Diagnosis

Proteomic approaches can also be used to perform molecular allergy diagnosis. In routine practice, the anamnesis, combining symptom assessment and skin prick testing data, is the cornerstone of allergy diagnosis [68–70]. Diagnosis also relies upon in vitro specific IgE binding assays, making use of microarrays of purified allergens as part of a “component resolved diagnostic” approach to improve routine clinical care. Currently, microarrays carrying over 100 distinct allergens offer the opportunity to characterize patients’ IgE sensitization patterns to multiple allergens in a single analysis with small blood volumes and to distinguish true sensitization from cross-reactivity [71, 72•, 73]. The clinical interest of such technologies, including point-of-care or near-patient diagnostics, is however currently limited by the availability of highly purified, well-characterized, and stable allergens. In the near future, proteomic-based miniaturized devices allowing a more accurate, faster, and simpler diagnosis of allergic sensitization will likely contribute to the emergence of personalized AIT tailored for individual allergic patients [74•].

In this context, new nanoscale biosensors are being developed to assess patients’ circulating IgEs from as little as 50 μL of blood. For instance, the abioscope apparatus (Abionic SA, Lausanne, Switzerland) is a novel small-footprint device that quantifies allergen-specific IgEs (e.g., directed to Can f 1, Der p 1, Fel d 1, or Phl p 1) in 5 min, thanks to nanofluidic biosensors that enhance molecular interactions and reduce incubation time from hours to minutes. Results obtained with this technology are comparable to those generated using the ImmunoCAP technology (Thermo Fisher Scientific, Uppsala, Sweden) [71].

Proteomics to Document Pharmaceutical Quality

AIT was shown to restore appropriate immunoregulatory responses in allergic patients, thereby alleviating clinical symptoms and reducing the use of symptomatic drugs [75–82]. Currently, AIT treatments rely upon standardized allergen extracts obtained from natural source materials (e.g., mites, pollens) produced and tested as per health authorities’ recommendations [83, 84]. The guideline on allergen products (CHMP/BWP/304831/2007) provides European manufacturers with quality recommendations for the production of allergen products intended for the diagnosis or treatment of allergic diseases [85]. Importantly, health authorities request a specific identity test for the allergenic source materials used for drug manufacturing. To address this requirement, electrophoretic (PAGE) or immunological (western blotting, enzyme-linked immunosorbent assay (ELISA)) methods are frequently performed, even though the specificity of those approaches is not always fully documented. We and others have shown that a fast and simple MS acquisition can identify and distinguish several pollen source materials [86].

Biotyping based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-ToF) MS was originally applied to the reliable and swift identification of pathogenic microorganisms in clinical and veterinary microbiology [87–89]. Using this straightforward methodology, we could reproducibly identify source materials from multiple species including insect venoms (i.e., Apis mellifera, Polistes spp., Vespula spp.), molds (i.e., Alternaria alternata, Aspergillus fumigatus, Aspergillus niger, Cladosporium herbarum, Cladosporium IHEM, Penicillium notatum), grass and cereal pollens (i.e., Agrostis, Anthoxanthum odoratum, Avena fatua, Avena sativa, Bromus hordeaceus, Cynodon dactylon, Dactylis glomerata, Hordeum vulgare, Lolium perenne, Phleum pratense, Poa pratensis, Secale cereale, Triticum, Ventenata dubia, Zea mays), tree pollens (i.e., Chamaecyparis obtusa, Cryptomeria japonica, Juniperus ashei, Olea europaea), and house dust mites (i.e., Blomia tropicalis, Dermatophagoides farinae, and Dermatophagoides pteronyssinus). This technology, in addition to being user-friendly, requires only a few milligrams of product to allow unambiguous source material identification. As illustrated in Fig. 2, even closely related species can be identified and distinguished on the basis of their unique molecular compositions when assessed by MALDI-ToF MS proteomics. We thus believe that this method may be suitable for release testing of most allergenic source materials.

Fig. 2

Composite correlation index matrix visualization (Biotyper, Bruker Daltonics) of 31 different allergenic extracts assessed by MALDI-ToF MS (AutoFlex Speed, Bruker Daltonics). Reddish (hot) colors mark closely related species. Bluish (cold) colors mark non-related species. Proteins from the allergenic extracts were resuspended in a 70 % formic acid solution and sonicated. One microliter of sample was deposited on the MALDI target with 1 μL of α-cyano-4-hydroxycinnamic acid matrix solution and dried prior to MALDI-ToF acquisition.
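
The idea behind a spectrum similarity matrix such as the one in Fig. 2 can be sketched in a few lines of Python. The example below is a simplification: it uses a plain Pearson correlation between binned intensity vectors and does not reproduce the composite correlation index algorithm of the Biotyper software. The sample names and intensity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally binned intensity vectors."""
    if len(x) != len(y):
        raise ValueError("spectra must share the same m/z binning")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat spectrum carries no pattern to compare
    return covariance / (spread_x * spread_y)


def similarity_matrix(spectra):
    """Pairwise correlation between all spectra, as a nested dictionary."""
    names = list(spectra)
    return {a: {b: pearson(spectra[a], spectra[b]) for b in names}
            for a in names}


# Hypothetical binned intensities for three source materials.
toy_spectra = {
    "extract_1": [1.0, 2.0, 3.0, 4.0],
    "extract_2": [2.0, 4.0, 6.0, 8.0],  # same pattern as extract_1
    "extract_3": [4.0, 3.0, 2.0, 1.0],  # opposite pattern
}
matrix = similarity_matrix(toy_spectra)
```

Values close to 1 would correspond to the "hot" colors of such a matrix, and low or negative values to the "cold" colors.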

In the interest of allergic patients, the quality of AIT biologicals must be properly documented with respect to their composition, consistency, and stability using state-of-the-art and validated analytical methods. Specifically, proper and consistent allergen dosing is critical to guarantee AIT safety and efficacy. To this aim, allergenic extracts must be standardized based on their potency, which reflects their ability to bind IgEs from allergic patients. In addition, such biological drug products must contain consistent, defined, and clinically efficacious amounts of major allergens [90–92, 93•, 94]. Despite a sustained interest in purified recombinant allergens to perform AIT, natural standardized allergen extracts remain the only authorized therapeutic option, although documenting the pharmaceutical quality of such extracts is far more complex than for recombinant allergens [95]. Immunological methods, especially ELISA, are routinely used to assess the major allergen content of natural extracts [94]. This method depends on the specificity of the mAbs as well as on the quality of the reference standard, and concern has arisen that some allergen proteoforms (or isoallergens) might not be properly quantified by antibody-based assays [91, 93•].

In this context, MS-based allergen quantification methods were recently developed to circumvent potential specificity issues associated with ELISA [96–99, 100•, 101]. Currently, this methodology relies on the quantification of peptides derived from the allergen following enzymatic digestion. Briefly, the allergenic extract (or food or beverage sample) is reduced and alkylated to disrupt disulfide bonds within the allergens, thereby facilitating their proteolysis by trypsin. The resulting proteotypic peptides are subsequently separated by liquid chromatography and quantified by triple quadrupole MS/MS. As developed in the 1990s, the reference standard can be one (or more) synthetic isotopically labeled peptide with a sequence identical to that of the peptide derived from the protein to be quantified [102]. Because this analytical procedure assumes that 100 % of the allergen is digested during sample preparation, an alternative and preferred method is based on the standard addition of non-labeled intact allergen as a reference standard. The main advantage of this technique is that it provides accurate and comprehensive allergen quantification, irrespective of the extent of target allergen digestion. Using this standard addition method, we measured 30-fold higher absolute amounts of grass pollen group 1 allergen by MS than with a dedicated ELISA, likely as a consequence of the variability of grass pollen allergens, with some isoallergens poorly recognized by either one of the mAbs used in the ELISA [31, 103]. Overall, the most compelling reasons to recommend the implementation of allergen quantification by MS for release testing of AIT drug products are the unrivaled specificity and comprehensiveness of this method.
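
The standard addition principle can be illustrated with a short calculation. In the Python sketch below, which is not taken from the cited studies, known amounts of intact allergen are assumed to be spiked into identical aliquots of the sample before digestion, the peptide signal is fitted against the spiked amount by least squares, and the endogenous amount is estimated as intercept divided by slope. Because spiked and endogenous allergen undergo the same digestion, incomplete proteolysis affects the slope and the intercept alike. The function name and the numbers are illustrative assumptions.

```python
def standard_addition(added_amounts, signals):
    """Estimate the endogenous analyte amount by standard addition.

    added_amounts: spiked amounts of intact allergen (same unit as the result).
    signals: peptide signal measured for each spiked aliquot.
    Returns intercept / slope of the least-squares line signal = f(added).
    """
    n = len(added_amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(added_amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in added_amounts)
    if sxx == 0:
        raise ValueError("spiked amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(added_amounts, signals))
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("signal does not increase with the spiked amount")
    intercept = mean_y - slope * mean_x
    return intercept / slope


# Hypothetical series: signal = 2 * (added + 5), so 5 units are endogenous.
estimate = standard_addition([0.0, 1.0, 2.0, 3.0], [10.0, 12.0, 14.0, 16.0])
```

With real data, the linearity of the response over the spiked range would need to be verified before relying on the extrapolated value.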

Identification of Biomarkers of AIT Efficacy

There is currently a growing interest in identifying biomarkers guiding the physician’s decision to initiate, continue, or terminate AIT [104–108]. A biomarker is defined as a molecule that is detected or quantified in body fluids to differentiate a patient from a healthy individual or to document the impact of a treatment [109]. To date, very few studies related to AIT have revealed candidate biomarker molecules correlating with clinical benefit at an individual patient level [110]. Such biomarkers would be most valuable for many stakeholders, including but not limited to physicians and patients as well as health authorities and payers. The search for such biomarkers now benefits from the combined use of omic technologies, as part of a “panoromic” approach [111•].

We have applied a combination of proteomic methods, namely, 2D differential gel electrophoresis (2D-DiGE) and label-free MS, in order to identify candidate biomarkers of clinical efficacy by comparing the proteomes of various subtypes of effector and regulatory human dendritic cells (DCs). Briefly, 2D-DiGE consists in labeling the proteins from the samples to be compared with different dyes and then subjecting the labeled proteins to 2D-PAGE. Following separation, gel images are acquired at a wavelength specific to each dye and statistically compared for protein abundance. Similarly, label-free MS consists in a statistical analysis of protein amounts within the compared samples and also allows the quantification of hundreds of proteins. Those two orthogonal and semi-quantitative methods identified, among other molecules, complement component 1 (C1Q) and the receptor stabilin-1 as candidate markers of the clinical tolerance induced by grass pollen AIT [112]. In combination with qPCR, an extensive label-free MS study revealed at least five proteins that are differentially expressed in DC2s and DCreg cells, thereby confirming that AIT modifies key components of the innate immune system within 2 months of treatment [113]. These semi-quantitative methods were further applied to compare sera from grass pollen allergic patients enrolled in a double-blind placebo-controlled study performed in an exposure challenge chamber [114]. As a result, we observed differences in post-translational modification of serum α-2-HS-glycoprotein (or fetuin A) when comparing sera from patients exhibiting clinical responses with those from weak AIT responders (manuscript in preparation).
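
The statistical comparison of protein abundance between two groups of samples can be sketched as follows. This Python example is only an illustration of the principle: it computes a log2 fold change and a Welch t statistic for a single protein, whereas dedicated software for 2D-DiGE or label-free MS also handles normalization, missing values, and multiple-testing correction. The function name and the abundance values are hypothetical.

```python
import math
from statistics import mean, variance


def compare_abundance(group_a, group_b):
    """Return (log2 fold change of B over A, Welch t statistic).

    Each group is a list of positive abundance values for one protein,
    with at least two replicates per group.
    """
    mean_a, mean_b = mean(group_a), mean(group_b)
    log2_fold_change = math.log2(mean_b / mean_a)
    standard_error = math.sqrt(variance(group_a) / len(group_a)
                               + variance(group_b) / len(group_b))
    if standard_error == 0:
        # No within-group variability: the statistic is undefined or unbounded.
        t_statistic = 0.0 if mean_a == mean_b else math.copysign(
            math.inf, mean_b - mean_a)
    else:
        t_statistic = (mean_b - mean_a) / standard_error
    return log2_fold_change, t_statistic


# Hypothetical abundances of one protein in two groups of three samples.
fold_change, t_value = compare_abundance([9.0, 10.0, 11.0], [19.0, 20.0, 21.0])
```

A candidate marker would typically be retained only if both the fold change and the statistical evidence pass predefined thresholds, and would then be confirmed by an independent method.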

Overall, even though the identification of biomarkers based on proteomics is still a lengthy, resource-demanding, and complex process, it remains a unique approach to identify proteins or protein isoforms (e.g., glycoforms) that represent candidate biomarkers of AIT efficacy. For this reason, proteomics is usually applied to a limited number of sera (up to 100), and simpler methods (e.g., ELISA, qPCR) are subsequently applied to a larger number of patients to validate those molecules.

Conclusions

Current allergy care can benefit from many applications of proteomics, whether through specific allergen detection in numerous matrices, in vitro diagnosis, or documentation of the quality and consistency of biological products intended for safe and efficacious AIT. In this context, we speculate that mass spectrometry and hyphenated techniques (including powerful biocomputing dedicated to data mining) will play an expanding role in the field of allergy. During the next decade, we believe that proteomics will keep paving the way for (i) improved understanding of the pathophysiology of allergic diseases, (ii) unambiguous identification and characterization of allergens, (iii) highly specific and comprehensive allergen quantification, and (iv) enhanced molecular diagnosis including the identification of efficacy biomarkers. Lastly, it is also through the use of state-of-the-art proteomics, likely in combination with genomics and next-generation sequencing, that second-generation allergen immunotherapy drug products will be made available for allergic patients.