1 Introduction

Pollen analysis has been used in many fields, such as environmental monitoring (Ribeiro et al. 2015), agriculture (Cunha et al. 2016), paleobotany (Seddon et al. 2019; Schopf et al. 2016), forensic science (Orijemie and Israel 2019; Pereira et al. 2020) and medicine (Lo et al. 2019; Medek et al. 2019). Pollen is one of the most common triggers of seasonal allergic reactions: when inhaled, the proteins it carries cause symptoms in some individuals, and the number of individuals suffering from allergies has grown exponentially in recent years (Sedghy et al. 2017).

Traditionally, pollen identification and quantification are performed manually by light microscopy, a process that is time consuming and requires a trained observer to perform it objectively. Moreover, some important allergy-inducing pollen types, like Phleum or Dactylis (García-Mozo 2017), cannot be identified to the genus level. The task would benefit from a faster and better-resolved identification of pollen species, and this is an area where research has been carried out over the years (Rittenour et al. 2012; Sharma-Ghimire et al. 2019).

Image-based approaches have been used for the identification of pollen and other biological particles for a number of years (France et al. 2000; Ranzato et al. 2007). They rely on the analysis of microscopic images through image processing and detection techniques, and the introduction of texture characterization has improved the classification performance for distinct pollen types (Marcos et al. 2015). More recently, real-time automatic pollen recognition systems based on image processing techniques (Oteros et al. 2015) and digital holographic images (Sauvageat et al. 2020) have shown good results in the online identification of a number of pollen taxa.

Besides morphological features, which provide a good taxonomic distinction at the family, genus and, in some cases, even species level, pollen grains also differ in molecular features and chemical composition, which allows their identification (Depciuch et al. 2018; Zimmermann 2018). Recently, new methods for pollen identification and quantification have been developed with a view to automatic pollen identification (Šantl-Temkiv et al. 2020).

DNA-based techniques have been used for pollen identification and quantification as a substitute for the traditional methodology. The air samples analyzed are collected by standard methods, and DNA extraction follows using optimized methodologies (Rojo et al. 2019; Bell et al. 2016), which are needed because pollen DNA extraction is challenging; some problems involving the quantification of pollen abundance may still require other solutions (Baksay et al. 2020). Some studies have shown that this method can provide accurate qualitative discrimination among grass species (Brennan et al. 2019; Kraaijeveld et al. 2015) that, until now, could not be distinguished with image processing techniques.

Several spectrophotometric techniques have been tested and applied with the aim of automatic pollen identification: Fourier transform infrared spectroscopy (FTIR) (Muthreich et al. 2020; Xu et al. 2018; Zimmermann and Kohler 2014), ultraviolet light-induced fluorescence (UV-LIF) (Ruske et al. 2018; Forde et al. 2019), fluorescence spectroscopy (Mularczyk-Oliwa et al. 2012; Zhang et al. 2019) and Raman spectroscopy (Wang et al. 2015). At first, the different approaches were used only to discriminate bioaerosols, and in some cases even pollen, from other materials, biological or not, present in the air. In particular, the UV-LIF technique has evolved into a more elaborate system able to distinguish between pollen families and genera. Fluorescence-based equipment is being used to discriminate materials in the air: nonbiological and biological compounds are easily distinguished due to their intrinsic characteristics; however, bioaerosols like pollen and fungal spores are proving more challenging (Forde et al. 2019). Bağcıoğlu et al. (2015) applied 7 different FTIR and Raman spectroscopy methodologies to the same pollen samples and concluded that Raman microspectroscopy measurements, which focus on the corpus region of pollen grains, achieved one of the best taxonomy-based differentiations of pollen.

The detection system and the data collection are only part of an automatic pollen identification protocol; data analysis and classification (Okwuashi and Ndehedehe 2020) is currently an important subject. Developments in data science have provided valuable input to pollen classification based on spectroscopic features. Some studies use supervised machine learning techniques, in which the whole data set is divided into training and testing sets, as happens with SVM (support vector machine), NN (neural networks) or k-nearest neighbors, while others choose unsupervised learning algorithms, in which the data are analyzed as one group, e.g., hierarchical cluster analysis or k-means (Swanson and Huffman 2020).

The use of Raman spectroscopy in the pursuit of automatic pollen identification is not a new research field (Mondol et al. 2019; Wang et al. 2015; Ivleva et al. 2005; Schulte et al. 2008), but recent developments in terms of classification algorithms, high-throughput screening (Mondol et al. 2019) and the possible identification of airborne pollen (Doughty and Hill 2020; Guedes et al. 2014) may increase the resolution of single-pollen spectra and therefore improve the discrimination of pollen samples.

Raman spectroscopy is a nondestructive technique that does not require sample preparation, which is an advantage over other techniques suggested for pollen identification; moreover, aqueous or air samples can be analyzed with minimal interference (Weiss et al. 2019; Guedes et al. 2014).

Raman spectroscopy has evolved over the years. Initially, it was used to identify and separate pollen from known samples (controls) and to test different excitation wavelengths in order to ascertain the one best suited for pollen (Ivleva et al. 2005). The bands in a Raman spectrum are characteristic and may be assigned to specific chemical compounds, which makes it possible to discriminate them. The assignment of the pollen spectrum bands and their correlation with distinct pollen taxa constitute another important use of Raman spectroscopy in pollen identification and characterization (Schulte et al. 2008). Pollen grains are characteristically highly fluorescent, an issue that researchers using Raman spectroscopy must address. To enhance the information extracted and reduce noise, a variety of spectrum preprocessing techniques have been used, such as baseline correction, normalization and smoothing (Fukuhara et al. 2019).

Additionally, the Raman parameters obtained after deconvolution of the spectrum, such as the wavenumber, the intensity, the integrated intensity and the FWHM (full width at half maximum of the band), relate to chemical compounds of the pollen wall and can be characteristic of a specific taxon.

Therefore, in this work we aim to evaluate the capacity of the Raman parameters of pollen spectra to serve as a future technique for automatic pollen identification by simplifying data acquisition and reducing the volume of information to analyze. We tested the use of the parameters of only 7 band intervals common to all pollen species tested and used a support vector machine, run in a data science software package, for the classification.

2 Material and methods

2.1 Pollen collection

The pollen samples analyzed by Raman microspectroscopy were collected during the 2018 flowering season in the city of Porto, from gardens of the Faculty of Sciences of the University of Porto campus and from public parks. Pollen from 15 different species of trees, shrubs and weeds (Table 1), considered to induce allergic reactions (Galán et al. 2017), was analyzed. Three plants per species were sampled, and flowers/catkins were randomly collected from all quadrants of the plants, on different branches, until a small plastic box was filled. After separation from the other plant structures, the anthers were dried at 25 °C for 24 h and then sieved through different grades of sieves to separate the pollen from the rest of the plant material. Pure pollen was then collected. The samples were stored at −20 °C until analysis (Ribeiro et al. 2017).

Table 1 Pollen analyzed in the study, divided in type of plant and in pollen family

2.2 Raman spectra acquisition and processing

Before the analyses, pollen samples were taken from storage and left for 10 min at room temperature.

Raman spectra were acquired with an XploRA™ Raman microscope (Horiba Scientific, France), which combines optical microscopy with Raman spectroscopy using laser radiation and allows a “one shot” analysis. A 100× objective lens was used to focus the laser beam on the sample and to collect the Raman scattered radiation in backscattering geometry. The Raman signal was detected with a highly sensitive cooled charge-coupled device (CCD) detector.

Prior to each measurement, the Raman spectrum wavenumber was calibrated using a Si reference standard (520.6 ± 0.1 cm−1). Pollen samples were placed on a glass slide, and for each species 3 spectra from 3 different pollen grains were collected at an excitation wavelength of 785 nm from a diode laser at a power of 25 mW, using a diffraction grating with 1200 lines mm−1 and a slit of 300 μm. Extended scans were performed, with 5 scans of 50 s each measured on each pollen grain, in the spectral region from 1000 to 1800 cm−1 with a resolution of approximately 1 cm−1.

Raman spectra were preprocessed with an automatic polynomial baseline correction, to attenuate the influence of fluorescence, followed by a denoising procedure using the Savitzky–Golay algorithm to increase spectral quality. The spectra were then normalized to a constant area, with the area under the curve set to 100 (a.u.).
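The three preprocessing steps described above can be sketched as follows. This is only an illustration, assuming NumPy is available: the actual processing was done in LabSpec 6, whose internal algorithms are not described here, so the iterative polynomial baseline, the polynomial degree, the window length and all function names are assumptions.

```python
import numpy as np

def polynomial_baseline(x, y, degree=3, n_iter=50):
    """Iterative polynomial fit (assumed scheme): points above the fit are
    clipped on each pass, so the polynomial settles underneath the peaks."""
    work = np.asarray(y, dtype=float).copy()
    # Rescale x to [-1, 1] to keep the least-squares fit well conditioned.
    xs = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    for _ in range(n_iter):
        fit = np.polyval(np.polyfit(xs, work, degree), xs)
        work = np.minimum(work, fit)
    return fit

def savgol_smooth(y, window=9, order=3):
    """Savitzky-Golay smoothing: a local polynomial of the given order is
    least-squares fitted in each window and evaluated at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, order + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]      # weights for the centre value
    padded = np.pad(y, half, mode="edge")    # simple edge handling (assumption)
    return np.convolve(padded, weights[::-1], mode="valid")

def normalize_area(x, y, target=100.0):
    """Scale the spectrum so that its trapezoidal area equals `target`."""
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return y * (target / area)

def preprocess(x, y):
    corrected = np.asarray(y, dtype=float) - polynomial_baseline(x, y)
    return normalize_area(x, savgol_smooth(corrected))
```

Applied to a spectrum sampled on a wavenumber axis `x`, `preprocess(x, y)` returns an array of the same length whose area under the curve is 100.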

Afterward, each spectrum was deconvoluted using a mixed Gaussian–Lorentzian curve-fitting procedure to determine the precise Raman parameters: wavenumber (W), full width at half maximum of the band (FWHM) and integrated intensity (A). To reduce the influence of the natural variability of the spectrum intensity, a new parameter was calculated, R_area (pondered area), defined as the ratio of the integrated intensity of a band to the total integrated intensity of the deconvolution curve. For the fit of the spectral sets, 18 bands were used, which correspond to the aggregate of the principal bands present in the distinct pollen spectra.
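As an illustration of the quantities named above, the sketch below writes a mixed Gaussian–Lorentzian band as a linear blend of the two profiles (a pseudo-Voigt shape) and computes R_area from a list of band areas. The exact profile and fitting routine used by LabSpec 6 are not given in the text, so the blend, the `mix` parameter and the function names are assumptions; no curve fitting is performed here.

```python
import math

def gauss_lorentz(x, center, fwhm, height, mix=0.5):
    """Linear blend of a Gaussian and a Lorentzian sharing the same center,
    FWHM and height; mix=1 is pure Lorentzian, mix=0 pure Gaussian."""
    u = (x - center) / fwhm
    gaussian = math.exp(-4.0 * math.log(2.0) * u * u)
    lorentzian = 1.0 / (1.0 + 4.0 * u * u)
    return height * (mix * lorentzian + (1.0 - mix) * gaussian)

def band_area(fwhm, height, mix=0.5):
    """Analytic integrated intensity (A) of the blended band."""
    gaussian_area = fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))
    lorentzian_area = fwhm * math.pi / 2.0
    return height * (mix * lorentzian_area + (1.0 - mix) * gaussian_area)

def r_area(areas):
    """R_area: each band area divided by the total area of all fitted bands."""
    total = sum(areas)
    return [a / total for a in areas]
```

By construction the profile drops to half its height at `center ± fwhm / 2` for any value of `mix`, and the R_area values of one spectrum sum to 1, which is what makes them insensitive to the overall intensity of the spectrum.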

The software LabSpec 6 (Horiba Scientific, France) was used for spectra acquisition and deconvolution.

2.3 Data analysis and pollen classification

A matrix with all the Raman parameters obtained for each of the 18 bands considered in the deconvolution process was created. Only the seven band intervals common to all Raman spectra, chosen by visual inspection, were used in the classification of the pollen species. These common band intervals lie in the fingerprint areas 1000–1010, 1300–1460 and 1500–1700 cm−1; the data matrix therefore comprised a total of 21 classification features (7 band intervals × 3 Raman parameters).

For the pollen classification analysis, the open-source Orange 3.24.1 software package was used, which provides tools for data visualization and analysis, data mining and machine learning (Demsar et al. 2013).

The potential of the Raman parameters to accurately classify the distinct pollen species was evaluated by applying a supervised learning algorithm, SVM (support vector machine). SVM is based on the concept of finding a decision function that best separates the analyzed features into different groups. A hyperplane represents that separation, and the best hyperplane is the one that maximizes the distance (margin) between the groups and therefore gives the best classification or regression. This represents a linear classifier, but it is usual to find a nonlinear distribution of the data; in that case, kernel functions are used (a set of mathematical functions that map the data into a higher-dimensional space in which a nonlinear decision surface becomes linear). The objective remains the same, but the hyperplane adjusts differently to the data.
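For reference, the margin-maximization idea described above is usually written as the soft-margin problem below. This is the standard textbook formulation for a two-class SVM, not something taken from the study: the symbols w, b, ξ_i, α_i and φ, and the label coding y_i ∈ {−1, +1}, are notational assumptions, and how the software extends the two-class case to 15 species is not specified in the text.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w\rVert^{2} + c\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,

f(x) = \operatorname{sign}\Bigl(\sum_{i=1}^{n}\alpha_i\, y_i\, K(x_i, x) + b\Bigr),
\qquad K(x, x') = \phi(x)^{\top}\phi(x').
```

The first term widens the margin, the second penalizes training examples that fall inside it or on the wrong side, and the kernel K replaces the explicit mapping φ in the decision function.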

Our data matrix is composed of 45 spectra × 21 classification features. To minimize fitting problems in the classification algorithm, we randomly divided the data into 2 sets: a training group with 66.7% of the spectra (2 per pollen species, 30 spectra in total), used to estimate the best classification model, and a separate testing group with 33.3% of the study cases (1 per pollen species, 15 spectra in total). A radial basis function (RBF) kernel was selected, and its γ and c parameters were tuned by testing several combinations until the best train-test classification was met (c = 1.6 and γ = 0.05). The RBF kernel is a commonly used general kernel function in SVM classification and is defined as K_RBF(x, x′) = exp(−γ‖x − x′‖²). The γ parameter defines how far the influence of a single training example reaches, while the c parameter (common to all SVM kernels) acts as a trade-off between the correct classification of training examples and the maximization of the decision function’s margin: the smaller the value of c, the larger the margin that is accepted, at the cost of training accuracy (Sammut and Webb 2011). The precision (ratio of objects correctly assigned to a class to all objects assigned to that class) and the classification accuracy (ratio of correct classifications to total classifications made) attained were analyzed, and a confusion matrix (a two-dimensional table in which one dimension corresponds to the true class of an object and the other to the class that the classifier assigns) was used to summarize the performance of the classification algorithm (Sammut and Webb 2011).
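The kernel and the per-species split described above can be sketched in plain Python. This is an illustration only: the study used the SVM implementation in Orange, and the function names, the fixed random seed and the use of species labels as dictionary keys are assumptions made for the example.

```python
import math
import random

def rbf_kernel(x, x_prime, gamma=0.05):
    """K_RBF(x, x') = exp(-gamma * ||x - x'||^2); gamma = 0.05 is the tuned value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-gamma * squared_distance)

def split_per_species(labels, n_test=1, seed=0):
    """Randomly hold out `n_test` spectra of every species for testing and
    keep the remaining ones for training; returns two lists of row indices."""
    rng = random.Random(seed)
    rows_by_species = {}
    for index, species in enumerate(labels):
        rows_by_species.setdefault(species, []).append(index)
    train, test = [], []
    for rows in rows_by_species.values():
        shuffled = rows[:]
        rng.shuffle(shuffled)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return sorted(train), sorted(test)
```

With 15 species and 3 spectra each, `split_per_species` yields 30 training rows and 15 testing rows. Note that the kernel value depends on the scale of the features; whether and how the wavenumber, FWHM and area columns were rescaled before classification is not stated in the text.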

3 Results and discussion

3.1 Spectra analysis

Raman spectra give information about the chemical composition of pollen, containing specific signals of macronutrients such as lipids, proteins, carbohydrates, water and even some pigments (Zimmermann 2010; Schulte et al. 2008; Zimmermann and Kohler 2014; Bağcıoğlu et al. 2015; Pummer et al. 2013; Kenđel and Zimmermann 2020; Weglinska et al. 2020). As a result, the spectra are quite complex and vary between genera and even species, which is also noticeable in our results.

The Raman spectra obtained for the 15 pollen species show 18 distinct bands, characteristic of each pollen species, in the region between 1000 and 1800 cm−1 (Fig. 1); these bands were selected because they improve the deconvolution fit. Only 7 band intervals were common to all studied species, distributed in three fingerprint regions (Fig. 2). The average Raman spectra differ between the studied species, and particular features distinguish the spectra of tree species from those of grass species.

Fig. 1 Example of a spectrum of Artemisia vulgaris deconvoluted into a total of 18 bands

Fig. 2 Average Raman spectra of the 15 pollen species analyzed and main fingerprint areas marked in gray (1000–1010, 1300–1460 and 1500–1700 cm−1)

These 3 fingerprint regions are defined by the following 7 common band intervals at about [1000–1010 cm−1], [1305–1335 cm−1], [1340–1375 cm−1], [1440–1460 cm−1], [1525–1600 cm−1], [1600–1615 cm−1] and [1650–1665 cm−1].
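One way to turn a list of fitted bands into the 21 classification features is to keep, for each of the 7 intervals listed above, the wavenumber, FWHM and integrated intensity of the band that falls inside it. The sketch below is an assumption-laden illustration: the text does not say how a spectrum with more than one band in an interval was handled (the most intense band is kept here), the upper bound of each interval is treated as exclusive so that a band at exactly 1600 cm−1 is counted once, and the band values in the usage note are hypothetical.

```python
# The 7 common band intervals (cm-1), as listed in the text.
COMMON_INTERVALS = [
    (1000, 1010), (1305, 1335), (1340, 1375), (1440, 1460),
    (1525, 1600), (1600, 1615), (1650, 1665),
]

def feature_vector(bands):
    """Build [W1, FWHM1, A1, ..., W7, FWHM7, A7] from fitted bands.

    `bands` is a list of (wavenumber, fwhm, area) tuples, e.g. the 18 bands
    of one deconvoluted spectrum.
    """
    features = []
    for low, high in COMMON_INTERVALS:
        inside = [band for band in bands if low <= band[0] < high]
        if not inside:
            raise ValueError(f"no fitted band in interval {low}-{high} cm-1")
        # Assumption: when several bands share an interval, keep the largest.
        wavenumber, fwhm, area = max(inside, key=lambda band: band[2])
        features.extend([wavenumber, fwhm, area])
    return features
```

For example, a spectrum with bands at 1006, 1313, 1360, 1450, 1585, 1608 and 1662 cm−1 (plus any number of bands outside the intervals) produces a vector of 21 values, ordered by interval.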

In the 1500–1700 cm−1 fingerprint area, the bands assigned to nucleic acids (adenine and guanine) by Diehn et al. (2020) were found mostly at ≈ 1580–1590 cm−1 for trees and at approximately 1565 cm−1 in the grass species. The exception is Anthoxanthum odoratum, with a peak at ≈ 1530 cm−1, assigned to carotenoids (Diehn et al. 2020). This is the band with the greatest heterogeneity of peak values.

The tree spectra also present a well-defined band in the interval [1600–1615 cm−1], flanked by 1 or 2 less intense bands (at 1525–1600 and 1650–1665 cm−1), one before and one after the main peak, which most often appear as more or less defined shoulders. In grasses, 2, 3 or 4 bands of medium to low intensity are observed in the same region. The most intense band in all tree spectra lies around 1608 cm−1 and has been assigned to mitochondrial activity and also to the ferulic acid and coumaric acid building blocks of sporopollenin (Diehn et al. 2020). This band is present in all pollen species analyzed except Lolium perenne, where the most intense peak appears at ≈ 1600 cm−1 and has been assigned to phenylalanine and tyrosine (Guedes et al. 2014) or to ring stretches of phenyl structures (Ivleva et al. 2005).

A common band observed in the interval [1650–1665 cm−1] was the one at ≈ 1662 cm−1, which has been assigned to vibrations of proteins (Diehn et al. 2020; Schulte et al. 2008). It is present in all pollen species with the exception of Artemisia vulgaris pollen, where a band is observed at 1650 cm−1 that may be assigned to the amide I system (C=O) (Guedes et al. 2014; Ivleva et al. 2005).

The fingerprint area between 1300 and 1460 cm−1 can be considered characteristic of grass species. In this region, compared with other species, grasses show a wide, high-intensity band in the interval [1340–1375 cm−1], with most grass species presenting the peak at ≈ 1370 cm−1; the exception is Dactylis glomerata. For tree species, this area is quite different, showing a set of smaller bands around 1360 cm−1 that can be assigned to nucleic acids (adenine and guanine) (Diehn et al. 2020). Also in this fingerprint area (1300–1460 cm−1), 2 peaks are found at ≈ 1450 cm−1 (in all species) and at ≈ 1313 cm−1 (in all trees and in Dactylis glomerata pollen), which correspond, respectively, to a deformation mode of CH2 groups of aliphatic carbon chains (Guedes et al. 2014) and to the ferulic acid and coumaric acid building blocks of sporopollenin. In the other grass species, the latter peak is shifted to ≈ 1322 cm−1, which is associated with carbohydrates (Diehn et al. 2020).

Finally, in a third fingerprint area (1000–1010 cm−1) a band at ≈ 1006 cm−1 is characteristic of all pollen and can be assigned to carotenoids (Diehn et al. 2020; Schulte et al. 2008).

3.2 Classification analysis

The complexity and variability of Raman spectra between distinct pollen types also enable their identification and classification, by applying data analysis, down to taxonomic levels that often cannot be discriminated by humans under light microscopy (Kraaijeveld et al. 2015; Mondol et al. 2019).

Our study assessed whether the Raman parameters of the seven band intervals common to all pollen species are sufficient for the classification process, an approach different from what has usually been done in other studies, which use the information of the full or a reduced spectral range. We must highlight that this was a preliminary proof of concept for the methodological approach, using a small spectral data set.

The classification potential was evaluated in three combinations: the data set with all 15 tested species and, separated by plant type, only tree species and only grass species. The best classification of these data sets was achieved when using the Raman parameters wavenumber (W), full width at half maximum of the band (FWHM) and integrated intensity (A) (Table 2). In our study, the R_area (pondered area) parameter did not improve the classification.

Table 2 Confusion matrix resulting from the SVM analysis, in the test step, of the Raman parameters (wavenumber, full width at half maximum of the band and integrated intensity) of the 7 common band intervals from the Raman spectra of the pollen from 15 plant species

The classification performance using all pollen species was very high: it was perfect in the training step, with a classification accuracy (CA) and a precision of 100%, while in the testing step 14 of the 15 pollen species were correctly assigned (precision of 90% and CA of 93.3%). The exception was Salix atrocinerea pollen, which was misclassified as Acer negundo. It was possible to distinguish between pollen from Betula pendula and Corylus avellana, 2 taxa that belong to the same family and have very similar morphologies, which can pose classification challenges for methods based on image processing (Sauvageat et al. 2020).
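The two test-step figures can be reproduced from the single misclassification reported above. The sketch assumes that precision is averaged over the 15 classes and that a class that is never predicted contributes zero; how Orange computes the average is not stated in the text, but under this assumption the reported values are recovered. The 13 `species_NN` labels are placeholders for the species not involved in the error.

```python
def accuracy(y_true, y_pred):
    """Share of test spectra assigned to their true species."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)

def macro_precision(y_true, y_pred):
    """Per-class precision (correct / assigned), averaged over all classes;
    a class that is never predicted contributes 0 (assumption)."""
    classes = sorted(set(y_true))
    total = 0.0
    for c in classes:
        assigned = [t for t, p in zip(y_true, y_pred) if p == c]
        if assigned:
            total += sum(1 for t in assigned if t == c) / len(assigned)
    return total / len(classes)

# One test spectrum per species, with Salix atrocinerea predicted as Acer negundo.
y_true = ["Salix atrocinerea", "Acer negundo"] + [f"species_{i:02d}" for i in range(1, 14)]
y_pred = list(y_true)
y_pred[0] = "Acer negundo"

print(f"CA = {accuracy(y_true, y_pred):.1%}")                # CA = 93.3%
print(f"precision = {macro_precision(y_true, y_pred):.1%}")  # precision = 90.0%
```

The single error lowers precision more than accuracy because it affects two classes at once: Acer negundo receives one wrong assignment (precision 1/2) and Salix atrocinerea is never predicted (precision 0).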

Comparing our classification accuracy with those obtained by other authors using the full or a reduced spectral range for pollen discrimination (Diehn et al. 2020; Ivleva et al. 2005; Zimmermann and Kohler 2014), the Raman parameters appear to be a good alternative for pollen classification, but it must be kept in mind that our spectral data set is small. As observed by Schulte et al. (2008), pollen taxa related at the genus and family level present chemical similarities, which are indicative of both phylogenetic relationship and mating behavior; even so, in our study it was possible to discriminate between the distinct pollen species.

Using the information of only seven common band intervals, we are able to reduce the volume of data necessary to classify the pollen species as well as the time of analysis and spectra acquisition due to the small spectral region studied.

Table 3 shows the confusion matrix for the tree pollen species classification corresponding to the test step.

Table 3 Confusion matrix resulting from the SVM analysis, in the test step, of the Raman parameters (wavenumber, full width at half maximum of the band and integrated intensity) of the 7 common band intervals from the Raman spectra of the pollen from tree species

The CA for trees declined a little compared with the analysis using all species, but the majority of species were accurately classified. The exceptions were Salix atrocinerea, which remained misclassified as Acer negundo, and Alnus glutinosa, which was now classified as Quercus robur. For this analysis we obtained a CA of 77.8% and a precision of 66.7%. As in the SVM analysis of all species, the CA and precision in the training step were 100%.
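These two values can be checked by hand. The worked arithmetic below assumes that the tree subset contains nine species with one test spectrum each (inferred from 7/9 ≈ 77.8%, not stated explicitly) and that precision is averaged over classes, with a class that is never predicted contributing zero.

```latex
\mathrm{CA} = \frac{7}{9} \approx 77.8\%,
\qquad
\text{precision} = \frac{1}{9}\Bigl(
  \underbrace{5 \times 1}_{\text{classes with only correct assignments}}
  + \underbrace{2 \times \tfrac{1}{2}}_{\textit{Acer negundo},\ \textit{Quercus robur}}
  + \underbrace{2 \times 0}_{\textit{Salix atrocinerea},\ \textit{Alnus glutinosa}}
\Bigr) = \frac{6}{9} \approx 66.7\%
```

Each misclassification penalizes two classes, the one that receives the wrong spectrum and the one that is never predicted, which is why two errors in nine spectra reduce the averaged precision by a third.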

It is interesting to observe that Alnus pollen was correctly discriminated when all studied species were considered. When we use only the Raman parameters of the common bands in tree pollen, we are distinguishing among more similar spectra. The Raman spectrum of Quercus robur pollen in the 1300–1460 cm−1 fingerprint area is much more similar to the grass spectra; therefore, when grasses are included in the training step, Q. robur is set further from the other tree species. Additionally, in the same fingerprint area, the Salix atrocinerea and Acer negundo spectra are very similar, and these similarities can explain the decline in CA.

When only grass species are tested, the classification renders the best performance with all species being correctly classified in both the training and testing steps (Table 4), with a CA and precision value of 100%.

Table 4 Confusion matrix resulting from the SVM analysis of the Raman parameters (wavenumber, full width at half maximum of the band and integrated intensity) of the 7 common band intervals from the Raman spectra of the pollen from 5 grass species

Using high-throughput screening Raman spectroscopy (HTS-RS), Mondol et al. (2019) applied PCA-SVM to the Raman fingerprint region (758–1800 cm−1) of pollen from 15 genera belonging to the Poaceae family. The prediction performance among Poaceae genera was high (around 79% accuracy and a sensitivity of 80%), but the number of pollen grains per species analyzed was much higher than in our study, which may explain their lower classification performance compared with ours. In our study, we tested a small data set to ascertain whether the Raman parameters are sufficient for the classification process.

We also tested whether the Raman peaks observed for Poaceae pollen species in the fingerprint area 1300–1460 cm−1, which show distinct spectral features among the tested species, would be enough for a correct classification among them. They were not sufficient for a good classification: in the training step the accuracy was 80% and the precision 83.3%, with A. odoratum misclassified as H. lanatus and H. lanatus as L. ovatus.

Distinction among airborne Poaceae genera, or even species if possible, is important for pollen-related allergy issues. Grasses are among the most common and most allergenic species, and their wide distribution around the globe, together with their number of species, causes allergic reactions in many susceptible individuals (García-Mozo 2017). Not all grass species induce allergies, but a few genera, like Lolium spp., Dactylis spp., Anthoxanthum spp. and Phleum spp., among others, are the most allergenic ones (Brennan et al. 2019; García-Mozo 2017). With an extensive flowering season that lasts around 4–5 months, between March–July and September, and with several annual peaks in airborne pollen concentration, grass pollen allergy sufferers face months of symptoms (Ribeiro and Abreu 2014). Presently, the grass airborne pollen season is not broken down by genus or species, and the pollen season of the most allergenic ones may overlap with the season of other pollen types, which may enhance the reaction of allergic individuals (García-Mozo 2017). The importance of identifying the different airborne pollen contributors therefore becomes clear, and among the Poaceae it is a real challenge to define exactly the traits of the flowering seasons, beginning and ending, for the different genera (Brennan et al. 2019). In fact, the morphological similarities among airborne pollen from the different Poaceae genera make their distinction almost impossible. Features such as number of apertures, shape and texture are quite similar, posing great analytical challenges to image processing algorithms (Ronneberger et al. 2002), although Poaceae pollen morphology is so typical that it is easily distinguished from other airborne non-Poaceae pollen. In our study, it was possible to distinguish between 5 different species of Poaceae by using the Raman parameters of only 7 common band intervals from the full pollen Raman spectrum.

In our study, we also tested the contribution of each Raman parameter or their different combinations in pollen classification (Fig. 3).

Fig. 3 Contribution of each Raman parameter (wavenumber (W), full width at half maximum of the band (FWHM) and integrated intensity (A)) or their combination (graphic on the right) and each of the 7 common band intervals (graphic on the left) in the pollen classification performance. Values of the precision obtained using SVM analysis in Orange software with the same configuration used in the classification of the species. Both train and test sets of data were analyzed

It was observed that the combination of two or more parameters gave better results than using only a single parameter. In Fig. 3 (left side), we can see the precision values obtained in the training and testing steps.

Considering each Raman parameter alone, the integrated intensity (A) is the one that contributes least to the classification (training: 59%; testing: 37%). The wavenumber (W) and the full width at half maximum of the band (FWHM) showed equal precision in the testing step (63%), but the FWHM achieved the best classification in the training step (90%).

When the parameters are combined in groups of 2, W + FWHM gave better performance than the other combinations. However, only the combination of all parameters allows the best classification, with a precision of 100% in the training step and 90% in the test step.

Finally, to find out whether all 7 common band intervals are important for discriminating the pollen of the 15 studied species, and thereby to avoid overfitting of the classification algorithm, we tested removing the parameter data of one band interval at a time without any other change to the remaining data. All band intervals proved important for the correct classification of the pollen species: CA was very low in the training step, and even lower in the test step, when any interval was removed (Fig. 3).
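The two ablation experiments described above (subsets of Raman parameters, and removal of one band interval at a time) amount to selecting columns of the 21-feature matrix before repeating the classification with unchanged settings. The sketch below shows only the column selection; the row layout `[W1, FWHM1, A1, ..., W7, FWHM7, A7]` and the function names are assumptions, and the retraining step is left out.

```python
from itertools import combinations

PARAMETERS = ("W", "FWHM", "A")
N_BANDS = 7

def parameter_subsets():
    """All non-empty combinations of the three Raman parameters (7 in total)."""
    return [subset
            for size in range(1, len(PARAMETERS) + 1)
            for subset in combinations(PARAMETERS, size)]

def keep_parameters(row, subset):
    """Keep only the columns of the chosen parameters, for every band."""
    keep = [PARAMETERS.index(name) for name in subset]
    return [row[band * len(PARAMETERS) + k] for band in range(N_BANDS) for k in keep]

def drop_band(row, band):
    """Remove the three parameter columns of one band interval (0-based index)."""
    start = band * len(PARAMETERS)
    return row[:start] + row[start + len(PARAMETERS):]
```

Applying `keep_parameters` with each of the 7 subsets gives the single-parameter and combined-parameter experiments, while `drop_band` with indices 0 to 6 gives the seven reduced matrices of 18 features each.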

One interesting observation was that when only the last band interval [1650–1670 cm−1] is removed, the discrimination in the training step is negatively affected, but to a smaller extent (precision of 94%) than when any of the other intervals is removed.

The results shift from no misclassification to 50% wrong classification for a few species, all of them trees (Alnus glutinosa pollen misclassified as Betula pendula, Salix atrocinerea as Acer negundo, Corylus avellana as Fraxinus angustifolia), while it remains possible to distinguish between tree and grass species. This behavior was not observed when each of the other intervals was removed in turn; in those cases, grasses were misclassified as trees and vice versa. So, the band interval [1650–1670 cm−1] seems to contribute to the classification between trees and grasses as well as among the different tree species. This band has been assigned to the vibrations of proteins (Diehn et al. 2020; Schulte et al. 2008), and the differences may be due to distinct molecular conformations arising from the distinct protein content of each species.

All parameters are important in the classification. The wavenumber values are among the most important, although this parameter alone can be tricky. The calibration of the equipment is basically a wavelength calibration which, if not performed routinely, can induce differences in this parameter. Zimmermann et al. (2014) described small shifts in the wavenumber position, even in pollen spectra of the same species from different geographical regions. In fact, in our study, the performance of this parameter alone in the test group is not good.

The methodology proposed in our study could be a promising approach for Raman-based automatic pollen classification; however, one drawback is the small data set used in the training and testing of the classification algorithm. It would be interesting to test, in the future, the efficiency of the 7 common band intervals in discriminating between the studied pollen types using a high-throughput analysis methodology.

4 Conclusion

Our study focused on testing the possibility of using the Raman band parameters wavenumber (W), full width at half maximum of the band (FWHM) and integrated intensity (A), instead of the whole spectrum, for pollen classification. All parameters are important in the classification, with the wavenumber and the FWHM contributing the most.

The results show that it is possible, using the Raman parameters of 7 band intervals common to all pollen types, to achieve a successful classification of different pollen species. Fourteen out of 15 pollen species were discriminated, including some that are morphologically very difficult or even impossible to identify by the human eye, e.g., 5 Poaceae species and 2 species of Betulaceae, Betula pendula and Corylus avellana.

It would be interesting to further test the proposed methodology using a larger number of species, including fresh pollen and more Poaceae species, and to determine the minimal acquisition time that still achieves a precise classification.