1 Introduction

Airborne pollens form just a small fraction of the atmospheric bioaerosol loading (Pope et al., 2010; Huffman et al., 2012; Pöhlker et al., 2013), but despite their low number concentrations these particles are of significant importance because of their impact on human health (Rönmark et al., 2009; Zuberbier et al., 2014; Gilles et al., 2020), agriculture and silviculture (Dhlamini et al., 2005; Isard et al., 2011), as well as on climate through their role in the hydrological cycle (Frohlich-Nowoisky et al., 2014). Pollen monitoring networks exist in most countries, providing information to a range of end-users from allergy sufferers and their doctors through to researchers. The current standard used across these networks is based on manual technology developed in the early 1950s (Hirst et al., 1952; Galán et al., 2014; CEN/EN 16868:2019, 2019) that is both time-consuming and laborious. Furthermore, this measurement technique suffers from several shortcomings, including the fact that data are delivered at low temporal resolution (usually daily averages) with a delay of between 3 and 9 days from the time of observation.

New technologies developed over the past few years have made it possible to automatically measure pollen at high temporal resolutions and in real time (Crouzy et al., 2016; Kawashima et al., 2017; Sauliene et al., 2019; Oteros et al., 2020; Sauvageat et al., 2020; Tesendic et al., 2020). The provision of such observations is dramatically transforming the information that can be provided to end users, particularly allergy sufferers and medical practitioners, ensuring better diagnoses, treatment, and development of new immunotherapies and medication. Furthermore, numerical models can integrate these real-time measurements, as is currently done for meteorological parameters, potentially improving pollen forecasts considerably (Sofiev, 2019). Combined, this could help to reduce the large burden on the public health system, currently estimated to cost between €50 and €150 billion per year in Europe (Wickham et al., 2019) and only likely to increase as the proportion of allergy sufferers continues to grow. The availability of high temporal resolution observations of pollen and fungal spores is also opening up a number of new research avenues, for example, to better understand the sub-daily relationship between airborne pollen concentrations and meteorology (Chappuis et al., 2019) or between allergic patient symptoms and exposure levels.

Several automatic pollen monitors have been developed and tested over the past few years (Kawashima et al., 2007; Crouzy et al., 2016; O’Connor et al., 2013, 2014b; Sauliene et al., 2019; Oteros et al., 2020). While all of these instruments have been validated individually against the manual Hirst method, to date no systematic evaluation has been made to fully understand the capabilities of these technologies. Here we present the results from a campaign carried out in Payerne, Switzerland, where all automatic pollen monitors available on the market at the time were tested in parallel. The campaign was run during the 2019 pollen season with observations from all instruments available from 19 April to 31 May.

This paper provides a brief overview of the instruments used, a description of the test site and the data processing applied, and then a discussion of how the instruments perform in terms of total pollen counts. Individual pollen taxa are not investigated because not all instruments can provide this information, and the taxa recognised are not always the same among the devices that can. Finally, we summarise the results and provide insight into possible future developments in the domain of automatic pollen monitoring.

2 Material and methods

2.1 Measurement site

The 2019 measurement campaign was carried out in Payerne, Switzerland (46°48′48″ N, 6°56′35″ E), which is located in a rural area on the Swiss plateau (490 m above mean sea level) and is exposed to a wide variety of pollen commonly found in central Europe. All instruments, manual and automatic, were placed on the roof of the MeteoSwiss building at a height of 7 m above ground level within metres of each other (see Fig. 1 in Chappuis et al., 2019, for an image of the site).

Fig. 1

Daily average total pollen counts for the period of study: 19 April–31 May 2019. The time series were scaled to match the mean of the Hirst measurements (i.e. ratio mean Hirst:mean device X) to better compare the pollen season peaks measured by each instrument
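The scaling described in this caption can be sketched as follows. This is a minimal Python illustration rather than the published R code; the function name `scale_to_hirst_mean` and the sample values are hypothetical, and missing values are assumed to have been removed beforehand.

```python
from statistics import mean

def scale_to_hirst_mean(device, hirst):
    """Rescale a device series so that its mean equals the Hirst mean."""
    factor = mean(hirst) / mean(device)  # ratio mean Hirst : mean device X
    return [value * factor for value in device]

# Hypothetical daily totals (pollen grains/m3), not measured data
hirst = [40.0, 120.0, 80.0]
device = [10.0, 50.0, 30.0]
scaled = scale_to_hirst_mean(device, hirst)
```

After scaling, the device series has the same mean as the Hirst series, so only the shape and timing of the peaks differ between curves.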

2.2 Instruments used

Several instruments that can identify biological aerosol particles existed on the market in 2019 (Huffman et al., 2019; Sauliene et al., 2019). In this study we tested devices from five manufacturers that are able to identify airborne pollen in real time: the Droplet Measurement Technologies WIBS-NEO, the Hund-Wetzlar BAA-500, the Plair Rapid-E, the Swisens Poleno, and the Yamatronics KH-3000. These instruments were compared with the current manual standard measurement technique, which uses a Hirst-type trap. All are described further in this section. In all cases, total pollen concentrations (i.e. number of pollen grains/m3) were considered.

2.2.1 Hirst-type traps

Two volumetric Hirst-type samplers (Hirst et al., 1952), the current standard used in aerobiological networks across the world (CEN/EN 16868:2019, 2019; Galán et al., 2014), were operated in parallel as reference for the automatic instruments. These samplers each contain a rotating drum which collects particles by impaction by pumping air through the instrument at a nominal rate of 10 L/min. Later tests using a resistance-free flow-meter showed that the real flow rate was on average 13.5 L/min, so all values were corrected by a factor of 1.35 to overcome this bias (i.e. the total pollen concentrations were decreased by 35%). Pollen counts were carried out following the standard MeteoSwiss operational monitoring procedures using an Olympus BX45 optical microscope at 400 × magnification. Two longitudinal lines were counted, covering an area of 5.1% of the total surface of the slides. While this is below the recommended minimum of 10% proposed by the European standard EN 16868:2019, a recent study has shown that even such recommendations may not be sufficient to capture all peaks observed when the full sample is counted (Mimic & Sikoparija, 2021). Total pollen was obtained by summing all individual taxa counted (see Table S1).

2.2.2 Droplet Measurement Technologies WIBS-NEO

The WIBS (Wideband Integrated Bioaerosol Spectrometer) has gone through several different versions and has been used previously for a wide variety of research studies across the globe (e.g., Kaye et al., 2007; Foot et al., 2008; Gabey et al., 2010; Healy et al., 2014; O’Connor et al., 2014a; Feeney et al., 2018; Forde et al., 2019; Duflot et al., 2019; Daly et al., 2019). The WIBS-NEO is the latest in the series of WIBS instruments and was developed and manufactured by Droplet Measurement Technologies (DMT). The instrument uses a 635 nm laser to determine optical particle size and asymmetry, while two xenon lamps, which are triggered by the sizing laser signal and filtered at 280 and 370 nm, excite fluorescence. Two wide-band detection channels (310–400 nm and 420–650 nm) are then used to detect three fluorescence bands which can finally be used to identify various particle types (e.g. Forde et al., 2019). The instrument functions with a sheath flow rate of 2.1 L/min of which 0.3 L/min is sampled. A more detailed description of the WIBS-4, upon which the WIBS-NEO is based, can be found elsewhere (Gabey et al., 2010; Healy et al., 2012).

For the WIBS-NEO measurements, particles were considered to be fluorescent if emission values in any of the three WIBS channels (FL1, FL2 or FL3) exceeded their predetermined baseline thresholds. This fluorescence threshold was calculated by placing the WIBS instrument into what is termed “Force Trigger” mode. This mode effectively causes the xenon flash lamps to fire directly on the optical chamber without particles present and is used to estimate the mean baseline fluorescence intensity in each channel. This value, plus 3 times the standard deviation in this mode, was then utilised to differentiate between fluorescent and non-fluorescent particles for ambient sampling. Fluorescent particles were then categorised utilising Perring nomenclature (Perring, 2015) and particles of the ABC type (i.e. particles which fluoresce in all channels) were further filtered by optical diameter (Dopt < 20 µm discarded) in an effort to target larger fluorescent biological particles which are potentially pollen. Such an approach has previously shown good results in comparison with manual measurements (O’Connor et al., 2014a). Hourly values were obtained by summing the number of particles detected per hour and converting using the known volume of air sampled.
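The thresholding and filtering steps described above can be sketched as follows. This is a minimal Python illustration, not the processing code used in the study: the function names, the dictionary layout of a particle record, and the numerical values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def fluorescence_threshold(forced_trigger_values):
    """Baseline mean plus 3 standard deviations for one channel.

    Assumption: the sample standard deviation is used.
    """
    return mean(forced_trigger_values) + 3 * stdev(forced_trigger_values)

def is_pollen_candidate(particle, thresholds, min_diameter_um=20.0):
    """ABC-type particle (fluorescent in FL1, FL2 and FL3) at or above the size cut-off."""
    fluorescent_in_all = all(
        particle[ch] > thresholds[ch] for ch in ("FL1", "FL2", "FL3")
    )
    return fluorescent_in_all and particle["d_opt_um"] >= min_diameter_um

# Hypothetical "Force Trigger" readings, reused for all three channels
baseline = [10, 12, 11, 9, 13]
thresholds = {ch: fluorescence_threshold(baseline) for ch in ("FL1", "FL2", "FL3")}

large_abc = {"FL1": 40, "FL2": 35, "FL3": 50, "d_opt_um": 25.0}
small_abc = {"FL1": 40, "FL2": 35, "FL3": 50, "d_opt_um": 8.0}
large_ac = {"FL1": 40, "FL2": 12, "FL3": 50, "d_opt_um": 25.0}
```

Only the first hypothetical particle passes both criteria; the second is too small and the third does not fluoresce in FL2.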

2.2.3 Hund-Wetzlar BAA-500

The Hund-Wetzlar BAA-500 essentially automates the current standard manual process, with air being pumped through the instrument and samples being collected on glass slides which are then analysed using an optical microscope and image analysis system. Eight focal positions are scanned through the vertical and usually approximately 25% of the slide’s surface area is sampled. Air is pumped through the device at a rate of 100 L/min, with the pump being switched on for one minute out of every ten. The sampling mechanism excludes most smaller particles (aerodynamic diameter < 10 µm) using a virtual impactor to ensure the slides are kept as clean as possible to make later image analysis quicker. Live monitoring data can be labelled manually to improve the performance of the analysis software. The device is currently used operationally in the ePIN network in the German state of Bavaria and has been shown to be able to identify a number of pollen taxa and some spores (Oteros et al., 2020). The pollen concentrations for individual taxa were obtained from the commercial software installed on the device. Total pollen was obtained by summing all pollen taxa identified (see Table S2). This was the only device tested for which the entire system, instrument and software, was provided by the manufacturer. The BAA-500 was located slightly more than 1 m away from the other samplers to avoid possible perturbation of the flow by the strong inlet suction. The inlet velocity at the opening of the Sigma 2 (used for all the other devices) is in all cases smaller than 0.1 m/s, and thus unlikely to generate a significant airflow perturbation for the other instruments located approximately within 1 m of each other.

2.2.4 Plair Rapid-E

The Plair Rapid-E is an airflow cytometer that uses a 405 nm laser to provide time-resolved scattering patterns that are detected by 24 detectors located at different angles (± 45°); the time resolution of the scattering pattern acquisitions is 1 µs. In addition to the scattering, a UV laser (337 nm) induces fluorescence, which is measured across 32 channels distributed within a spectral range of 350–800 nm (eight sequential acquisitions at 0.5 µs interval). Finally, fluorescence lifetime is recorded for four bands at nanosecond resolution. Air is sampled through the instrument at a rate of 2.8 L/min (Kiselev et al., 2011, 2013). The instrument and its predecessor (PA300) have been used to identify several pollen taxa (Crouzy et al., 2016; Sauliene et al., 2019). The Rapid-E has mainly been used for research purposes as well as for a small operational monitoring network between Serbia and Croatia (Tesendic et al., 2020). In this study, we use an algorithm similar to that described in Crouzy et al. (2016) to extract a time series of 1-h average total pollen concentrations. While the identification of individual pollen taxa requires the heavier machinery of supervised learning techniques (convolutional neural networks), a physically based approach is sufficient to distinguish pollen from other particles. To do this, we applied thresholds on particle size, which is estimated to be proportional to the square root of the total scattered light in the geometric regime (integral over time and over the 24 scattering angles). We then restricted the selection to particles with a bimodal induced-fluorescence spectrum, with thresholds on the intensity of the two modes and well-defined windows for their position (see Pöhlker et al., 2013). Hourly averages were obtained by aggregating the one-minute data obtained.
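The size criterion described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the integral over time and scattering angle is approximated by a plain sum over acquisitions and detectors, the proportionality constant is omitted, and the function names, threshold values, and sample signal are hypothetical. The fluorescence criteria are not shown.

```python
from math import sqrt

def size_proxy(scattering):
    """Square root of the total scattered light.

    `scattering` is a list of acquisitions, each holding one value per detector;
    the sum over both stands in for the integral over time and angle.
    """
    total = sum(sum(frame) for frame in scattering)
    return sqrt(total)

def passes_size_threshold(scattering, lower, upper):
    """True if the size proxy lies within the (hypothetical) thresholds."""
    return lower <= size_proxy(scattering) <= upper

# Hypothetical signal: 4 acquisitions x 24 detectors, arbitrary units
signal = [[1.0] * 24 for _ in range(4)]
proxy = size_proxy(signal)
```

Because the proxy is only proportional to particle size, any thresholds would have to be calibrated against particles of known diameter.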

2.2.5 Swisens Poleno

The Swisens Poleno is an airflow cytometer that identifies pollen and other airborne particles using digital holography and fluorescence. A pump sucks air through the instrument at a rate of 40 L/min and a virtual impactor is used to concentrate particles that have aerodynamic diameters larger than 10 µm. Two holography cameras at 90° to each other take digital images at a resolution of 0.5 µm which are then used to reconstruct two in-focus images of each particle. A light-emitting diode is also used to excite fluorescence at 280 and 365 nm. To date, just the images have been used together with convolutional neural networks (CNN) to identify eight different pollen taxa (Sauvageat et al., 2020). The Swisens Poleno is currently being installed in Switzerland as part of the national pollen monitoring network. For the purposes of this study two prototype devices were tested, with several modifications to the measurement systems being carried out during the campaign. This led to a number of periods where data were not available. The pollen classification algorithm used was developed in-house and based on the two-step software described by Sauvageat et al. (2020). The first step distinguishes between pollen and non-pollen particles by applying an ellipse-fitting algorithm to the two holographic images obtained for each particle, with at least one of the two values obtained needing to meet a strict criterion and the second not deviating excessively from this value. All such elliptical particles are considered as pollen; this method was shown to have an overall accuracy of 96% (Sauvageat et al., 2020). The second step uses a convolutional neural network to classify pollen into different taxa, which were then summed to obtain total pollen. As for the Plair Rapid-E, the algorithm was not designed specifically for total pollen but rather to identify a certain number of individual pollen taxa. Hourly values were obtained by summing the number of particles detected per hour and converting using the known volume of air sampled.
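The conversion from an hourly particle count to a concentration can be sketched as follows. This is a minimal Python illustration, assuming the full nominal flow is analysed for the whole hour; that assumption may not hold for instruments that use a virtual impactor or analyse only part of the flow. The function name and the example count are hypothetical.

```python
def hourly_concentration(particle_count, flow_l_per_min):
    """Convert an hourly particle count to a concentration in particles/m3."""
    sampled_volume_m3 = flow_l_per_min * 60 / 1000  # litres per hour -> m3
    return particle_count / sampled_volume_m3

# Hypothetical example: 240 particles in one hour at a nominal 40 L/min
concentration = hourly_concentration(240, 40.0)
```

At 40 L/min the nominal sampled volume is 2.4 m3 per hour, so 240 counted particles correspond to about 100 particles/m3.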

2.2.6 Yamatronics KH-3000

The Yamatronics KH-3000 device uses a 780 nm laser to produce forward- and side-scattering signals of all particles that pass through the system at a flow rate of 4.1 L/min (Kawashima et al., 2007). The detected signals are used to estimate optical diameters and particle counts, as fully described in Kawashima et al. (2007). The device has been used across the “Hanakosan” network for national pollen monitoring in Japan since 2002. It is a highly robust, low-cost instrument that functions effectively in the particular conditions relevant for Japan, where the dominant allergenic pollen are Cryptomeria japonica and Cupressaceae, both emitted in late winter before most other taxa. Although attempts have been made to distinguish between other pollen species in other regions, this has so far not been achieved with a high degree of accuracy (Kawashima et al., 2017). Two different devices were used, a pulse and a regular KH-3000 (hereafter termed KH-3000-A and KH-3000-B, respectively). The pulse-type (KH-3000-A) instrument outputs sideward and forward scattering intensities in mV; an extraction window, which can be adapted, is then applied to these data during analysis to obtain pollen concentrations. The regular device (KH-3000-B) is used operationally and the extraction window is fixed in advance, with only the number of particles that fall into this window being reported. Hourly averages were obtained by aggregating the one-minute data obtained.
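The extraction-window step for the pulse-type device can be sketched as follows. This is a minimal Python illustration that assumes a rectangular window in the forward- and side-scattering plane; the actual window shape and limits are not given here, so the function names and all numerical values are hypothetical.

```python
def in_extraction_window(forward_mv, side_mv, window):
    """True if both scattering intensities fall inside the rectangular window."""
    f_lo, f_hi = window["forward"]
    s_lo, s_hi = window["side"]
    return f_lo <= forward_mv <= f_hi and s_lo <= side_mv <= s_hi

def count_pollen(signals, window):
    """Count particles whose (forward, side) intensities lie in the window."""
    return sum(in_extraction_window(f, s, window) for f, s in signals)

# Hypothetical window limits and particle signals, all in mV
window = {"forward": (100.0, 400.0), "side": (50.0, 200.0)}
signals = [(150.0, 80.0), (90.0, 80.0), (300.0, 250.0), (399.0, 199.0)]
n_pollen = count_pollen(signals, window)
```

With an adaptable window the same raw signals can be re-analysed with different limits, whereas a window fixed in advance yields only the final count.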

2.2.7 Statistical analyses

Data preparation was carried out in R (R Core Team, 2020) using tidyverse packages (Wickham et al., 2019), with data from the various devices obtained in raw form and converted into hourly or six-hourly total pollen concentrations, depending on the instrument. The hourly concentrations were then aggregated into 6-hourly and daily averages to investigate differences between devices. At least 3 out of 6 h or 12 out of 24 h needed to have data for an average to be calculated for the 6-hourly or daily averages, respectively. Malfunction or maintenance of all devices was logged and, to ensure a fair evaluation of all instruments, all time periods with missing data, no matter the reason, were removed from the time series of all devices. In total, this meant that 23.8% of all hourly averages, 22.1% of all 6-hourly averages, and 18.6% of all daily averages were removed from the dataset (Table 1). Generally, each instrument in the analysis represents one measurement device, except for the manual measurements, where data from the two Hirst-type traps were averaged to obtain a more robust reference against which the other devices were compared. For the statistical analysis all time steps where the Hirst average was below 10 pollen grains/m3 were excluded from the data set. Pearson and Spearman correlations were calculated using these data.
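The aggregation and missing-data rules described above can be sketched as follows. The published analysis was carried out in R; this is only a minimal Python illustration with hypothetical function names and values, in which missing hours are represented by `None` and an incomplete final block is treated like any other block.

```python
from statistics import mean

def block_average(hourly, block_hours, min_valid):
    """Average consecutive blocks; return None where too few hours have data."""
    averages = []
    for start in range(0, len(hourly), block_hours):
        valid = [v for v in hourly[start:start + block_hours] if v is not None]
        averages.append(mean(valid) if len(valid) >= min_valid else None)
    return averages

def keep_common_periods(series_by_device):
    """Set a time step to None for every device if any device lacks data there."""
    n = len(next(iter(series_by_device.values())))
    complete = [
        all(s[i] is not None for s in series_by_device.values()) for i in range(n)
    ]
    return {
        name: [v if ok else None for v, ok in zip(s, complete)]
        for name, s in series_by_device.items()
    }

# Hypothetical hourly values for one device (12 hours)
hourly = [10, None, 30, None, None, 20, None, None, None, None, 5, 15]
six_hourly = block_average(hourly, block_hours=6, min_valid=3)
```

In this example the first block has three valid hours and is averaged, while the second has only two and is left missing. Daily averages would use `block_hours=24` and `min_valid=12`.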

Table 1 Percentage of data available after processing, first for removal of missing data and plotting purposes, then for the statistical analyses where all timesteps where the Hirst average was < 10 pollen grains/m3 were removed

After an initial residual analysis, the measurements were converted into logarithmic concentrations for statistical comparison. Even the log-concentrations did not fulfil the assumptions of standard statistical methods (i.e. assuming normality of errors with constant variance and mean zero). Hence, robust statistical methods were applied. The Kruskal–Wallis test (Kruskal & Wallis, 1952) is considered a rank-based omnibus test, evaluating whether the variance among the different instruments is greater than the unexplained variance (i.e. the variance within a data set from a particular device). If the resulting explained variance is low, one can assume that the devices are similar. This omnibus test was then followed up by multiple pairwise tests between the instruments. The pairwise comparison and simultaneous confidence interval for the estimated effects were calculated using the nparcomp package (Konietschke et al., 2015) using the Dunnett method, where the Hirst mean was chosen as the reference level. The resulting estimator can be interpreted as a proxy for the relative difference in median between two devices. If the estimator is > 0.5, then the second device tends to have higher values. The null hypothesis H0: p = 0.5 is assessed at the α = 5% level. The lower and upper bounds denote the confidence interval of the estimator.
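The interpretation of the estimator can be illustrated with the standard nonparametric relative effect, p = P(X < Y) + 0.5 P(X = Y), where X is drawn from the reference and Y from the device being compared. The following minimal Python sketch computes only this point estimate by direct pairwise comparison; it is not the nparcomp implementation, it does not produce the simultaneous confidence intervals, and the function name and sample values are hypothetical.

```python
def relative_effect(reference, device):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y), X from reference, Y from device."""
    wins = 0.0
    for x in reference:
        for y in device:
            if x < y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # ties count half
    return wins / (len(reference) * len(device))

# Hypothetical log-concentrations
hirst_mean = [1.0, 2.0, 3.0]
device_x = [2.0, 3.0, 4.0]
p_hat = relative_effect(hirst_mean, device_x)
```

A value above 0.5, as in this example, indicates that the second sample tends to be larger than the reference; identical samples give exactly 0.5.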

The full analysis code is freely available on GitHub: https://github.com/sadamov/trapcomparison.git

3 Results and discussion

3.1 Basic statistics

The number of instruments measuring pollen simultaneously for the 42-day study period varied depending on the temporal resolution. 18.6% of data were excluded because of missing values for the daily averages (Table 1) and this value increases to 23.8% for the hourly averages. Only a small percentage (2.3%) of daily values were further removed for the statistical analyses because the Hirst average fell below the cut-off value of 10 pollen grains/m3. Again, the percentage removed increased slightly for the hourly values, with an additional 13.6% being removed. This means that one or more of the instruments either did not observe pollen because they were being maintained, tests were being carried out, or sampling in the Hirst devices was insufficient to detect very low concentrations.

Several further basic statistical metrics were calculated for the study period (Table 2). Even for the daily averages, there are large differences across the instruments, with mean values ranging from 21 to 369 pollen grains/m3. Interestingly, the mean of the two Hirst instruments lies approximately in the middle of this range, at 96 pollen grains/m3 for the daily average. As expected, at higher temporal resolutions the mean values do not change considerably; however, the spread within measurements for each instrument, represented by the standard deviation (SD), increases. This is likely a result of the inherent noise in the data, or in other words, both because of the higher variability in actual pollen concentrations at the hourly level and because fewer values are taken into account to produce the averages at such higher temporal resolutions.

Table 2 Basic statistics for total pollen for each instrument and for the various temporal resolutions investigated

In terms of the correlation with the average of the two Hirst devices, the different instruments show a wide range of performances. The BAA-500 has high values (Table 2) which agree well with previous work from Oteros et al. (2015), who found a correlation coefficient of 0.98 for daily average total pollen. The values found in this study for both the WIBS-NEO and Rapid-E are considerably lower than in other studies, with values of 0.97 (O’Connor et al., 2014b) and 0.96 (Crouzy et al., 2016), for previous versions of the WIBS and Rapid-E, respectively. It is unclear why there are such large differences for these two instruments compared to previous studies. In contrast, the KH-3000-B performs better than reported by Kawashima et al. (2017), who found a correlation of 0.52 for daily average total pollen. The correlations for the two Poleno devices are likely low for two reasons: the first is that a smaller number of taxa are identified and thus probably some pollen are missed. The second, and most likely much more important reason, is the fact that the devices used during the 2019 campaign were still prototypes and underwent several changes throughout the campaign. This meant there were many gaps in the data and the instruments did not appear to perform as well as later versions of the device. No previous studies have been carried out comparing the Poleno device with manual Hirst counts, nor have any studies considered temporal resolutions below 24-h averages, so no comparison can be made against previous work. Nevertheless, the correlation coefficients drop markedly at higher temporal resolutions for all instruments. This may, however, be a result of the large error associated with Hirst counts at 6-hourly or hourly resolutions (Adamov et al., 2021).

3.2 Daily total pollen concentrations

As evident in the basic statistics, there is considerable variability between the instruments even if they all appear to capture the basic seasonal signal of both the tree (mainly April) and grass (mainly May) pollen seasons quite well (Fig. 1). The timing of the peak at the beginning of the study is captured by nearly all the devices that measured during this period; however, the two KH-3000 devices and the WIBS-NEO show peak values considerably larger than either of the Hirst instruments. Furthermore, the KH-3000-B indicates only one peak while the KH-3000-A shows a double peak with both peaks occurring later than the KH-3000-B. The WIBS-NEO shows a peak of similarly large magnitude but at least corresponds better in terms of the timing compared to the manual observations. The KH-3000-A also shows two anomalous peaks in the first two weeks of May, although it is not the only instrument to do so. Both Poleno devices have two large anomalous peaks in the latter half of May, although the difference compared to the Hirst observations is not as big as those shown by other instruments earlier in the season. Nevertheless, the majority of the devices agree relatively well with the Hirst traps for the rest of the season, although there is some divergence between them all and none match the manual observations very precisely.

The spread between instruments is further evident in Fig. 2. While the median daily average values are consistent with the Hirst measurements for four out of the seven automatic monitors, the KH-3000-B, Rapid-E, and WIBS-NEO show significantly lower median values. Robust statistical contrast tests indicate that overall these three instruments are significantly different from the average of the two Hirsts (Table 3). The KH-3000-B and the WIBS-NEO also exhibit somewhat less variability than the Hirst measurements, as seen in the smaller range covered by each of the boxes in Fig. 2. Interestingly, the two Polenos show quite different ranges of values, with the Poleno-4 presenting considerably higher values than the Poleno-1 or indeed compared to any other device. It is unclear what caused these differences; however, the instrument was still under development at the time of the campaign and there may have been differences between the two prototypes.

Fig. 2

Box plot of logarithmic daily average total pollen counts for the period of study: 19 April–31 May 2019

Table 3 Nonparametric robust contrast test with confidence intervals for daily average observations

The differences between automatic and manual measurements appear to be related to the total pollen concentrations. For lower concentrations (20–50 pollen grains/m3), there are larger differences both in terms of the median values and the spread in differences compared to the Hirst (Fig. 3; note that because logarithmic values are taken all differences are positive). The largest differences from the Hirst are observed for the KH-3000-A and the two Polenos, with all three devices showing median differences larger than 100%. The other four devices show smaller differences from the Hirst, but only the KH-3000-B and BAA-500 have relatively low median differences of 20% and 30%, respectively; all the others are well above this value. The relative differences between the automatic instruments and the Hirst average are smaller for the higher concentration groups (Fig. 3), with less variability between the instruments as well. Nevertheless, differences compared to the manual measurements are still relatively large, between 50 and 100% for the three different concentration groups above 50 pollen grains/m3. Note that for the daily averages there are relatively few total pollen concentration values greater than 100 pollen grains/m3; therefore these results should be considered with some caution.

Fig. 3

Box plots of daily average percentage differences from the Hirst mean. The total pollen counts are separated into concentration groups based on the Hirst-trap data (from top left to bottom right: 20–50, 50–100, 100–300, and > 300 pollen grains/m3, respectively)

The different size cut-offs and capacities of the identification algorithms potentially have an impact on the ability of each instrument to count pollen (and, although not discussed in this manuscript, also different pollen taxa for those able to do so). Nevertheless, the size cut-off for the majority of the instruments means that they should, at least in theory, be able to identify all pollen particles. The only exception is the WIBS-NEO for which a size cut-off of 20 microns was applied here. A comparison with a size cut-off of 10 microns in fact showed a worse result compared to the Hirst trap, likely indicating that other fluorescent particles were counted as pollen. Depending on the instrument, pollen fragments may or may not be counted; those that potentially might count them are the WIBS-NEO and the Plair Rapid-E, although this is unlikely since the particle size is also taken into account in the pollen identification algorithm and most pollen fragments can be expected to be smaller than the pollen.

3.3 Sub-daily total pollen concentrations

The higher sampling rate of most of the automatic instruments means it is possible to obtain observations at temporal resolutions below the daily averages usually available from the manual measurements. Figure 4 shows six-hourly averages for a selected week at the beginning of the campaign to illustrate what these high temporal resolution data look like. For the first three days of this period nearly all instruments show similar values to the two Hirst devices; however, from 22 April, when higher concentrations occur, there is much larger variability between devices. As for the daily averages, the two KH-3000 devices and the WIBS-NEO show anomalous peaks that do not correspond with the results from either of the Hirst devices, even if there are a few missing data from the Hirst-2 device on 23 April. The differences between the two KH-3000s are likely a result of the differences between the analysis algorithms applied to the data. There are also missing data from the two Polenos and the Rapid-E; however, these devices and the BAA-500 all perform relatively well compared to the Hirst devices.

Fig. 4

Six-hourly average total pollen counts for a selected week (19–25 April 2019) of the period of study. The time series were scaled to match the mean of the Hirst measurements (i.e. ratio mean Hirst:mean device X) to better compare the pollen season peaks measured by each instrument

Considering the entire study period, there is notably more spread in the Hirst six-hourly values compared to the daily averages (Fig. 5), which is in large part to be expected as there is likely to be more natural variability at the sub-daily scale (Chappuis et al., 2019) and also because the error related to manual measurements at this temporal resolution is considerably larger (Adamov et al., 2021). The same is to some extent true for the seven other devices, although nearly all the automatic instruments show only somewhat more spread within each dataset. The higher sampling rate of most of these instruments means there is lower uncertainty at the six-hourly level, and even beyond this (see similar figures for hourly averages in the Supplementary Material). Nonetheless, the KH-3000-B, Rapid-E, and WIBS-NEO display lower median values compared to the Hirst, while the BAA-500 and Poleno-4 indicate higher values. On the other hand, the KH-3000-A as well as the Poleno-1 show very similar values to the Hirst. These results are in part confirmed by the robust contrast test (Table 4), which indicates that the KH-3000-A is the only instrument not significantly different from the Hirst average at this temporal resolution.

Fig. 5

Box plot of logarithmic 6-hourly average total pollen counts for the study period from 19 April–31 May 2019

Table 4 Nonparametric robust contrast test with confidence intervals for 6-hourly average observations

When looking in more detail at the differences, it becomes clear that the largest differences across the set of instruments, as for the daily averages, occur at lower concentrations (10–20 pollen grains/m3; Fig. 6). The BAA-500, KH-3000-A, and the two Polenos all show relative differences well above 100% compared to the manual Hirst measurements, while the KH-3000-B, Rapid-E, and the WIBS-NEO show somewhat smaller differences but nonetheless ones that are 50% or larger. For concentrations greater than 50 pollen grains/m3, the differences between the various instruments and the Hirst average are smaller but still over 50% in most cases (Fig. 6). The error inherent in the manual measurements may be what contributes to the somewhat random and larger differences at this time resolution, particularly at lower concentrations (Adamov et al., 2021). Note that automatic instruments are also subject to various sources of error, sometimes similar in nature to those of the manual counts, for example, uncertainties in the identification algorithm or sub-sampling because of issues related to the signals measured.

Fig. 6

Box plots of 6-hourly average percentage differences from the Hirst mean. The total pollen counts are separated into concentration groups based on the Hirst-trap data (from top left to bottom right: 20–49, 50–99, 100–300, and > 300 pollen grains/m3, respectively). Note that the y-axis is on a log-scale, with a value of 1 indicating a 10 × larger value compared to the Hirst average

Hourly average observations show similar results for those instruments which have data available at this resolution (see the Supplementary Material).

4 Conclusions

This paper provides a brief evaluation of a range of automatic pollen monitors that were available on the market in 2019. These include the Droplet Measurement Technologies WIBS-NEO, the Hund-Wetzlar BAA-500, the Plair Rapid-E, two prototype Swisens Polenos, and two Yamatronics KH-3000 devices. The instruments were run in parallel from 19 April to 31 May and located in Payerne, Switzerland, representative of a semi-rural site on the Swiss plateau. The devices were validated against Hirst-type traps in terms of total pollen counts for daily and sub-daily averages. While the manual measurements cannot be considered a “gold standard” in terms of pollen measurements, they provide a well-known reference against which the other instruments can be evaluated. More accurate methods to validate the particle number counts of most automatic pollen monitors exist; however, such laboratory-based studies have so far only used artificial particles and not known concentrations of real pollen grains (Lieberherr et al., 2021).

This study is the first of its kind presenting results from the full range of instruments available in 2019. Overall, there was considerable spread between devices compared to the manual measurements. The monitors all compared better with manual measurements when daily averages were considered, with three of the seven not statistically significantly different from the manual measurements. However, when six-hourly averages were considered, only one of the instruments was found not to be statistically significantly different from the Hirst trap average. This is in part likely related to a decreased signal-to-noise ratio at higher temporal resolutions resulting in significant differences being reduced rather than differences because of instrument performance at higher temporal resolution. The largest differences between instruments were evident at low pollen concentrations (< 20 pollen grains/m3) for all temporal averaging periods. This may be a result of using the Hirst as a reference, since these measurements have been shown to have uncertainties of well over 100% at such concentrations (Adamov et al., 2021). It is likely that the automatic devices, which nearly all have higher sampling rates, produce more robust results for lower pollen concentrations (Chappuis et al., 2019; Crouzy et al., 2016). This, however, is only based on the assumption that they sample a larger volume of air and thus should be more accurate. Further studies are required to investigate this in more detail. Finally, differences between the instruments may also be due to their varying abilities to identify specific pollen taxa as well as the algorithms applied, which may not have been designed specifically for total pollen. This study focused only on total pollen since not all monitors are capable of distinguishing different pollen taxa (e.g. the KH-3000) or even initially designed to identify pollen at all (e.g. the WIBS). However, further intercomparison studies, particularly looking at individual pollen taxa, will enable more detailed investigations of these aspects and provide a more in-depth understanding of the differences between instruments.