1 Introduction

Understanding the potential of real-time pollen data is essential for planning future pollen monitoring networks. The Swiss federal authorities investigated the cost-benefit ratio of implementing real-time automatic pollen monitoring and the associated information systems (Swiss federal authorities 2017). This study showed that, for the case of Switzerland, the benefits associated with the implementation of an automatic pollen monitoring network far outweighed any costs it would incur. Progress is, however, still required to ensure that a robust and stable network that functions to an adequate level can be implemented. In previous work (Crouzy et al. 2016), we elaborated on four aspects that are important when evaluating potential automatic pollen-monitoring devices: (1) reliability, (2) identification capability, (3) sampling/counting representativeness and (4) the ability to additionally operate as a general-purpose aerosol monitoring device for air quality.

In this regard, the Swiss Federal Office of Meteorology and Climatology MeteoSwiss has been testing automatic pollen monitoring systems intensively, motivated by the potential improvements such systems would provide to patient information systems. In addition to the BAA500 (Helmut Hund GmbH), various airflow cytometers, such as the Yamatronics KH-3000 (Kawashima et al. 2007), the WIBS (Perring et al. 2015; O’Connor et al. 2014; Healy et al. 2014; Fernandez-Rodriguez et al. 2018), the Plair PA-300 (Crouzy et al. 2016), or more recently the Swisens Poleno, have been evaluated.

Airflow cytometers can be used to identify aerosols by measuring scattered light and by inducing particle fluorescence. The measurement is non-invasive and requires no moving mechanical components except for the pumps. As a result, airflow cytometers have proven very reliable [see for example Crouzy et al. (2016)]. Identification, however, requires advanced data analytics, and pollen identification is currently still restricted to a limited number of taxa. Based on a completely different approach, the BAA500 (Oteros et al. 2015) automatises the actions performed by human operators in a manual Hirst-type network (Hirst 1952; Galán et al. 2014), from sample collection to image identification. While data interpretation for quality assurance/control (denoted QA/QC hereafter) using this approach is more straightforward than with airflow cytometers, other constraints on operation result from the presence of moving parts and from the need to regularly change consumable slides.

In this paper, we go beyond mere device validation and focus on the real-time pollen data from two measurement campaigns. We first compare the representativeness of manual Hirst and automatic data and then describe daily patterns for pollen immission at the measuring site at different times in the season. The high time resolution of the observations, comparable with the typical timescale of certain weather processes, also made possible a statistical analysis of the link between pollen concentrations and meteorological parameters measured at the same location. Finally, with the perspective of improving pollen forecasts, we comment on the potential use of real-time hourly pollen data as input for numerical models.

2 Material and methods

The study site, the optical device and the reference counts are described extensively in our previous work on device validation (Crouzy et al. 2016). Here, we simply provide a brief overview of the most important aspects.

2.1 Study site [adapted from Crouzy et al. (2016)]

Measurements were performed on the roof of the two-storey MeteoSwiss building (height 7 m, ground altitude 490 m and WGS84 coordinates \(46^{\circ }48'48''\)N, \(6^{\circ }56'35''\)E), located in a rural environment close to Payerne, Switzerland (Fig. 1). Two volumetric Hirst-type pollen traps operating in parallel at the site were used to provide manual pollen counts. One sampler was used for reference and the other as backup in case the primary device failed. Payerne was chosen as campaign location because a wide variety of pollen types can be measured there, the combinations of which are typical for what can be found on the Swiss Plateau, home to over two-thirds of the Swiss population.

Fig. 1

Validation set-up with boxes for automatic detectors (panel A, 1 and 2) and volumetric Hirst-type samplers (panel A, 3 and 4); MeteoSwiss study site in Payerne, Switzerland (panel B), where the arrow indicates the location of the pollen monitoring devices; details of the Sigma-2 inlet used for the automatic counting device (panel C). Reproduced from Crouzy et al. (2016)

2.2 Description of the automatic measurement system [adapted from Crouzy et al. (2016)]

We used the first unit of the commercially available PA-300 detector produced by Plair SA (see Fig. 2). Although the device is not the focus of the present study, we provide a general description here; a comprehensive technical description can be found in Kiselev et al. (2011, 2013). Particles first pass through red laser beams (658 nm), and time-resolved scattering data are recorded by two photo-detectors. The scattering signal helps to characterise the optical size, shape and surface properties of the particles. A third laser beam in the UV range (337 nm) excites the particles, and the wavelength-resolved fluorescence signal is recorded using a diffraction grating and an array of 32 photo-detectors (32 equal bins with overall range 390–600 nm). In addition, the phosphorescence (long-time response) is recorded by the two photo-detectors used for the scattering signal and the short-time response by an ultrafast photo-detector. For a summary of the time resolution and wavelength detection range of the different detectors, see Crouzy et al. (2016).

Air is pumped into the detector at a flow rate of 2 \(l/{\hbox {min}}\). To increase the sampling rate, a concentrator based on the virtual impactor principle was used with the instrument. The concentrator was provided by Plair SA as an option to the PA-300. The major air outlet of the concentrator was connected to the second pump (flow rate 30 \(l/{\hbox {min}}\)) while the minor outlet was connected to the device. As a result, an effective sampling flow rate of 30 \(l/{\hbox {min}}\) is obtained for particles larger than \(10\,\mu m\). Finally, a Sigma-2 inlet (Verein Deutscher Ingenieure 2013) was put on top of the concentrator to protect the detector from rain (see panels A and B of Fig. 1).

Fig. 2

Schematics of the Plair PA-300 detector (courtesy of Plair S.A.)

2.3 Manual counts [adapted from Crouzy et al. (2016)]

We used the current standard method in aerobiological networks (Galán et al. 2014; Hirst 1952) as a reference against which to compare the automatic pollen counts. Hirst-type volumetric samplers (we used a sampler from Burkard Manufacturing Co.) collect airborne particles on a rotating drum, with efficient impaction for particle sizes larger than \(5\ \mu {m}\). The pollen counts were made at MeteoSwiss using the same procedure as for operational monitoring during the pollen season (optical microscope Olympus BX45, magnification \(600\times \)). Results were aggregated as average hourly and daily concentrations of 47 pollen types, plus one category for unidentified pollen. As for operational pollen measurements, counting all the grains present on the drum was not feasible for the entire duration of the pollen campaign and instead two lines per slide were counted (approximately 5% of the slide). Note that in this regard, the operational practice differs from Galán et al. (2014). For the average daily total pollen concentration, a comparison from a preceding campaign among three Hirst samplers running in parallel shows a relative fluctuation over the season of around 30% on average.
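To make the difference in sampling concrete, the following minimal sketch compares the air volume effectively analysed per hour by each method and the resulting counting uncertainty. It uses the effective flow rate of 30 l/min and the counted slide fraction of about 5% quoted in the text. The Hirst flow rate of 10 l/min is the nominal value commonly used for such samplers and is an assumption here, as are the Poisson counting statistics and the idealisation that every grain in the analysed volume is detected.

```python
# Illustrative comparison of the air volume analysed per hour by each method.
AUTOMATIC_FLOW_L_MIN = 30.0  # effective flow for particles > 10 um (from the text)
HIRST_FLOW_L_MIN = 10.0      # ASSUMPTION: nominal Hirst-type flow, not stated in the text
COUNTED_FRACTION = 0.05      # about 5% of the slide is counted (from the text)


def hourly_volume_m3(flow_l_min: float, analysed_fraction: float = 1.0) -> float:
    """Air volume (m^3) effectively analysed in one hour."""
    return flow_l_min * 60.0 / 1000.0 * analysed_fraction


def expected_grains(concentration_m3: float, volume_m3: float) -> float:
    """Expected number of grains counted at a given concentration (grains/m^3)."""
    return concentration_m3 * volume_m3


def relative_poisson_error(n_expected: float) -> float:
    """Relative counting uncertainty 1/sqrt(N), ASSUMING Poisson statistics."""
    return float("inf") if n_expected <= 0 else n_expected ** -0.5


auto_volume = hourly_volume_m3(AUTOMATIC_FLOW_L_MIN)
hirst_volume = hourly_volume_m3(HIRST_FLOW_L_MIN, COUNTED_FRACTION)

for conc in (10, 100, 1000):
    n_auto = expected_grains(conc, auto_volume)
    n_hirst = expected_grains(conc, hirst_volume)
    print(f"{conc:>5} grains/m3: automatic N={n_auto:7.1f} "
          f"(+/-{relative_poisson_error(n_auto):.0%}), "
          f"manual N={n_hirst:6.1f} (+/-{relative_poisson_error(n_hirst):.0%})")
```

Under these assumptions the automatic device analyses about 1.8 m³ of air per hour, against about 0.03 m³ for the counted part of a Hirst slide, so the expected hourly count at a given concentration differs by a factor of roughly 60.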

Two Hirst samplers were run in parallel to ensure completeness of the reference data throughout the season (see Fig. 1). The reference Hirst device suffered two failures, resulting in the loss of fourteen days of data. Of those missing days, twelve could be replaced using data from the second Hirst sampler running in parallel.

2.4 Pollen identification

The 2015 validation of the algorithms presented in Crouzy et al. (2016) was extended by repeating the comparison exercise in 2016. This provided information on device stability and better validation. To improve discrimination, we performed 25 additional calibrations. Fresh pollen was collected and inserted into the device by blowing small quantities into the Sigma-2 inlet within a few hours of collection. The following species were calibrated: Alnus glutinosa, Betula pendula, Carpinus betulus, Corylus avellana, Cupressus sempervirens, Dactylis glomerata, Fagus sylvatica, Fraxinus excelsior, Phleum pratense, Pinus sylvestris, Plantago lanceolata, Populus alba, Quercus robur, Taxus baccata and Ulmus glabra. The events generated in the detector were labelled according to pollen taxa and used to perform a finer training using the algorithms developed in Crouzy et al. (2016). Pollen concentrations were also obtained using the algorithms introduced in Crouzy et al. (2016). For discriminating pollen from other aerosols, thresholds on optical size and fluorescence were applied; however, the identification of single-pollen taxa required more complex techniques. The best results were obtained by combining predictions from a support vector machine classifier (SVM) and an artificial neural network [for more details see Crouzy et al. (2016)]. It is important to note that while a retraining was performed, the overall algorithms remain unchanged.

2.5 Meteorological data

An operational station of SwissMetNet, the MeteoSwiss operational automatic weather monitoring network, provided meteorological parameters used to investigate the relationship between pollen immission and local weather conditions. The SwissMetNet station is located on the meadow bordering the MeteoSwiss study site (Fig. 1). The distance between pollen measurements and the weather station is approximately 200 metres. The data were subjected to manual and automatic quality control procedures.

3 Results

3.1 Pollen identification and counting: daily data

The extended detection algorithm described above was used to estimate the time series of three pollen taxa (grass, beech and pine pollen) as well as of total pollen from the automatic real-time pollen measurements. Figure 3 shows daily data from the 2016 pollen season for total pollen as well as the three individual pollen taxa. Note that although Pinus sylvestris was calibrated, the monitoring results should be understood as a proxy for Pinus spp. Similarly, Dactylis glomerata calibrations were merged with Phleum pratense calibrations and used as a proxy for Poaceae. In general, there is reasonable agreement between the automatic and manual observations on most days. For the three taxa as well as for total pollen, there are certain days where either the manual or automatic measurements show higher values, but given the uncertainty in both of these observations, it is difficult to establish which, if any, of the two is correct. Repeating the analysis performed in Crouzy et al. (2016) is not the purpose of this paper. Nevertheless, the results represent an improvement on Crouzy et al. (2016), who showed results for just total pollen and grasses. Such development is essential given that pollinosis sufferers require information about specific taxa.

Single-pollen taxa are typically present in the air only for short periods of time; thus time series analysis on the airborne concentration of individual pollen taxa requires a data record covering more than just the two years available for this paper. In addition, identification algorithms introduce a further source of uncertainty: single-pollen classification is run after pollen vs other aerosols identification. We therefore focus just on total pollen for the rest of the paper; the state of the art for pollen identification is discussed in detail elsewhere (Šauliene et al. 2019).

Fig. 3

Average daily pollen concentrations for total pollen, grass pollen, Fagus sylvatica pollen and Pinus spp. pollen, recorded by two different devices (Hirst or Manual and Plair or Automatic) during the 2016 pollen season

3.2 Pollen identification and counting: hourly data

To look in more depth at the real-time observations, hourly data from a few selected days from both the 2015 and 2016 pollen seasons are shown in Fig. 4. While these data provide just a snapshot for certain days, they highlight two main issues relevant at the daily scale. Firstly, the daily mean value (grey bars) gives very little indication of the actual fluctuation of pollen concentrations at the sub-daily scale. This is strong evidence that hourly information is important, whether it is delivered to allergy sufferers or used for studies on the interplay between weather and pollen immission. Secondly, when pollen levels are relatively high (Fig. 4 middle and lower panels), the automatic and manual measurements agree remarkably well, as they do for the daily average values. However, when pollen concentrations are low (Fig. 4 upper panels), only the automatic system is able to provide data which show a coherent daily cycle. The manual observations, on the other hand, present a noisy signal with erratic jumps. This suggests that not only is the automatic system capable of providing hourly measurements, but its higher sampling rate also makes such observations much more certain.

Fig. 4

Hourly values of total pollen concentrations recorded by two different devices (Hirst or manual and Plair or automatic) for six selected days. Note the grey bar in the background which denotes the daily average value obtained from the manual measurements

This is further demonstrated in Fig. 5, which shows how the correlation between manual and automatic measurements increases as data are aggregated over an increasing number of hours (moving towards the right in each graph). The correlation increases significantly as one aggregates data at the daily scale, from around 20 hours onwards. Note that the correlations beyond approximately 100 hours are less meaningful since for the periods considered (either April–May or just June) these correlations are computed for a small number of data points. The correlations are also considerably lower for the April–May 2016 period, because a large number of pollen calibrations for training the recognition algorithm were carried out during this period, perturbing the automatic system (interruption of data collection and pollution of the measuring system, hence the choice to present separate periods). The perturbation of the system by calibrations was more extensive in 2016 than in 2015 (19 vs 14 calibrations). As discussed in Crouzy et al. (2016), the Pearson correlation provides an over-optimistic measure of the performance of the automatic system since it mostly reflects the peak values of pollen immission, whereas significant pollinosis symptoms already appear at moderate exposure levels. On the other hand, the Kendall tau penalizes mismatches in the ordering of points of the time series regardless of the exposure level (even for extremely low concentrations). Both indicators therefore need to be considered with some caution, by examining the details of the time series and identifying potential pitfalls.

Fig. 5

Pearson correlation coefficient (blue crosses) and Kendall tau (orange triangles) between manual and automatic measurements aggregated over a range of hours for different periods of the 2015 and 2016 pollen seasons
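The aggregation analysis behind Fig. 5 can be sketched as follows. This is a minimal illustration on synthetic data, not the Payerne measurements. The use of non-overlapping averaging blocks and of the tau-a variant of the Kendall coefficient are assumptions, since the text does not specify either choice, and the function names are hypothetical.

```python
import math
import random


def aggregate(series, window):
    """Average consecutive, non-overlapping blocks of `window` hourly values."""
    n_blocks = len(series) // window
    return [sum(series[i * window:(i + 1) * window]) / window for i in range(n_blocks)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs; ties add zero."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Synthetic example: a diurnal cycle with a day-to-day varying level.
# ASSUMPTION: the "manual" series is simply given larger random noise to mimic
# its lower sampling rate; all numbers are invented for illustration.
random.seed(0)
n_days = 30
levels = [random.uniform(20, 200) for _ in range(n_days)]
truth = [levels[h // 24] * (1 + 0.8 * math.sin(2 * math.pi * (h % 24 - 6) / 24))
         for h in range(24 * n_days)]
automatic = [max(0.0, v + random.gauss(0, 5)) for v in truth]
manual = [max(0.0, v + random.gauss(0, 60)) for v in truth]

for window in (1, 3, 6, 12, 24, 48):
    a, m = aggregate(automatic, window), aggregate(manual, window)
    print(f"{window:>3} h: Pearson={pearson(a, m):.2f}  Kendall tau={kendall_tau(a, m):.2f}")
```

Note that the number of points entering each coefficient shrinks as the window grows, which is why correlations at very long aggregation windows should be read with caution.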

3.3 Interplay between airborne pollen and weather parameters

Throughout this paper, pollen immission values are discussed, indicating ground level airborne pollen concentration relevant for human exposure. These concentrations are the result of pollen release, transport and deposition. The high temporal resolution of the automatic system opens up a wide range of new research avenues. The sub-daily scale data provide, for example, the possibility to look in more detail at the relationship between airborne pollen and meteorological parameters, thus helping to derive a better understanding of the processes controlling airborne pollen concentrations and potentially improving numerical models and pollen forecasts. Figure 6 provides such an example for the month of May 2016. The relative humidity and temperature (top panel) clearly correlate with the hourly pollen concentrations (lower panel), the former showing a negative relationship but the latter a positive correlation. No such relationship is obvious considering just the daily mean values (grey bars, bottom panel). As mentioned above, the daily mean values, which are regularly reported around the world, are in general poor indicators of the actual variability of airborne pollen concentrations.

Fig. 6

Hourly weather and total pollen parameters measured over May 2016. Top panel: relative humidity (%) and 2 m temperature (\(^{\circ }\)C). Bottom panel: total pollen concentration (pollen/m\(^{3}\))

A simple analysis of the hourly automatic data shows that for both the 2015 and 2016 pollen seasons there are significant correlations between hourly pollen concentrations and measurements of precipitation, solar radiation, 2 m temperature and relative humidity (see Table 1). The latter two variables show the strongest correlations, particularly for the month of May in both 2015 and 2016. Consequently, highly correlated (or anti-correlated in the case of relative humidity) daily cycles between the airborne pollen concentrations and these meteorological variables are most obvious in May. Considering the fact that grass pollen was dominant when the correlation was the highest, we can infer that the release, transport and deposition of grass pollen are most sensitive to these meteorological parameters, in particular during the early stages of the grass pollen season. Correlations are most evident when the analysis is applied to hourly data (probably an effect of the much larger dataset obtained when considering hourly data); time series aggregated to a daily resolution do not present the same significance of correlations (not shown).

Table 1 Pearson correlation coefficient between hourly total pollen concentration and the following meteorological parameters: precipitation, 2 m temperature, sunshine duration, relative humidity, vapour pressure and wind speed

4 Discussion

While there is still considerable room for development, the potential of high temporal resolution automatic pollen measurements is clear. Data at hourly resolution show, for total pollen, an obvious diurnal cycle which is not captured in real time and with the same sampling by the daily-average manual measurements typically made. This sub-daily variability has consequences, for example, for the treatment of allergy patients who usually present symptoms above a particular threshold of pollen concentration. This is seen in Fig. 7 which shows that a far greater percentage of days have hours with high levels of pollen when compared to just the percentage of days with daily mean values above that particular level. Even when averaging over several hours, for example, mornings (yellow dots) or afternoons (pink crosses), this signal is present for a higher number of days.

Fig. 7

Percentage of days with airborne pollen concentrations above particular thresholds (as shown in the legend) for 2015 (left panel) and 2016 (right panel)
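The quantity plotted in Fig. 7 can be illustrated with a short sketch: for a given threshold, count the days on which the daily mean, the hourly maximum, or a part-day mean exceeds it. The hour ranges used for "morning" and "afternoon" and the toy data are assumptions made for illustration only; the text does not define them.

```python
def split_days(hourly):
    """Split a flat list of hourly values into complete 24-hour days."""
    usable = len(hourly) - len(hourly) % 24
    return [hourly[i:i + 24] for i in range(0, usable, 24)]


def mean(values):
    return sum(values) / len(values)


def percent_days_exceeding(days, threshold, statistic):
    """Percentage of days whose `statistic` (a function of 24 values) exceeds `threshold`."""
    hits = sum(1 for day in days if statistic(day) > threshold)
    return 100.0 * hits / len(days)


def daily_mean(day):
    return mean(day)


def daily_max(day):
    return max(day)


def morning_mean(day):
    return mean(day[6:12])   # ASSUMPTION: morning taken as 06-12 h


def afternoon_mean(day):
    return mean(day[12:18])  # ASSUMPTION: afternoon taken as 12-18 h


# Toy data: one day with a pronounced daytime peak, one day with a flat low level.
peaked_day = [0.0] * 8 + [120.0] * 8 + [0.0] * 8   # daily mean 40, hourly maximum 120
flat_day = [10.0] * 24
days = split_days(peaked_day + flat_day)

threshold = 50.0  # grains/m^3, arbitrary illustrative value
for name, stat in [("daily mean", daily_mean), ("hourly max", daily_max),
                   ("morning mean", morning_mean), ("afternoon mean", afternoon_mean)]:
    print(f"{name:>14}: {percent_days_exceeding(days, threshold, stat):.0f}% of days")
```

In this toy case the peaked day never exceeds the threshold in its daily mean, although it does so for several hours, which is the effect discussed above.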

These high temporal resolution data need, however, to be delivered as close to real time as possible. When the hourly pollen data are autocorrelated over various lags, it can be seen that the memory of the system drops off very rapidly (Fig. 8). Autocorrelations fall to near zero after just 3–4 hours, regardless of whether hourly automatic or manual data are used. Beyond this time horizon, pollen information systems cannot be purely measurement-based and need to resort to forecasts. While expert forecasts or statistical models [see, e.g., Hilaire et al. (2012)] still play an important role, pollen transport has also been introduced into operational numerical weather models (Pasken and Pietrowicz 2005; Schueler et al. 2006; Sofiev et al. 2006; Vogel et al. 2008; Sofiev 2017; Zink et al. 2012). Numerical weather models offer high resolution in time and space and the quality of the output does not depend on the skill of an individual forecaster. It is, however, difficult to quantify the quality of the numerical model output due to the low spatial resolution of pollen monitoring networks (Buters et al. 2018), and because of the mismatch between the daily resolution of manual pollen measurements and the sub-hourly resolution of numerical models. Another weakness of models lies in their input; precise vegetation maps and models for pollen emission (timing and quantity) are essential (Schuck et al. 2002; Masson et al. 2003; Ulmer 2006; Zink et al. 2017; Pauling et al. 2012). The variability of biological processes is, however, large, and emission maps remain imprecise. The cost of online pollen monitoring devices currently means a high-density observational network is not foreseeable, and direct assimilation of measurements into numerical models is not, at present, feasible. However, the real-time availability of data offers the possibility to constrain emission models by correcting them in real time, increasing or decreasing emissions, or adjusting the timing of the start of the pollen season for individual taxa.

Fig. 8

Autocorrelation of the hourly total-pollen time series
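For readers who wish to reproduce a curve such as the one in Fig. 8 on their own data, a minimal sketch of the sample autocorrelation at a given lag is shown below. The estimator (normalisation by the lag-zero sum of squares) is a common choice and an assumption here, as the text does not specify it. The demonstration series is a synthetic first-order autoregressive process with an arbitrary coefficient, chosen only because its memory also decays within a few steps.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (in hours).

    Deviations from the overall mean are multiplied at times t and t + lag,
    and the sum is normalised by the lag-zero sum of squared deviations.
    """
    n = len(series)
    m = sum(series) / n
    var = sum((v - m) ** 2 for v in series)
    cov = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return cov / var


# Synthetic hourly series with short memory. ASSUMPTION: AR(1) with phi = 0.5,
# an arbitrary illustrative value that is not fitted to any pollen data.
random.seed(1)
phi = 0.5
series = [0.0]
for _ in range(24 * 60 - 1):
    series.append(phi * series[-1] + random.gauss(0, 1))

for lag in range(0, 7):
    print(f"lag {lag} h: autocorrelation = {autocorrelation(series, lag):.2f}")
```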

5 Conclusion

In this paper, we present a brief analysis of two seasons of real-time hourly pollen measurements performed at MeteoSwiss (spring to early summer 2015 and 2016). In contrast to previous studies on automatic pollen monitoring, the focus is not on the measurement technology but on the real-time data themselves. The time series obtained provide the possibility to carry out statistical analyses to estimate airborne pollen concentrations for three individual pollen taxa as well as total pollen at hourly resolution for both seasons. When compared to reference manual measurements, the automatic system performs well, showing considerably better results particularly for days with average pollen concentrations less than 100 grains per cubic metre; this is a result of the higher sampling rate of the automatic device. Daily mean values are neither representative of the daily maximum values attained nor do they correlate well with meteorological parameters at this time scale. We also show that the hourly data need to be delivered rapidly, at least to improve measurement-based pollen forecasts, since the sub-daily variability is high and there is little memory in the system. Finally, we also present a brief analysis of the possible avenues that could be explored relating hourly pollen concentrations with meteorological parameters. Results show that the airborne total pollen concentrations were most sensitive to fluctuations in temperature and relative humidity, particularly in May with the start of the grass pollen season.

Continuous hourly observations made over entire seasons provide the possibility to investigate in depth the relationship between patient exposure, symptoms occurrence and actual airborne pollen concentrations. Such research could in turn be used to better define critical exposure levels and allow patients to take appropriate actions in terms of activity planning and the use of medication. The high sampling rates of automatic systems mean that statistically robust conclusions can be drawn, thus opening up new research directions in the fields of aerobiology and allergology.

While the automatic pollen monitoring system presented here provides promising results, the algorithms used to estimate the concentrations of individual pollen taxa need to be extended to include other important allergenic species to fulfil the needs of allergic patients and their doctors (Šauliene et al. 2019). Assuming this requirement can be satisfied, maximizing the impact of these measurements requires considering the whole product chain going from the measurements, via forecasts and models, to communicating information to doctors and patients.