1 Introduction

A recent review of allergenic pollen highlighted twelve pollen types: Ambrosia, Alnus, Artemisia, Betula, Amaranthaceae (includes former goosefoot family Chenopodiaceae), Corylus, Cupressaceae/Taxaceae, Olea, Platanus, Poaceae, Quercus and Parietaria/Urtica for their allergenic potency and abundance in the atmosphere (Skjøth et al. 2013). The majority of pollen sensitisations in Europe are caused by Betula and Poaceae (Bousquet et al. 2007). Ambrosia is the second most important cause of seasonal asthma and rhinitis in many areas of its native distribution range (i.e. North America), and in the past decade, its clinical relevance has increased notably throughout Europe (Smith et al. 2013 and references therein). It is estimated that the number of allergic people in Europe will more than double by 2060 (Lake et al. 2017). Hence, the ability to predict the variability of daily pollen concentrations for the most important allergenic pollen would be beneficial for a great number of pollen-sensitive individuals.

Avoidance of airborne pollen is an important part of managing allergic symptoms (Peden and Reed 2010). To be effective, avoidance requires temporally resolved forecasts of the quantity of airborne pollen present in the atmosphere over the year. Forecasting in aerobiology is commonly based on either observation-oriented models or source-oriented models (Scheifinger et al. 2013; references therein). The former relate pollen records to one or more measurable variables (Norris-Hill 1995), while the latter use mathematical formulae of atmospheric transport and diffusion to calculate concentrations at various distances from a known source (Skjøth et al. 2010). Despite the advantages of both modelling approaches, their implementation requires a significant amount of data for calibration and running (i.e. pollen concentrations and meteorology); substantial information about the location, abundance and emission characteristics of the pollen source (Skjøth et al. 2010; Pauling et al. 2012); and significant computational resources. As a result, neither type of model is available in many regions.

The emission of allergenic pollen from its source is directly linked to seasonality of the flowering phenophase (Dahl et al. 2013), which makes pollen calendars the simplest observation-oriented approach for predicting the time of occurrence of airborne pollen in a given area (Scheifinger et al. 2013). Although pollen calendars are commonly used all over the world to depict seasonal distribution of pollen in the atmosphere (e.g. El-Ghazaly and Fawzy 1988; Cadman 1990; O’Rourke 1990; Kok Ong et al. 1995; Kaya and Aras 2014; Jae-Won et al. 2012; Piotrowska-Weryszko and Weryszko-Chmielewska 2014; Martínez-Bracero et al. 2015; Calderón-Ezquerro et al. 2016), most of the calendars are produced using Spieksma’s model (Spieksma and Wahl 1991). However, the 10-day temporal resolution of this model limits its application in predicting day-to-day variability of pollen concentrations.

The aims of this study are to build calendars for airborne Ambrosia, Betula and Poaceae pollen and to explore their performance in forecasting daily concentrations of these important allergens over the course of the season. The study also tests the suitability of different interpolation methods for completing short data gaps (up to 7 days) that commonly occur in routine pollen monitoring using Hirst-type samplers (e.g. due to electric power loss or clock device stoppage).

2 Materials and methods

2.1 Airborne pollen data

Daily average concentrations of airborne Betula, Poaceae and Ambrosia pollen were collected in three Serbian cities: Novi Sad (45.25°N, 19.85°E), Sombor (45.77°N, 19.11°E) and Niš (43.32°N, 21.90°E) using 7-day volumetric spore traps of the Hirst design (Hirst 1952). Each trap was positioned above the local pollen sources to ensure regional representativeness of the obtained data. The collected samples were prepared and analysed to retrieve daily average pollen concentrations expressed as particles per cubic metre of air (P m−3) following the European Aerobiology Society’s minimum recommendations (Galán et al. 2014). The main part of the research was conducted using the data from Novi Sad, collected between 2000 and 2015. This dataset was the largest and the most representative, since it included seasons with both high and low pollen records (Table 1). The datasets from Sombor and Niš were collected from 2009 to 2015 and from 2007 to 2011, respectively. Both datasets covered at least 5 years, which exceeded the minimum number required for calibrating a calendar model. These two datasets were used for exploring the performance of the models in other locations.

Table 1 Characteristics of the principal pollination period of Betula, Poaceae and Ambrosia in Novi Sad

Novi Sad is situated on the Pannonian Plain, one of the centres of European distribution of Ambrosia artemisiifolia L. (Smith et al. 2013), and is therefore surrounded by abundant sources of ragweed pollen. Conversely, local sources of Betula airborne pollen are limited to trees planted as ornamentals in the streets and parks, thus making the aerobiological situation notably dependent on atmospheric transport of birch pollen from distant sources. There are no large grasslands in the study region, and most of the Poaceae pollen sources are limited to cereal crop fields and public and private green areas, which are heavily managed. Conditions in Sombor are very similar, as it is also located on the Pannonian Plain, less than 100 km from Novi Sad, while Niš is situated in south-eastern Serbia in the Nišava river valley. It is surrounded by hills, and thus, vegetation, pollen sources and pollen transmission are different from those observed on the Pannonian Plain.

There is a distinct seasonality in the occurrence of airborne Betula, Poaceae and Ambrosia pollen on the Pannonian Plain (Radišić 2013), which results from seasonality in the flowering phenology of pollen sources. In order to avoid overestimation of correlation coefficients in long all-zero sequences of the dataset, the analysis focused on the principal pollination periods of Betula, Poaceae and Ambrosia in the study region. The limits of these periods were determined as the calendar months in which pollen is not merely accidentally recorded (i.e. the 16-year average concentration exceeds 1 P m−3 on at least 1 day of the month): (1) for Betula March–May, (2) for Poaceae April–October, (3) for Ambrosia July–September. In comparison with commonly used percentage methods (Emberlin et al. 1997; Pathirane 1975), our approach includes long tails with low and intermittent records at the beginning and the end of the flowering period, which allows for testing models both during the peak flowering period and outside of it.
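The month-selection rule described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' R code: the function name `principal_months`, the nested-dictionary data layout and the toy values are all assumptions, and the rule is read as "a month is kept if the multi-year average concentration exceeds the threshold on at least one of its days".

```python
from statistics import mean

def principal_months(daily_by_year, threshold=1.0):
    """Return the calendar months kept in the principal pollination period.

    daily_by_year: {year: {(month, day): concentration in P m-3}}
    (hypothetical layout chosen for this sketch).
    A month is kept if, on at least one of its days, the average
    concentration over all years exceeds `threshold`.
    """
    by_date = {}
    for year_values in daily_by_year.values():
        for date, concentration in year_values.items():
            by_date.setdefault(date, []).append(concentration)
    months = {date[0] for date, values in by_date.items()
              if mean(values) > threshold}
    return sorted(months)

# Toy two-year dataset (invented values, for illustration only).
toy = {
    2000: {(2, 27): 0.0, (3, 15): 4.0, (4, 10): 30.0, (6, 1): 1.0},
    2001: {(2, 27): 1.0, (3, 15): 0.0, (4, 10): 10.0, (6, 1): 0.0},
}
print(principal_months(toy))  # [3, 4]
```

In the toy data, February and June are dropped because their only recorded days average 0.5 P m−3, which is the kind of accidental record the rule is meant to exclude.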

2.2 Data analysis

Data analysis was performed in R (R Core Team 2016), and charts were produced in Microsoft Excel and MathWorks MATLAB. The R code, along with instructions on how to use it for calibrating and testing calendar models, is given in Online Resources 1–3.

2.2.1 Interpolation of missing data

Hirst-type samplers sample airborne particles continuously until the drum has to be changed. Most devices in current operation (i.e. Lanzoni “VPPS2000” and “VPPS 2010,” Burkard ltd., Burkard Scientific) allow for uninterrupted sampling for up to 7 days. This is convenient for long-term monitoring with unattended sampling, but can result in 7-day data gaps if the device stops working (e.g. due to electricity issues, issues with the clock device that rotates the drum, or a clogged sampling orifice).

Missing values in the calibration dataset could affect the performance of the constructed prediction model. For example, when producing the calendar model, missing values will result in a decrease in the number of calibration years and therefore limit robustness of the model. In order to be able to predict pollen concentration for any day with the highest possible accuracy, it was decided to interpolate missing data in the calibration dataset. However, missing data in the validation dataset were not interpolated in order to avoid comparison of modelled to interpolated data.

Three common methods available in the R zoo package (Zeileis and Grothendieck 2005) were tested: (1) last-observation-carried-forward interpolation, (2) linear interpolation and (3) cubic spline interpolation. All of the possible 7-day sequences for which the measured data were available were consecutively declared as a gap week. After the missing values were interpolated, the performance of each method was evaluated by comparing the calculated and measured pollen concentrations using the same indicators that were used for comparing predicted and measured concentrations. For all analysed pollen types, linear interpolation had the highest correlation coefficient and the lowest NRMSE (normalised root-mean-square error) (Table 2), and thus it was selected as the method of choice for the interpolation of the missing pollen data in the calibration dataset.
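The gap-week experiment can be illustrated with a short Python sketch. It is not the authors' R code: the helper names `with_gap`, `locf` and `linear` and the toy series are assumptions, and cubic spline interpolation is left out to keep the example short. In the zoo package the corresponding functions are presumably `na.locf`, `na.approx` and `na.spline`, although the text does not name them.

```python
def with_gap(series, start, length=7):
    """Declare `length` consecutive days, beginning at `start`, as missing."""
    return [None if start <= i < start + length else value
            for i, value in enumerate(series)]

def locf(series):
    """Last observation carried forward; leading gaps stay None."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def linear(series):
    """Linear interpolation between the known values bracketing each gap."""
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled

# Toy season with a peak (invented values, P m-3).
measured = [2.0, 4.0, 8.0, 16.0, 30.0, 44.0, 52.0, 40.0, 24.0, 10.0, 4.0]
gapped = with_gap(measured, start=2)   # days 2..8 declared missing

print(locf(gapped)[2:9])    # seven repeats of 4.0
print(linear(gapped)[2:9])  # straight line from 4.0 towards 10.0
```

Sliding `start` over every position where a full measured week exists, and scoring each filled week against the measured values, reproduces the structure of the test described above. Note that in this toy example the gap swallows the whole peak, so neither method can recover it.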

Table 2 Performance of interpolation of 7-day data gaps using last-observation-carried-forward (LOCF), linear interpolation and cubic spline measured by average normalised root-mean-square error (NRMSE) and average Spearman’s rank correlation coefficient (ρ)

2.2.2 Calendar model

Standard calendar models predict the pollen concentration based on observed pollen concentrations on the same day in the calibration dataset. By calculating the mean or the median of available historic concentrations on the same day of the year, the model takes into account seasonal variability of the pollen concentration signal.
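As a minimal Python sketch of the standard calendar described above (the function name `standard_calendar`, the data layout and the values are assumptions made for illustration, not the authors' implementation), the prediction for each day is simply the mean or median of the concentrations recorded on that day in the calibration years:

```python
from statistics import mean, median

def standard_calendar(calibration, stat=mean):
    """Predict one value per day of the season.

    calibration: {year: [daily concentrations]}, with all lists of equal
    length and aligned by day of the year (assumed layout).
    stat: statistics.mean or statistics.median.
    """
    return [stat(day_values) for day_values in zip(*calibration.values())]

# Three toy calibration years, four days each (invented values, P m-3).
calibration = {
    2001: [0.0, 5.0, 40.0, 12.0],
    2002: [1.0, 9.0, 10.0, 3.0],
    2003: [2.0, 4.0, 100.0, 0.0],
}
print(standard_calendar(calibration, mean))    # [1.0, 6.0, 50.0, 5.0]
print(standard_calendar(calibration, median))  # [1.0, 5.0, 40.0, 3.0]
```

On the third day the single extreme year (100 P m−3) pulls the mean to 50, while the median stays at 40, which illustrates why the two variants can differ in how well they predict magnitude.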

Standard calendar models typically follow Spieksma’s model, which presents concentrations as 10-day means (Spieksma and Wahl 1991). This approach overcomes season-specific short-term shifts in pollen occurrence, but hides day-to-day variability in pollen concentrations and so has limited potential for allergy management. In order to limit the influence of short-term shifts while keeping day-to-day variability in the pollen signal, advanced calendar models were developed, in which the concentration for a given day was calculated as the mean or the median using signals that were pre-processed by either a moving average or a moving median. This pre-processing resulted in four types of calendar models: mean of moving mean signals (MNMN), mean of moving median signals (MNMD), median of moving mean signals (MDMN) and median of moving median signals (MDMD). Different interval lengths (1, 2, …, 30 days) for calculating moving means/medians were tested in order to choose the value resulting in the best performing calendar model. Odd window lengths are standard for moving mean and median filters, as they take the concentrations on the chosen date ± n days (n ∈ ℕ₀). However, in R, they are defined for even numbers as well. In that case, the number of days in advance is one greater than the number of previous days.
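The four advanced variants differ only in which statistic is used for smoothing and which for combining years, so they can be sketched with one function. The Python below is an illustrative sketch under stated assumptions rather than the authors' R code: the names `moving` and `advanced_calendar` are hypothetical, even windows take one more day ahead than behind as described above, and windows are simply truncated at the ends of the series (the text does not say how edges were handled).

```python
from statistics import mean, median

def moving(series, window, stat):
    """Centred moving statistic; truncated at the series ends (assumption)."""
    behind = (window - 1) // 2
    ahead = window - 1 - behind   # one more day ahead when window is even
    n = len(series)
    return [stat(series[max(0, i - behind):min(n, i + ahead + 1)])
            for i in range(n)]

def advanced_calendar(calibration, window, smooth=median, combine=median):
    """smooth/combine = mean or median, e.g. MDMN is smooth=mean, combine=median."""
    smoothed = [moving(series, window, smooth)
                for series in calibration.values()]
    return [combine(day_values) for day_values in zip(*smoothed)]

# Two toy calibration years (invented values, P m-3).
toy = {
    2001: [0.0, 0.0, 9.0, 0.0, 0.0],
    2002: [0.0, 3.0, 3.0, 3.0, 0.0],
}
print(advanced_calendar(toy, 3, smooth=mean, combine=mean))
# MNMN with a 3-day window: [0.75, 2.5, 3.0, 2.5, 0.75]
print(advanced_calendar(toy, 1, smooth=mean, combine=mean))
# window of 1 day reduces to the standard mean calendar: [0.0, 1.5, 6.0, 1.5, 0.0]
```

The comparison of the two outputs shows the trade-off discussed in the paper: smoothing halves the predicted peak (6.0 to 3.0) while spreading pollen onto neighbouring days.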

The relationship between the number of years available for the calibration dataset and the model performance was also analysed, for both the standard calendar model and the best advanced calendar model.

2.2.3 Measures of model performance

The performance of the calendar models developed in this study was assessed by comparing calculated and measured pollen concentrations. Correlation coefficients were used to analyse the strength of the relationship between two signals (i.e. simultaneous increase/decrease). The Shapiro–Wilk test (Razali and Wah 2011) indicated that daily pollen concentrations were not normally distributed and therefore Spearman’s rank correlation coefficient (ρ) was calculated, which indicates how well the ranking order of predicted and measured concentrations correspond.
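For readers unfamiliar with the rank-based coefficient, the following standard-library Python sketch computes Spearman’s ρ as the Pearson correlation of average ranks, which handles the ties that are frequent in pollen series (e.g. runs of zeros). The function names are chosen for this illustration only; in R the equivalent is `cor(x, y, method = "spearman")`.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1      # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)        # undefined if either series is constant

def spearman(predicted, measured):
    return pearson(average_ranks(predicted), average_ranks(measured))

# A prediction with the right ordering but the wrong magnitude still scores 1.
print(spearman([1.0, 2.0, 3.0, 4.0], [10.0, 200.0, 3000.0, 40000.0]))
```

The example makes the limitation explicit: ρ rewards getting the ordering of days right and says nothing about magnitude, which is why a separate error measure is needed.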

In order to provide comparable assessments of model performance with respect to predicting the magnitude of pollen concentrations, the root-mean-square error (RMSE) was calculated, which is a commonly used measure for comparing the predicted and measured values in aerobiology (Astray et al. 2010; Makra and Matyasovszky 2011; Kasprzyk 2009; Csépe et al. 2014), but also in other areas such as agriculture (Marko et al. 2016; Bornn and Zidek 2012), geosciences (Chai et al. 2009; McKeen et al. 2005) and telecommunications (Altiparmak et al. 2009). RMSE overestimates large deviations due to squaring. However, the context of the season is not accounted for. RMSE of 50 P m−3 may be rather insignificant for seasons with huge amounts of pollen, but it can also be very significant in case pollen concentrations were small during the season. This could be misleading when comparing the model performance for pollen types with notably different abundance (like Poaceae with Ambrosia or Betula in the study) and also for the same pollen type that shows notable seasonal variability (like Betula in the study). Therefore, for assessing the accuracy of the model to predict the magnitude of daily pollen concentrations we followed the suggestion by Makra et al. (2011) and normalised root-mean-square error by the mean concentration in the analysed year (NRMSE).
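The normalisation can be made concrete with a small Python sketch. The function names and values are illustrative assumptions; missing measured days are skipped, in line with the decision not to interpolate the validation data.

```python
from math import sqrt

def rmse(predicted, measured):
    """Root-mean-square error over days with a measured value (None = missing)."""
    pairs = [(p, m) for p, m in zip(predicted, measured) if m is not None]
    return sqrt(sum((p - m) ** 2 for p, m in pairs) / len(pairs))

def nrmse(predicted, measured):
    """RMSE divided by the mean measured concentration of the analysed year."""
    observed = [m for m in measured if m is not None]
    return rmse(predicted, measured) / (sum(observed) / len(observed))

# Same absolute error (4 P m-3 on every day) in a high and a low season.
high_season = [10.0, 10.0, 10.0, 10.0]
low_season = [5.0, 5.0, 5.0, 5.0]
print(rmse([14.0, 6.0, 14.0, 6.0], high_season))   # 4.0
print(nrmse([14.0, 6.0, 14.0, 6.0], high_season))  # 0.4
print(rmse([9.0, 1.0, 9.0, 1.0], low_season))      # 4.0
print(nrmse([9.0, 1.0, 9.0, 1.0], low_season))     # 0.8
```

Both predictions have the same RMSE, but NRMSE is twice as large in the low season, which is the context that plain RMSE does not capture.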

A leave-one-year-out cross-validation procedure was applied to test the performance of the calendar models. Each year from the dataset was chosen in turn as the test dataset, while the remaining 15 years were taken as the calibration dataset. This procedure was repeated 16 times, so that each year served as the test year for which Spearman’s ρ and NRMSE were calculated. In order to get a single measure of model performance, the averages of the obtained Spearman’s ρ and NRMSE values were calculated. An independent-samples Kruskal–Wallis test was conducted to explore whether there is a difference in the distribution of Spearman’s ρ and NRMSE between different calendar models.
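The cross-validation loop can be sketched as follows. This Python example is a simplified illustration under stated assumptions, not the authors' R code: it uses a standard mean calendar and NRMSE only, three toy years instead of 16, and hypothetical function names.

```python
from math import sqrt
from statistics import mean

def calendar(calibration_years):
    """Standard mean calendar from a list of equal-length yearly series."""
    return [mean(day_values) for day_values in zip(*calibration_years)]

def nrmse(predicted, measured):
    errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sqrt(mean(errors)) / mean(measured)

def leave_one_year_out(data):
    """data: {year: [daily concentrations]}. Returns per-year scores and their mean."""
    scores = {}
    for test_year, measured in data.items():
        calibration = [series for year, series in data.items()
                       if year != test_year]
        scores[test_year] = nrmse(calendar(calibration), measured)
    return scores, mean(list(scores.values()))

# Toy dataset: a low, a medium and a high season (invented values, P m-3).
data = {
    2013: [0.0, 10.0, 20.0, 10.0],
    2014: [0.0, 20.0, 40.0, 20.0],
    2015: [0.0, 30.0, 60.0, 30.0],
}
scores, average = leave_one_year_out(data)
print(scores)
print(average)
```

In this toy case the medium season is predicted exactly, while the low season receives the largest NRMSE because the same absolute error is divided by a smaller seasonal mean, mirroring the pattern reported for low-pollen years.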

3 Results

There was notable seasonal variability in the amounts of recorded pollen (Table 1). The lowest value was recorded in 2000 for all analysed pollen types. The year 2001 was the one with the highest concentrations of both Poaceae and Ambrosia pollen and was the second lowest for Betula pollen. In general, Betula and Ambrosia pollen were more abundant than Poaceae pollen in the atmosphere of Novi Sad. Above-average seasons of Betula and Poaceae pollen were better represented in the analysed 16-year dataset. The most distinctive pattern in season-to-season variability was observed for Betula pollen, where both the sums of concentrations and the average concentration showed biennial variability starting from 2006, with high values in even and low values in odd years. Time series of measured pollen concentrations revealed notable variability in the magnitude and temporal distribution of peaks, which is most pronounced in Betula (Supplementary material Figure S1), followed by Poaceae (Supplementary material Figure S2) and Ambrosia (Supplementary material Figure S3). Betula and Ambrosia pollen signals resemble a Gaussian shape, with most of the pollen recorded during about one-third of the analysed period. On the other hand, the Poaceae season was longer, with notable amounts of airborne Poaceae pollen recorded during the entire analysed period. The only exception was a distinctive peak at the very beginning of the season.

Statistically significant positive correlations (p < 0.05) between measured daily pollen concentrations and values calculated by standard calendar model were observed for all pollen types except for Betula in 2013 (Table 3). On average, the highest intensity of correlation was measured for Ambrosia airborne pollen (ρ = 0.91) followed by Poaceae (ρ = 0.78) and Betula (ρ = 0.63). In regard to predicting the magnitude of pollen concentrations, a major discrepancy for all pollen types was observed in 2000, when the overall pollen concentrations were rather low. Over the entire analysed period, the highest average NRMSE was calculated for Betula pollen (NRMSE = 3.96) followed by Ambrosia (NRMSE = 1.38) and Poaceae (NRMSE = 1.26). Using median instead of mean for producing standard calendar model did not notably influence average Spearman’s ρ, but it did lower NRMSE of Betula pollen forecast (NRMSE = 2.77). Conversely, the difference was not so pronounced for Poaceae (NRMSE = 1.21) and Ambrosia (NRMSE = 1.48) pollen forecasts.

Table 3 Performance of the standard and the best advanced calendar model when compared to measured pollen concentrations of Betula, Poaceae and Ambrosia airborne pollen in Novi Sad

Pre-processing the calibration dataset with moving mean or moving median did not significantly influence the Spearman’s ρ (Table 3). For the best advanced calendar model, the highest increase (approximately 10%) was observed when predicting Betula pollen (ρ = 0.69), while the values for Poaceae and Ambrosia remained unchanged at 0.78 and 0.91, respectively. The largest observed effect was in the reduction in NRMSE (Table 3). The 20-day MDMD model performed the best in predicting Betula pollen concentrations, while 18- and 11-day MDMN models performed best in predicting Poaceae and Ambrosia pollen concentrations, respectively (Fig. 1). In addition to the decrease in average NRMSE, advanced calendar models tended to under-represent measured concentrations in season’s peak (Supplementary material Figures S1, S2, S3).

Fig. 1 Normalised root-mean-square error (NRMSE) of the advanced calendar models calibrated on 15-year dataset in relation to number of days used for calculating moving mean or moving median: a Betula, b Poaceae and c Ambrosia airborne pollen. Values for 1 day correspond to standard mean and standard median calendar model

As the number of calibration years increases, the performance of the models improves, with respect to both Spearman’s ρ and NRMSE. The critical number of years that produces the largest improvement (about 30% comparing to the single calibration year) is four. Including additional years above this value produces less pronounced improvements in model performance, in terms of both correlation coefficient and prediction of the magnitude (Fig. 2).

Fig. 2 Performance of calendar models depending on number of years available for training (NRMSE, normalised root-mean-square error; SCC, Spearman’s correlation coefficient) in predicting average daily concentrations of Betula (a, b), Poaceae (c, d) and Ambrosia (e, f) airborne pollen. Pre-filtering was performed by using previously determined optimal window length. Optimal window size for MDMN in Betula pollen prediction was 1. This makes optimal MDMN model equivalent to median, so it was omitted in graph b

The results of the independent-samples Kruskal–Wallis test showed no statistically significant difference in average Spearman’s ρ (Fig. 3a–c) between calendar models developed for forecasting daily concentrations of airborne Betula (p = 0.614), Poaceae (p = 0.944) and Ambrosia (p = 0.829) pollen. Likewise, the improvement in NRMSE (Fig. 3d–f) when forecasting daily concentrations of airborne Betula (p = 0.353), Poaceae (p = 0.669) and Ambrosia (p = 0.759) pollen was not statistically significant.
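For illustration, the Kruskal–Wallis test for exactly three groups can be written with the standard library alone, because the chi-square survival function with 2 degrees of freedom is simply exp(−H/2). The Python sketch below assumes three models are compared (as in Fig. 3) and uses invented scores; the function name is hypothetical, and in R the test is available as `kruskal.test`.

```python
from math import exp

def kruskal_wallis_3(groups):
    """H statistic (tie-corrected) and chi-square p-value for three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch only covers three groups (2 d.f.)")
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:                      # average ranks for tied values
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        ties = j - i + 1
        tie_term += ties ** 3 - ties
        i = j + 1
    rank_sums = [sum(rank_of[value] for value in group) for group in groups]
    h = 12 / (n * (n + 1)) * sum(r ** 2 / len(g)
                                 for r, g in zip(rank_sums, groups)) - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)  # undefined if all values are identical
    return h, exp(-h / 2)             # chi-square survival function, 2 d.f.

# Invented per-year NRMSE scores for three hypothetical models.
h, p = kruskal_wallis_3([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(h, p)
```

With completely separated groups, as in this toy input, the test rejects equality of distributions; the p-values reported above are far from that situation, which is why the differences between calendar models are described as not statistically significant.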

Fig. 3 Performance of the standard and the best advanced calendar model when compared to measured pollen concentrations of Betula, Poaceae and Ambrosia airborne pollen in Novi Sad. SC is the standard calendar (mean) model, SC-10 is the mean model at 10-day temporal resolution (Spieksma and Wahl 1991), and AC is the advanced calendar model. Upper and lower bounds of boxes are the upper and lower quartiles, the line in the middle is the median, the cross stands for the mean value, and the vertical line spans between minimum and maximum values. AC*: median from 20-day moving median. AC**: median from 18-day moving mean. AC***: median from 11-day moving mean

Calendar models for Niš and Sombor yielded very similar values of NRMSE and Spearman correlation coefficient to those achieved in Novi Sad (Table 4). Similarly to the Novi Sad data, pre-processing with moving mean window proved to be optimal in terms of NRMSE, but using the mean of previous years performed better than taking the median, due to the smaller number of years in the calibration dataset. Although the optimal window sizes are not the same in three different locations, it should be noted that their order was preserved. The optimal window size for Betula was the largest, followed by Poaceae and Ambrosia, mainly because of the variations in the beginning and the end of their pollen seasons. These variations were most pronounced in Betula, followed by Poaceae, and lastly Ambrosia, with the latter having the most stable season boundaries of the three analysed pollen types.

Table 4 Performance of calendar models in Niš and Sombor

4 Discussion

4.1 Characteristics of Betula, Poaceae and Ambrosia pollen

The levels of Betula pollen measured in Novi Sad were well below the corresponding measurements in Northern Europe, but still notably higher than in the Mediterranean region (Skjøth et al. 2013). The biennial alternation of high and low seasons reflects the known biennial distribution of mast years in birch (Dahl et al. 2013 and references therein). As in many European regions, the grass pollen season in Novi Sad results from the flowering of different grass species and is therefore longer than the Betula and Ambrosia seasons. Due to the notable contribution of wheat crops in the region, the highest concentrations of Poaceae pollen were recorded in May and June. Novi Sad is situated on the Pannonian Plain, which is in the continental climatic zone (Lalic et al. 2011). Cold winters limit grass pollen occurrence to spring and summer, but at the same time, hot summers limit the amounts of pollen released, which is clearly observable in pollen indices measured in Mediterranean Europe (e.g. Spain and Greece) (Skjøth et al. 2013). The Pannonian Plain is one of the centres of European distribution of Ambrosia artemisiifolia L. (Smith et al. 2013). Therefore, Ambrosia pollen is regularly recorded in the air from July to October in amounts that exceed values recorded in other parts of Europe (Skjøth et al. 2013).

4.2 Interpolation

There is a trade-off between the benefits of interpolating missing data and the issues that must be accommodated. Interpolation of intermittent missing values can be useful, as it improves the training dataset, thus making the model more robust. However, in regions where the infrequent transportation of pollen from distant sources causes sudden peaks, interpolation will not be able to adequately capture such changes. The tests confirmed that the last-observation-carried-forward method, which repeats the latest observation without any attempt to follow the signal’s shape, is not appropriate for the interpolation of data gaps in a pollen concentration signal characterised by magnitude changes (e.g. sudden decreases during peak flowering due to rain). The second interpolation method taken into consideration was the cubic spline method. While it can achieve a smooth and accurate approximation of a given signal, its performance was poor when interpolating short gap intervals because of over-oscillatory behaviour (Surhone et al. 2010). The described tests indicated that linear interpolation was the most suitable among the examined approaches for completing short data gaps in the aerobiological dataset. Linear interpolation has been previously used for dealing with data gaps in aerobiology (Hilaire et al. 2012) and also to complete short gaps in meteorological datasets (Prank et al. 2013).

4.3 Performance of pollen calendars

Pollen calendars are commonly used in aerobiology as graphical representations of airborne pollen distribution in the study area and are used for comparing the shape of the seasons. Their ability to enable season comparison is limited by the methodology applied for their development. Pre-processing of the dataset affects the shape of the pollen signal, and thus, comparison of season intensity by using advanced calendar models is possible only if the same window length is applied. For calendars presented as 10-day means (Spieksma and Wahl 1991), comparison of season intensities is possible if the division of classes is the same. Common application of exponential division (Melgar et al. 2012) simplifies the comparisons, but using the same value for 10-day periods eliminates the variability, which is important for personal assessment of dose-response for pollen-sensitive individuals. It is argued that both the standard and advanced calendars tested in this study can be used as models for forecasting daily concentrations of airborne pollen, if developed at daily temporal resolution.

It is difficult to compare the developed model to other regional models, since observation-oriented models are developed and evaluated for a limited territory. Unless these models adequately describe the emission, dispersion and deposition processes, it is not reasonable to apply them in other regions. It is common practice within the field to use categories (e.g. Stark et al. 1997; Sanchez-Mesa et al. 2002; Makra and Matyasovszky 2011; Castellano-Méndez et al. 2005) or statistical scores (e.g. Zink et al. 2012, 2017) rather than daily concentrations to test the performance of their prediction models. Nevertheless, there is no consensus on how to evaluate model performance in the prediction of exact daily pollen concentrations. Pearson’s or Spearman’s correlation coefficients are typically used in applications where the relationship between the two signals is of chief concern. However, the performance of magnitude prediction is evaluated using a range of indicators such as the coefficient of linear regression, mean-squared error, root-mean-square error, mean absolute error and the simple statistical comparison of the averages of two signals.

On average (over the range of validation years) calendar models yielded correlation coefficients similar to those seen in other studies that used advanced statistical and machine learning methods (Inatsu et al. 2014; Ritenberga et al. 2016). Due to regional-oriented assessment of its performance (Siljamo et al. 2012) calendar and numerical dispersion models cannot be directly compared when predicting daily Betula pollen concentrations. Multi-model ensemble simulations performed slightly better with respect to correlation coefficient and magnitude prediction (Sofiev et al. 2015). For Poaceae pollen, the correlations obtained by standard and advanced calendar models were within the range obtained by observation-oriented models that used meteorology data as an input (Stach et al. 2008; Voukantsis et al. 2010; Rojo et al. 2016; Rodríguez-Rajo et al. 2010; Hilaire et al. 2012). Regarding magnitude, the calendar models have worse performance in comparison with machine learning models that include meteorological parameters as predictors (Rodríguez-Rajo et al. 2010; Voukantsis et al. 2010). For Ambrosia pollen, the calendar models yielded a high correlation coefficient which was notably higher than the value obtained by multiple linear regression (Howard and Levetin 2014). Artificial neural network observation-oriented models that included meteorology (Csépe et al. 2014) were better, in respect of both correlation coefficient and predicting the magnitude of daily pollen concentrations. However, calendar models give predictions for more than 1 day ahead which makes them advantageous. Evaluation of SILAM operational numerical prediction model for Ambrosia focused on predicting total season amounts rather than daily concentration time series (Prank et al. 2013). Tests of COSMO-ART demonstrate good performance at individual observation sites, as well as regionally, which is evaluated based on statistical scores which do not include correlation and RMSE (Zink et al. 2017).

The observed year-to-year variability in recorded amounts of all analysed pollen types (as a result of either masting in birch or different management regimes in grass and ragweed) justifies the production of separate calendar models for below- and above-average seasons, which would improve forecasting of the magnitude of pollen concentrations. This requires an algorithm which determines in advance what kind of season is expected (below or above average). Without such algorithm, the application of multiple models is limited.

It is difficult to evaluate the performance of calendar models with respect to their application for management of allergic symptoms. For the prediction of the start and the end of airborne pollen occurrence season all models performed reasonably well. The biggest errors in prediction were observed for Betula for both performance measures, with the years 2013 and 2015 being the worst in terms of Spearman’s correlation coefficient and NRMSE, respectively (Table 3). Flowering commences in spring which results in sensitivity to the extension of the preceding winter that may cause the extension of dormancy due to later achievement of forcing temperatures (Myking 1999). In order to address shifts in flowering times expected from the ongoing climate change, calendars have to be evaluated and updated on a regular basis. The biggest errors in predicting the magnitude of daily values for all models were observed at high concentrations during the most intensive part of the season. Thresholds for the onset of allergic symptoms are expected to be at much lower concentrations for the majority of sensitive individuals (de Weger et al. 2013 and references therein), and thus, wrong predictions above the thresholds are not expected to significantly affect application of calendar models in management of the allergy.

It is believed that correlation coefficients overestimate the ability of pollen calendars to predict timing and intensity of sudden peaks. The large portion of achieved correlations tends to result from the overall shape of the pollen curve, as it emulates the pre-peak increase and the post-peak decrease, rather than day-to-day variability. It is argued that the time series of pollen concentration represents strongly non-stationary and non-ergodic processes due to the existence of the start and the end of the seasons (Sofiev et al. 2015; Ritenberga et al. 2016), which implies that NRMSE and correlation can be computed only within the main season. Despite limiting analysis to coincide with atmospheric pollen measurements, the effect of using concentration measurements from different fractions of each month may influence the correlation.

The results shown in this work can be interpreted in different ways. If correlation coefficients are the accuracy measure of choice, advanced calendar models did not manage to improve the prediction. However, if NRMSE is considered, it can be concluded that the advanced models brought a notable improvement to observation-based prediction, as they lower the error by as much as 40%, as seen in birch pollen prediction. The opposing conclusions which can be drawn from this research raise the question of the applicability of various statistical measures in the domain of aerobiology. They can be used for comparison between the models, but none of them can impartially quantify the applicability of a model in aerobiological forecasting. Furthermore, their practical value for the end-users of forecasting systems is debatable. A possible solution to the problem would be to start from the end-users’ point of view and observe the effects that different pollen concentrations have on them. In this way it would be possible to determine the exact amount of error that can be tolerated. Also, positive and negative errors could be treated differently, so that the effects of under- and overestimation could be analysed separately.

Observed variability in the magnitude and timing of concentration peaks is common in aerobiology. Previous studies have shown that atmospheric movement (i.e. speed and direction) and precipitation are responsible for day-to-day variability in airborne pollen concentrations (Barnes et al. 2001; Rodriguez-Rajo et al. 2003), and therefore, taking these environmental factors into consideration is required for further improvement of model performance.

Observation-oriented models are developed and evaluated for a limited territory and based on point receptors (i.e. pollen and meteorology). Therefore, the use of these models outside the region they were calibrated for is unpredictable (Sofiev et al. 2013) and depends on the model’s ability to mechanistically describe natural processes behind records of airborne pollen. This is particularly emphasised for pollen calendars that do not take into consideration atmospheric conditions, which are known drivers of pollen emission and dispersion. For areas with strong and widespread local sources and where local transport dominates over regional and long-distance transport, pollen calendars based on diverse calibration dataset are expected to give representative forecasts even for larger geographical areas. It should be noted here that many authors highlighted the importance of previously recorded concentrations for short-term forecasting of Betula (Inatsu et al. 2014), Poaceae (Stach et al. 2008; Rodríguez-Rajo et al. 2010) and Ambrosia (Matyasovszky and Makra 2011; Csépe et al. 2014). Using auto-regression methods to push the model towards observations, standard calendar models are a good tool for setting the benchmark that will show to what extent adding meteorology among predictors increases model capability to predict processes and thus be more applicable in different regions.

5 Conclusions

In order to maximise the performance of short-term forecasts of airborne pollen concentrations, the models need to account for processes behind emission and dispersion. Although calendars do not consider these, the simple calendar model, produced as the mean of concentrations measured in the calibration dataset, describes well the shape of the pollen curve, yielding correlation coefficients comparable to those obtained from more advanced observation-oriented models. The performance of forecasting the magnitude of pollen concentrations improves if the median is used instead of the mean. Statistical analysis confirmed that, with respect to differences in Spearman’s ρ and NRMSE, all calendars are equally suitable for forecasting daily concentrations of Betula, Poaceae and Ambrosia pollen over the course of the season. However, it is argued that either the standard calendar or the advanced calendar should be used instead of calendars produced following Spieksma’s model (Spieksma and Wahl 1991), as these present day-to-day variations in pollen concentrations. Another conclusion is that pre-processing the calibration dataset or decreasing the temporal resolution causes the models to under-represent peaks and day-to-day variability, thus limiting their use in the analysis of dose-responses in allergy management. In any case, calendar models are the method of choice when meteorological data are not available and could also serve as the benchmark to assess whether the introduction of meteorology in predictions improves performance and thus contributes to the description of processes behind emission and dispersion.