1 Introduction

Globally, pollen allergies are building up, affecting worldwide approximating 10–30% of adults and 40% of children (Beggs, 2004; Bieber et al., 2016; Neumann et al., 2019; Reitsma et al., 2018; Schmidt, 2016) with allergic rhinitis as the most common disease phenotype. There is, however, a large geographical variability within and between countries (Bousquet et al., 2020). In Europe, more than 25% of the adult population and one third of the children suffer from pollinosis (Grant & Wood, 2022; WHO, 2003). Typically, in highly industrialized regions, people experience more allergic respiratory diseases due to anthropogenic air pollution (D’Amato & D’Amato, 2023; Landrigan et al., 2018). In Belgium, a densely populated country in Western Europe with substantial air pollution (Verstraeten et al., 2018), at least ~ 10% of the people develop allergic rhinitis symptoms due to birch pollen (Blomme et al., 2013). Medication and allergy tablets/shots are useful for relieving symptoms, but also preventive measures can be taken to avoid contact to the allergens and as such reducing the symptoms (Roubelat et al., 2020). These preventive actions are more efficient when patients are informed on coming pollen episodes in due time by an operational pollen forecast system. Such a forecast system for birch and grass pollen has recently been established for the Belgian territory (RMI, 2023) based on the pollen emission and transport model SILAM (System for Integrated modeLling of Atmospheric coMposition, http://silam.fmi.fi) (Sofiev et al., 2006, 2012, 2015). However, the question remains how reliable these forecasts are, or stated otherwise, how large is the uncertainty involved in pollen modelling and forecasting.

Since no single model is a true representation of bio-geophysical processes, all model-based estimates are subjected to uncertainty (Beven, 2006; Beven & Freer, 2001). Complex models, such as chemistry transport models (CTM), rely on vast amounts of spatio-temporal model inputs and have many parameters that might interact in a nonlinear way. It is not unlikely that more than one model parameter set fulfills the condition of being the best fit with reference data. Moreover, system variables might be subjected to large uncertainty due to a lack of reliable observations for evaluating the variable value, or because the parameter cannot be measured directly (McCabe et al., 2005). That is why errors arise due to the structure of the model, the applied algorithms, but also due to measurements errors in the observations used for model evaluation (Beven, 2001). For pollen monitoring, these measurement errors might have different sources and the involved uncertainty can be substantially large (see Buters et al., 2022 for an overview).

Notwithstanding that physic-based modelling might suggest that objective results are obtained, non-rigorous choices are always imminent (Beven, 2006; Beven & Freer, 2001). For example, the choice of parameter value ranges, the choice of input data to run the model, the choice of tolerance criteria applied in the model considered, the attribution of expected uncertainties on the input and parameter data. Very simplified, estimating the impact of absolute errors of the model parameters and input variables on the model output errors could be calculated straightforwardly by error propagation theory (Harvard, 2023), if all used data are independent (not correlated), or when the correlations are known. However, there is no absolute knowledge of all possible errors (i.e., the standard deviation of the quantity) in complex models, and nonlinear interactions are common. Another way of assessing uncertainty is based on a Monte Carlo (MC) approach (Kroese et al., 2011). Multiple parameter/input sets can be defined using Monte Carlo sampling to randomly extract parameter sets with pre-defined ranges, without assuming the distribution type. Subsequently, the model is run with each set of parameter values (Beven, 2006; Campling et al., 2002; Verstraeten et al., 2008). From the multiple simulations, some confidence range can be extracted, for instance by sorting model output accordingly to percentiles. The errors of input data, nonlinearity and parameter interactions are implicitly treated. We apply this MC methodology on the SILAM runs, but it can also be applied for quantifying confidence intervals for measurements of airborne pollen concentration (Addison-Smith et al., 2020).

The aim of this study is to investigate whether we can assess the uncertainty in modelling airborne birch pollen levels near the surface using the pollen transport model SILAM by repeatedly running the model 100 times while ingesting 100 randomly sampled selected model inputs such as the spatial distribution of the areal birch tree fractions (i), the spatially distributed start (ii) and the end (iii) of the birch pollen season, the temperature threshold parameter (iv) indicating the ripening of the birch catkins, and a combination of all (v) during the birch pollen season of 2018 in Belgium

2 Data and methods

2.1 SILAM

Pollen modelling with SILAM (http://silam.fmi.fi) (Sofiev et al., 2012, 2015) is based on the fact that pollen grains are biogenic aerosols with a diameter of typically 5–50 times larger than conventional atmospheric aerosols, but density approximately 20–30% of those (Sofiev et al., 2006). The model simulates the processes such as advection with wind, mixing due to turbulence, gravitational settling (dry deposition), and scavenging with precipitation (wet deposition) of pollen of various vegetation species (Kouznetsov & Sofiev, 2012; Siljamo et al., 2012; Sofiev, 2002; Sofiev et al., 2010). In the current study, SILAM is applied with a horizontal grid cell resolution of 0.1 × 0.1° and a vertical grid consisting of nine uneven stacked layers (thickness of layers from surface to the free troposphere is 25, 50, 100, 200, 400, 750, 1200, 2000, and 2000 m).

Airborne birch pollen modelling in SILAM is based on the temperature degree day approach or the thermal time flowering model (Linkosalo et al., 2010; Sofiev et al., 2006, 2012). It is assumed that the timing of birch flowering is driven by accumulated ambient temperature using a cut-off temperature of 3.5 °C (heatsum) and that the birch season starts on March 1. From a certain threshold value on, flowering starts. This is the pollen onset threshold and is quantified as the heatsum start. The end of the flowering season, or offset, is defined by a second threshold value (Linkosalo et al., 2010), or the heatsum end (or the difference between end and start). These heatsum thresholds are based on climatology. The spatial distribution (gridcell resolution is 1° × 1°) of the temperature sums for the start and end of the pollen season are illustrated in Sofiev et al. (2012). The cumulative fraction of pollen released during the main flowering season (defined by the heatsum start and end) is assumed piecewise linear and proportional to the temperature sum during the main flowering season. Transport, emission and deposition in SILAM is driven by ECMWF ERA5 reanalysis meteorological data (grid cell of 0.25° × 0.25°) (ECMWF) for the birch season of 2018 in Belgium. Out of 38 years, the 2018 birch pollen season was chosen for this error estimation study since the Seasonal Pollen Integral was quite large for Belgium (> 16,500 grains/m2 at Brussels) and the evaluation of the modelled daily birch pollen levels against observations showed good results for most monitoring sites (i.e., R2 = 0.76 for Brussels, see also Verstraeten et al., 2022). According to an assessment of the meteorology-induced year-to-year variability of seasonal pollen index, the related uncertainty is between 10 and 20% quickly growing toward the coastline (Sofiev, 2016).

The simulation of birch pollen levels in the air requires the quantification of the spatio-temporal distributions of emission sources of birch pollen in the model domain. A map with the areal fractions of birch trees is fundamental as underlying model input. At the European scale, such a map was first compiled by Sofiev et al. (2006) and refined by **Sofiev et al. (2012). This general European map was updated for Belgium using Flemish and Walloon forest inventory data on a spatial resolution of 0.1° × 0.1° (Verstraeten et al., 2019) and used in combination with NDVI data (Normalized Difference Vegetation Index) in a Random Forest approach to reconstruct seasonal birch pollen emission maps for the period 1982–2019 (Verstraeten et al., 2022).

2.1.1 Assessing the model uncertainty using a Monte Carlo error approach

Since SILAM is a complex CTM with spatially distributed model inputs and parameters, with potential nonlinear interactions and without proper knowledge on possible errors of all model variables, we aim at assessing uncertainty based on a MC sampling approach (Verstraeten et al., 2008). From previous research (e.g., Verstraeten et al., 2019, 2022, 2023), and mentioned before, we know that the most important pollen related model inputs are the spatial distribution of birch pollen emission sources (areal birch tree fraction map) (i), the start (ii) and end (iii) of the birch pollen season (using the heatsums). Furthered, we also examine the impact of varying cut-off temperature of 3.5 °C on airborne birch pollen levels (iv), and the combined effect. For each model input/parameter we realize 100 randomized samples from a uniform distribution (all code written in IDL8.2). The next error scenarios are considered:

Scenario 1: For the birch pollen emission sources, we divide the native 0.1° × 0.1° reference areal birch fraction map of Belgium (Fig. 1, top left panel) in 1° × 1° blocks. Hence, for each block, we have 100 gridcells (10 × 10 gridcells of 0.1° × 0.1°). Per block, we randomly allocate the 100 gridcells with birch tree fractions generating 100 different birch pollen emission source maps with respect to the sum of all trees per region. Consequently, the amount of birch trees and pollen emitted in a large 1° × 1° block is not altered, only the location of the trees (Fig. 1, reference map and four illustrations of randomized maps). The 100 different birch pollen emission source maps are summarized in Fig. 2 by its 1, 5, 25, 33, 50, 67, 75, 95, 99 percentiles. Next, we run SILAM 100 times for the birch pollen season of 2018 using these 100 different emission maps, while fixing all other inputs, generating 100 different spatio-temporal airborne birch pollen levels near the surface. The spatial distribution of the resulting pollen levels for April 15th 2018 is illustrated in Fig. 3 and ranked based on the 1, 5, 25, 33, 50, 67, 75, 95, 99 percentiles. In this scenario, we essentially analyze the effect of a random distribution of birch pollen emissions sources on the pollen season uncertainty addressing the question on how important the right location of the birch trees is.

Fig. 1
figure 1

Spatial distributions of areal fractions (percentage) of birch trees in Belgium. On the left, the applied reference map (taken from Verstraeten et al., 2022). From this map, 100 emission maps, input in SILAM, are derived using a randomized uniform distribution of areal birch fractions based on 1° × 1° blocks with each 10 × 10 gridcells. The first four samples from the randomized areal birch fractions from the 1° × 1° blocks are shown on the right

Fig. 2
figure 2

Summary of the spatial distributions of areal fractions (percentage) of birch trees in Belgium input in SILAM represented by the 1, 5, 25, 33, 50, 67, 75, 95 and 99 percentiles of the areal birch fractions for the 1° × 1° blocks of 10 by 10 gridcells each

Fig. 3
figure 3

Spatial distributions of the airborne birch pollen levels near the surface for April 15th 2018 arranged according the 1, 5, 25, 33, 50, 67, 75, 95 and 99 percentiles based on the 100 simulations with the SILAM model. SILAM uses 100 randomized areal birch fraction maps produced from a randomized sample taken from a uniform distribution for each 1° × 1° block (see Fig. 1)

Scenario 2: The native gridcell of the spatial distribution of the heatsum is 1° × 1°. Therefore, we first regrid the heatsum on a 0.1° × 0.1° gridcell prior to defining the same 1° × 1° blocks as for the emissions (the native 1° × 1° grid of the heatsum is different from the 1° × 1° blocks with emissions). Similar to the randomized emission case, we then have 100 gridcells per block and we randomly allocate the 100 gridcells with the heatsum start for each block generating 100 different maps with the heatsum start. With all other inputs fixed, we then run SILAM 100 times using these 100 different heatsum start maps. Based on this scenario, we can evaluate the sensitivity of SILAM for changing heatsum start values.

Scenario 3: For the heatsum end, also known as the heatsum diff, we apply the same procedure and run SILAM for 100 times while the other inputs are kept constant (fixed).

Scenario 4: The 100 different emissions maps, the 100 different heatsum start and the 100 different heatsum end are all used to run SILAM 100 times, resulting in the combined effect of these three inputs on the airborne birch pollen levels near the surface. In this scenario, the propagated effect of uncertainty from the basic input maps on modelling birch pollen is quantified.

Scenario 5: We produce 100 randomized cut-off temperatures (original value 3.5 °C) between 3.0 and 4.0 °C. We combine these with the 100 different input maps of Scenario 4, to have a combined effect of all considered model inputs related to the pollen transport module of SILAM. In this scenario, we intrinsically analyze the effect of potential errors of the ECMWF air temperatures on the birch catkins ripening.

We have performed 100 SILAM runs for each of the five scenarios. For each scenario and for each day of the 2018 birch pollen season, we obtain 100 output maps with the spatially distributed airborne birch pollen levels near the surface. For each day and for each gridcell, we compute the relative Coefficient of Variation (CV%) which is the standard deviation divided by the mean value of the day. In Fig. 4, we illustrate for 1 day—April 15th 2018—the 100 different pollen level outcomes from the 100 SILAM runs, including the mean value, the standard deviation and the CV%. The computed cumulative distribution function for that day shows a linear increase which is an indication that the 100 output values follow a uniform distribution. Thus, the mean and standard deviation for computing the CV% are taken from a uniform distribution. The spatial distribution of the uncertainty related to modelling airborne birch pollen levels for the 2018 season is then based on the mean CV% of the time series for each gridcell (when the daily mean is at least one birch pollen grain). This is done for each scenario.

Fig. 4
figure 4

One day out of the 2018 time series (April 15th 2018) of modelled daily birch pollen levels. 100 simulations resulted into 100 outputs of birch pollen levels. The minimum and maximum are shown. In blue, the cumulative distribution is given. The linear increase is an indication of a uniform distribution of the 100 values. The corresponding mean, standard deviation and CV% values are given

For five locations in Belgium that coincide with operational pollen monitoring stations using a Hirst-type 7-day volumetric spore sampler (Burkard Manufacturing Co., Rickmansworth, UK), namely in Brussels (50.825° N/4.383° E), Genk (50.965° N/5.495° E), De Haan (51.274° N/3.022° E), Tournai (50.614° N/3.387° E), and Marche-en-Famenne (also written here as M-F, 50.200° N/5.312° E), we have extracted the time series including the mean, maximum and minimum of the 100 outputs as an indication of the expected variation induced by the uncertainty of the different model inputs (five scenarios). This is presented for Brussels in Fig. 5. We also show the time series of percentile 5 and 95 as illustration.

Fig. 5
figure 5

The 2018 time series of modelled airborne birch pollen levels near the surface for Brussels based on 100 simulations from SILAM. The black line is the mean time series, while the gray zone represents the total range of the simulations (minimum, maximum). From top till bottom panels: time series based on 100 SILAM runs only using 100 different birch pollen emission maps (Scenario 1, see also Fig. 1); 100 runs using only 100 different heatsum maps indicating the start of the birch pollen season (Scenario 2); 100 runs using only 100 different heatsum maps indicating the end (Diff) of the birch pollen season (Scenario 3); 100 runs combining 100 different emission maps, 100 different heatsum maps for start and 100 heatsum maps for the end (Diff) (Scenario 4); 100 runs combining 100 different emission maps, 100 different heatsum maps for start, 100 heatsum maps indicating the end (Diff), and 100 different threshold temperatures for heat accumulation (Scenario 5). The percentiles 5 and 95 are illustrated

To summarize all the steps in an overview, a flowchart of the applied procedure to assess the uncertainty in airborne birch pollen modelling is given in Fig. 6.

Fig. 6
figure 6

Flowchart with the applied steps to obtain the uncertainty in birch pollen modelling. Example for Scenario 1 only, but similar for the others. Start with the reference areal birch fraction map with a resolution of 0.1° × 0.1°. This map is divided into 1° by 1° blocks. Each block contains 100 cells. 100 random samples are taken from a uniform distribution in each block which results in 100 different areal birch fraction input maps. For every input map SILAM is run returning 100 output maps of airborne birch pollen levels near the surface for each day of the pollen season. From the 100 runs the mean CV% is computed for each gridcell, and also time series can be extracted. For the other scenarios, the reference input is different (heatsum start, heatsum end, combinations and also the threshold temperature, or a combination)

3 Results

3.1 Uncertainty assessment of modelled airborne birch pollen time series in Belgium

The 100-fold time series of airborne birch pollen levels for the location of Brussels are shown in Fig. 5 for the five different error scenarios based on the daily range and mean in pollen levels. When addressing the question on the importance of the right location and amount of birch trees (Scenario 1) (Fig. 5, top panel), we observe that the uncertainty is on average ~ 14% for Brussels (see also Table 1), peaking to 25%, and less than ~ 10% at the start and the end of the birch pollen season. The seasonal behavior of this scenario is as expected with one large pollen peak at the start of the birch pollen season. For the other locations, the average uncertainty is slightly higher, especially for Genk in the North-East of Belgium (Table 1).

Table 1 The relative Coefficient of Variation on airborne birch pollen level modelling for each Scenario and each station location based on 100 simulations of SILAM with Monte Carlo generated model input. The standard deviation and mean are computed for a uniform distribution. M-F is Marche-en-Famenne

The impact of the heatsums (Scenario 2 and Scenario 3), however, is much larger than for the exact locations of the emission sources (Brussels, ~ 21%, and much more for Marche-en-Famenne, Tournai and Genk). This is an indication that the exact location of the heatsums is much more important than those of the emissions sources. From the time series in Fig. 5, it can be observed that the variations in the heatsum start affect the birch pollen peaks in a major way. Using a less optimal value (i.e., for another location) might even shift the large peak at the start of the season toward two somewhat smaller peaks, one at the start and another one near the end of the season. This changes the seasonal behavior of the airborne birch pollen distribution in a significant way. The peak of April 9 in Scenario 1 (which is a typical behavior for Brussels) is reproduced in Scenario 2, but another peak of the same order of magnitude appears on April 19th, which is 2.3 times larger compared with the smaller peak in Scenario 1 for the same day. The impact of uncertainty in the heatsum end (Diff, Scenario 3) is in general slightly higher at the different locations compared with Scenario 2, expect for Genk, where the error is much larger (33.6% instead of 28%). Scenario 3 shows similar seasonal behavior as Scenario 1, but for Brussels, the mean peak (3232 grains/m3) and the range (2240–4224) at the start of the season are much larger than for Scenario 1 (2160, 1600–2720 grains/m3, respectively). Smaller heatsums for the end will shorten the pollen season. Since the same amount of pollen must be emitted by the model during a smaller time interval, it results in a higher concentration of airborne pollen.

The combined effect of varying emission sources, heatsum start and heatsum end (Scenario 4) is on average more than 30% for Brussels, and 40% for Genk. This can be attributed to the large impacts of the heatsums (Scenario 2 and Scenario 3). Compared to the other scenarios the peak at the start of the season is smaller, but the range of the simulations is larger. At the end of the pollen season, the range in the simulations is higher.

Finally, in Scenario 5, also, the impact of variations in threshold temperatures on the catkin ripening is added to the other error sources showing uncertainty of ~ 37% for Brussels, ~ 35% for De Haan, and more than 42% for Marche-en-Famenne and Genk. Variations in the cut-off temperature of ± 0.5 °C affect the pollen simulation. Based on simple backward error propagation computations, it is estimated that uncertainty of this temperature accounts for ~ 14–18% error in simulating airborne birch pollen levels, which is comparable with the CV% for Scenario 1, (changing emissions only). Note that the threshold temperature is fixed in space. Each cut-off temperature value out of the 100 samples is applied for the entire simulation domain.

3.2 The spatial distribution of the uncertainty related to modelled airborne birch pollen in Belgium

By considering the integrated CV% over time (as illustrated for the five station locations in Table 1) for each gridcell, we obtain the spatial distribution of uncertainty in modelling airborne birch pollen in Belgium as illustrated in Fig. 7 for the five error scenarios. Concerning the impact of birch pollen emissions sources on the model uncertainty (Scenario 1), the spatial variation of the average error in Belgium is rather small, around ~ 15% for large parts of Belgium, while it doubles (> 30%) in eastern and southern parts of Belgium especially at locations with high birch tree fraction (> 35%) (see also the reference map in the left panel of Fig. 1).

Fig. 7
figure 7

The spatial distribution of the relative Coefficient of Variation when modelling airborne birch pollen levels near the surface for the 2018 season derived from 100 SILAM runs for different input scenarios (only emissions, only heatsum start, only heatsum end (diff), the combination of all three, and finally the combination with varying threshold temperatures)

When the heatsums are considered, the spatial variation of the uncertainty increases by at least 5%. Similarly to Scenario 1, the higher values coincide to the locations with substantial birch trees as shown in the left panel of Fig. 1. This is trivial, since locations with more pollen emission sources will be more affected by errors in heatsum values than locations with only few birch trees. Some locations westwards and eastwards from Brussels have lower CV% values in Scenario 2 and Scenario 3 than in Scenario 1, probably thanks to spatially more homogenous heatsum values, and the more heterogeneous areal birch fractions due to the large urbanized region of Brussels.

By combining the three major error sources in birch pollen modelling, an increase in uncertainty occurs from West to East and from North to South by adding up all the uncertainty originating from the three major pollen inputs. The largest CV% values up to 50%. are observed in the South-East. When also the cut-off temperature is considered as a source of error, large parts in Belgium show CV% values of more than 40% with higher values in the South-East. Generally, 3 to 5% in CV% were added from Scenario 4 to Scenario 5.

4 Discussion

Overall, the largest CV% errors on modelled birch pollen levels near the surface can be found on locations with relatively high amounts of birch trees (Fig. 7) due to the higher presence of birch pollen emissions sources. This could be expected since in areas poorly populated by birch trees, the model does not compute significant pollen emissions that could introduce such variability. Additionally, at places in Belgium with higher contrasts between low and high amounts of birch trees, the overall CV% error is the largest (i.e., eastwards, near the Netherlands). This can be explained since misplacing birch trees in a rather homogenous area has fewer impact on the right emission source than in a more heterogeneous area or in an area where the contrast between high and low amounts of birch trees is more pronounced. This can be clearly observed when comparing the reference birch tree distribution map in Fig. 1 (left panel) with Fig. 7 (maps of Scenarios 1, 2, 3). Putting the right amount of birch trees on the right spot is important, but it is not the largest source of error in modelling the airborne birch pollen levels. The error contribution of the right heatsum start is somewhat larger than for the emission sources (Fig. 7, maps of Scenario 1 and Scenario 2). And again as expected, on locations with more birch trees, the error increases. The impact of the end of the season (heatsum end/diff) is of the same order of magnitude (Fig. 7, Scenario 3), but it is more pronounced in the southern parts of Belgium on the borders with France and Luxemburg. This might be caused by the relatively coarse spatial resolution of the heatsum maps (1° × 1°) and the West–East gradient in heatsum values (larger for the heatsum end than for the start, see also maps in Sofiev et al., 2012). Although the heatsum end produces more uncertain results on the pollen modelling than the heatsum start (the average CV% is slightly higher, Fig. 7, maps of Scenario 2 and Scenario 3), the choice of the latter affects the seasonal dynamics of the daily airborne birch pollen levels in a substantial way as is illustrated in the temporal patterns in Fig. 5, panel 2. The wrong choice might change a birch pollen season with a large peak at the start of the season (for instance at Brussels) toward a pollen season with two peaks (one extra at the end of the season, i.e., at Genk) or a shift to a higher peak toward the end of the season which is more typical for the location at Marche-en-Famenne, but also at the coast (De Haan) (Verstraeten et al., 2019, 2022). So, it seems that the sensitivity of this parameter on the model outcome is substantial. Combining all the sources of uncertainty in Scenario 4 (Fig. 7), the CV% increases from West to East and especially in the southeastern part of Belgium. The uncertainty on the cut-off temperature (due to uncertainty on the threshold or/and on the temperature data), a fixed value, is in the same order of magnitude as for emission sources, between 15 and 20%. Again, it is not very surprising that it is increasing the total error on locations with birch trees (Fig. 7, Scenario 5). This means that finding the right amount of birch trees on the right place with the right ripening of catkins is subject to a CV% error of 30%. Due to the fact that most error sources are connected (the sources of error or not independent), in the pollen module inputs, the overall CV% is in general below 50%.

The only standard method for monitoring airborne pollen today relies on Hirst-type traps (Galán et al., 2014; Hirst, 1952). Notwithstanding it is, like all observation systems, sensitive to measurement errors (Oteros et al., 2017). Despite the widespread use, a number of drawbacks of the Hirst method exist as listed by Buters et al. (2022), including a bias and/or error of approximately 30% in the air flowrate (i); a critical distance issue between the inlet and the deposition surface (ii); the risks of pollen position shift or elimination during sample preparation for the microscope (Galán et al., 2014) (iii); different pollen stick depending on the temperature and glue type (Maya-Manzano et al., 2016; Rojo et al., 2019) (iv); the sampled area of the slide; errors of more than 30% are reported (Adamov et al., 2021; Smith et al., 2018) (v); oversaturation of the adhesive surface and building up of several layers of pollen grains during very high pollen peaks (vi); the human error associated with pollen identification and counting (around 30%, Smith et al., 2018) (vii); meteorological conditions that affect the sampling efficiency (wind, physical properties of the particle, Frenz, 2000; West & Kimber, 2015) (viii). As mentioned in Buters et al. (2022), tests showed that a general measurement error of between 76 and 98% for daily average values can be expected.

Based on the upper and lower confidence interval reported by Addison-Smith et al. (2020), a 19–42% error can be derived when counting grass pollen grains from different transects collected from a Hirst type of pollen monitoring device. Errors in pollen counts using standard sample sizes (20% of the target surface) may range from 7 to 55% of the mean value (Gottardini, et al., 2009). The derived CV% from these datasets when considering pollen count means of at least one per m3 is ~ 50%. What is more, the recorded taxa range between 46 and 78% of identified species, thus the error of missing taxa is between 54 and 22%. Apart from the counting error also an identification error must be considered. In earlier research, Comtois et al. (1999) reported that only for concentrations of more than 500 grains/m3; the mean error could remain below 30% based on standard aerobiology counting protocols. More recently, the uncertainty of the Hirst-trap-based observations was reviewed and its error estimate substantially increased, pointing out to a systematic error of ~ 34% (Rojo et al., 2019). The observed daily variability could be explained by random variations caused by instrumental error, reading error, but could also be influenced again by meteorological conditions as mentioned before.

In this study, if all uncertainty in crucial model inputs related to the pollen module of SILAM are concerned, the CV% is up to 43% for the selected locations (Table 1), and higher when the spatial values of the CV% are considered (Fig. 7, Scenario 5). This is without taking into account the potential errors and/or biases of the meteorological data (except for the cut-off temperature, Scenario 5). Assuming an accumulated error of 20–40% from all meteorological data and following simple error propagation theory, then, the CV% increases up to ~ 48–59% (computed as sqrt(472 + 202) and sqrt(472 + 402)). These numbers have the same order of magnitude as the observation errors from the Hirst-type data as mentioned before. As a consequence, when comparing airborne birch pollen levels from a model with observations, a large overlapping interval can be expected based on the almost equally large CV%. It is also possible to derive relative confidence intervals when a certain statistical distribution is assumed based on the reported CV%. This can be done by dividing the CV% value with the square root of 100 (the amount of simulation) and then by multiplying with the appropriate z-score which depends on the statistical distribution and the intended confidence interval. For example, a CV% of 20% for a 2.5–97.5% confidence interval (2-tails) from a Student t distribution will result into a relative interval of 100 ± 1.984 × 20/10 ~ 100 ± 4% or 96–104%. Apart from the CV%, another popular descriptive statistics is the interquartile range (IQR) which is a measure of statistical dispersion or spread of the data without explicitly assuming a statistical distribution. It is the difference between the 75th and 25th percentiles of the data. This statistical approach was previously used in assessing the error for pollen observations (Adamov et al., 2021).

When estimating the relative error in modelling or monitoring airborne pollen grains, most reported error values are only considered when the mean daily pollen concentrations are large enough (i.e., 10 or more daily birch pollen grains /m3). Otherwise, since this mean value occurs in the denominator for computing the relative error, very small daily levels will cause an exponential increase in error. This might have consequences for highly sensitized rhinitis patients, for who the exposure to a few birch pollen grains in the air might already cause allergy symptoms. If these patients want to use birch pollen forecasting tools for organizing their activities and medical doses, then, they would face large uncertainty in the expected birch pollen levels near the surface. Only for days with high birch pollen concentrations (100 and more, Pfaar et al., 2017), the model uncertainty might be reasonable.

5 Conclusions and recommendations

The uncertainty in modelling airborne birch pollen levels near the surface was quantified by executing SILAM for the birch pollen season of 2018 in Belgium based on a Monte Carlo error approach and the relative Coefficient of Variation (CV%) as descriptive statistics. This implies that SILAM was run repeatedly using different sets of 100 randomly sampled selected model inputs (including emissions sources, heatsum start/end and the cut-off temperature for catkins ripening).

The impact of the spatial uncertainty of pollen emissions sources on modelled birch pollen levels in Belgium ranged between ~ 15 and ~ 35%. Considering also the effect of the uncertainty in heatsum start and end, CV% values up to 50% are found in the southeastern parts of Belgium. Taking into account the uncertainty in the cut-off temperature as error source, then, the CV% values increased with in general another 5%. Additionally, by assuming an accumulated error of 20–40% from all meteorological data, a CV% value near 60% can be expected. These error values in modelled pollen levels are in the same order of magnitude than the reported errors in monitored pollen levels, based on the reference Hirst method.

How can we reduce the uncertainty in modelling birch pollen levels near the surface? One approach would be an improved inventory of the areal fractions of birch trees at more detailed spatial scales. More and more cities provide detailed inventories of their vegetation even by identifying individual trees (Ma et al., 2023). Moreover, applying the increasing availability of remotely sensed data into Machine learning and artificial intelligence environments could provide the community with extra data for running CTM’s (Lumnitz et al., 2021). Even after aggregating the data back to the 0.1° × 0.1° gridcell, it would reduce the uncertainty of the emission sources without introducing unforeseen issues with atmospheric processes that might be scale dependent. After all, a random resampling of emissions sources based on fine spatial resolutions in a MC error approach would produce more similar values, if the area is homogenous enough.

The spatial variation of the accumulated temperature (heatsum) that determines the start of the pollen season for different climatology at much finer spatial scales would definitely reduce the uncertainty in birch pollen modelling. The current maps are very coarse (1° × 1°), so more data on micro-climatic conditions would improve the pollen modelling. Especially, if they are combined with information on soil type, nutrient and water availability. Some of these datasets already exists on detailed scales, but efforts on using the data in an integrated way should be promoted. This information can also help to improve the end of the season estimate that suffers from the same spatial resolution issue.

Finally, the implemented meteorological datasets in SILAM used in this study have a native grid of ~ 0.25°. Reducing the uncertainty and impact of the cut-off temperature on pollen level modelling can be expected when the atmospheric temperature near the surface and other meteorological parameters is available with more details. New rich observational and forecasted datasets will become available thanks to collaboration efforts in large projects. Moreover, new instruments will become available for monitoring airborne pollen in a more automated way (Buters et al., 2022, among others), new procedures for optimized monitoring locations are developed (Sofiev et al., 2023), and the forcing of new more automatically collected measurement data into pollen transport models (Adamov & Pauling, 2023) all can help to reduce the uncertainty in pollen forecasting.