1 Introduction

In the context of emergency response to the accidental or unexpected release of harmful substances into the atmosphere, numerical air-pollution models represent a powerful tool for predicting the dispersion of a pollutant, its concentration, and the possibly affected areas. Hence, it is fundamental to assess and evaluate the performance of dispersion models in conditions of unexpected and discontinuous emissions when the input and atmospheric boundary conditions and source characteristics are often poorly known, or even unavailable. The situation is even more complex at urban sites due to the interaction of the flow, and of the plume or pollutant cloud, with the buildings. In addition, model performance is a critical issue in densely populated areas due to the potentially high impact of pollutant releases. An extensive research programme for the evaluation of dispersion models in urban environments and for emergency-response applications was carried out within the COST (European Cooperation in Science and Technology) programme Action ES1006 “Evaluation, improvement and guidance for the use of local-scale emergency prediction and response tools for airborne hazards in built environments” (COST ES1006 2012), where both continuous and puff releases were considered in real field campaigns and wind-tunnel experiments (COST ES1006 2015a). Based on a statistical analysis, the robustness of the models was confirmed even in simulating short and transient events. More advanced models were found to provide improved performance, even if the accuracy of the results was not always acceptable. While sensitivity analyses showed that the availability of appropriate meteorological data used for input to the models plays a fundamental role in obtaining reliable results, the models proved to be robust enough even when dealing with poor input information, as is often the case in accidental releases.
It was recommended to consider a balance between the model performance and reliability and the run-time effort when choosing the modelling approach. In fact, while a rapid response is necessary, a fast but inaccurate model result can compromise the effectiveness of a response action. It was determined that different modelling approaches should be applied in the different phases of the response process, including the preparatory, emergency and post-analysis phases. The pertinence and applicability of the standard evaluation protocols and statistics for the evaluation of the model performances, in the emergency-response context and urban areas, was investigated and recommendations were provided in the Model Evaluation Protocol (COST ES1006 2015b). It was emphasized that, in addition to the standard statistical metrics, an important indicator of a model’s “fitness for purpose” is the correct prediction of the spatial and temporal extension of risk zones or affected areas. The Best Practice Guidelines (COST ES1006 2015c) integrated the results obtained and the analysis performed, tailoring them to the needs of the emergency responders, decision makers and stakeholders. A summary of the COST ES1006 Action outcomes and a discussion on the remaining open issues have been addressed in Trini Castelli et al. (2016).

In the UDINEE (Urban Dispersion INternational Evaluation Exercise) Project (Hernández-Ceballos et al. 2016), an intercomparison of different modelling approaches was performed for the Joint Urban 2003 (JU2003, Allwine and Flaherty 2006) field campaign in Oklahoma City, U.S.A., and the results were discussed in Hernández-Ceballos et al. (2018a, b). The present work has been conducted in the framework of the UDINEE Project.

In addition to the possible scarcity of available information, such as high-quality observed meteorological data, a major issue for the appropriate simulation of pollutant dispersion is the intrinsic variability and uncertainty of the atmospheric variables used as inputs to initialize the models. It is important to have an estimate of such uncertainty, since it may lead to a significant bias in the simulation of the plume dispersion (Schatzmann and Leitl 2011) and it also affects the evaluation of the model performance (Anfossi and Trini Castelli 2014).

The rationale of the work presented here is to investigate the influence of the choice of the meteorological data supplied as input to the model on the predicted concentration distribution and on its representativeness of the observed pattern. The main input variables needed for the air-pollution model are the wind speed and direction, which determine the transport of the pollutant. In addition, an estimation of the turbulent variables governing the diffusion of the plume and the entrainment of ambient air is fundamental. Since in real scenarios, and especially in emergency conditions, often only single values of wind speed and direction are available as initial and basic information for the model simulations, we focus on the sensitivity of the simulation results to the wind speed and direction used as input, and assess the differences in the model results related to the possible variability of the chosen input quantities. For this purpose, we considered two different wind velocities (wind speed and direction) of similar suitability as initial conditions for the model simulation; both are available from the datasets of the JU2003 experiment for application to the UDINEE Project. We also quantify the uncertainty linked to the meteorological input variables through a statistical analysis. Simulations were performed with a modelling system combining a diagnostic mass-consistent flow model and a stochastic Lagrangian particle dispersion model.

Section 2 describes the experiments selected among the 10 intensive operating periods (IOP) of the JU2003 campaign, Sect. 3 presents the modelling system used, and Sect. 4 evaluates the effect of the two suitable inputs on the simulation results with reference to the observations. Conclusions are presented in Sect. 5.

2 The Selected Experiments

Two alternative input datasets are considered for establishing the flow field driving the puff dispersion simulations. In one case (hereafter denoted the MW simulation), we use the wind speed and direction measured at the PWIDS15 (Portable Weather Information Display System) sensor located 10 m above the roof of the city post office and 1 km upwind of the central business district, considered as representative of the incoming flow in the UDINEE intercomparison exercise. Wind speed and direction are supplied as a 10-s averaged time series. In the other case (denoted the AW simulation), we use the wind speed and direction calculated as the overall averages over 80 fixed anemometers placed at street level, on building roofs, as well as in the suburbs and nearby airports, and over each IOP of 8-h duration (Zhou and Hanna 2007, Hanna et al. 2007). Thus, in the AW simulations, constant values are assigned for the wind speed and direction for the full IOP period and all its releases. The two input datasets are examples of two different situations. The AW simulation represents a typical case where minimum information is available: constant values for wind speed and direction, yet broadly representative of the wind field in the area of interest. The MW simulation corresponds to a situation where more sophisticated information is available: time-varying observations from a measurement station, which is suitable as a reference for the incoming flow over the complex urban area because of its location.

We examine the differences between the two wind-velocity configurations for the MW and AW simulations for each IOP, and have selected four based on whether the two wind directions are similar or different. In this way, we have the option to evaluate and quantify the effect of relatively small or large differences between the two alternative flow input data on the puff dispersion, concentration field, and affected area. Based on these criteria and on the quality of the observed data, we selected IOP 3, 5, 7 and 8; see Table 1 for their main details.

Table 1 Characteristics of IOP 3, 5, 7 and 8 (time of each release and total emitted mass of SF6), the wind speed and direction used as input for the AW simulations, and the time-averaged wind speed and direction for the input data of the MW simulations

Figures 1, 2, 3 and 4 illustrate IOP 3, 5, 7 and 8, respectively, including the two velocity vectors of size proportional to the wind speed, as well as time series of the wind speed and direction used to initialize both the MW and AW simulations; a time average was calculated based on the entire IOP period for the MW case (see Table 1). Also indicated (numbered red circles) are the positions of the concentration samplers deployed during the field campaign. The 10-s averaged wind speed and direction supplied to modellers are further averaged at 5-min intervals for use in the MW simulations, and these data are presented in Figures 1, 2, 3 and 4, which also highlight IOP3 and IOP7 having similar wind directions between the MW and AW simulations. Despite the presence of discrepancies at certain times, the typical rotation between the wind directions of the MW and AW simulations is of the order of 10° with a maximum of about 20°. In contrast, IOP5 and IOP8 have more evident differences, showing a rotation of wind directions between the MW and AW simulations ranging up to 80–90° in some time frames. Each IOP took place in summer, and yielded four independent puff releases of sulphur hexafluoride (SF6) close to the ground and emitted every 20 min, as indicated in Table 1. The morning IOP3 began at 0800 CST (Central Standard Time) for a wind speed typically above 3 m s−1, but higher on average for the MW than for the AW simulation, with the wind direction often coinciding between the two, and directed towards the eastern sector of the concentration samplers. The afternoon IOP5 began at 1400 CST with the first of the four emissions, with both wind direction and speed different when comparing the MW and AW cases, directed towards samplers to the west (east) for the MW (AW) simulation. 
In addition, the wind speeds for the MW simulation are systematically higher than those for the AW simulation by more than 1 m s−1, occasionally by more than 2 m s−1, causing not only different impact zones, but also a different behaviour of the local transport and dispersion. The early morning IOP7 began at 0400 CST with a relatively low wind speed < 3 m s−1, which is systematically smaller for the MW simulation than for the AW simulation, with the wind directions showing a substantial consistency for the two cases, directed towards the western side of the area covered by the samplers. The second early morning IOP8 also started at 0400 CST, but, in this case, the wind directions for the MW and AW simulations show significant differences, typically pointing towards the western (eastern) border of the sampler distribution for the MW (AW) case. The wind speeds are different, but, in this case, the wind speeds in the AW simulation are almost twice as high as those in the MW case. The chosen experiments document a relatively large spectrum of dispersion conditions resulting from the different features of the two sets of wind speeds considered. As we consider cases having both a consistent wind speed and direction, consistent wind speed and inconsistent direction, and both inconsistent wind speed and direction, the scenarios represent typical situations one must consider during a possible accidental emission event, which must be reconstructed within a reasonable response time based on the data actually available.
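The reduction of the 10-s wind records to 5-min values can be sketched as follows. The text does not state how the wind direction was averaged, so this sketch assumes a vector (component) average, which avoids the wrap-around problem at 0°/360°. The function name and the block length of 30 samples are illustrative assumptions, not part of the MSS pre-processor.

```python
import math

def block_vector_average(speeds, directions_deg, samples_per_block=30):
    """Average wind speed and direction over consecutive blocks.

    speeds are in m/s; directions are meteorological degrees (the direction
    the wind blows FROM). With 10-s records, 30 samples form one 5-min block
    (assumed layout). A trailing incomplete block is dropped.
    Returns a list of (vector_mean_speed, mean_direction_deg) tuples.
    """
    out = []
    last_start = len(speeds) - samples_per_block
    for start in range(0, last_start + 1, samples_per_block):
        block = range(start, start + samples_per_block)
        # Eastward (u) and northward (v) components of each record.
        u = [-speeds[i] * math.sin(math.radians(directions_deg[i])) for i in block]
        v = [-speeds[i] * math.cos(math.radians(directions_deg[i])) for i in block]
        u_mean = sum(u) / samples_per_block
        v_mean = sum(v) / samples_per_block
        speed = math.hypot(u_mean, v_mean)
        direction = math.degrees(math.atan2(-u_mean, -v_mean)) % 360.0
        out.append((speed, direction))
    return out
```

Note that the vector-mean speed is lower than the scalar-mean speed whenever the direction fluctuates within a block, so the choice of averaging method is itself a source of input variability.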

Fig. 1

The IOP3 configuration (left) and time series of wind speed and direction (right) measured at the PWIDS15 anemometer (MW case, blue) and averaged over all anemometers (AW case, red). The buildings are shown in grey (left), the arrows indicate the mean wind speed and direction for the MW (blue) and for AW (red) simulations, with the size of the arrows proportional to the wind speed

Fig. 2

As in Fig. 1, but for IOP5. Note that the wind-direction scale is not the same as in Fig. 1

Fig. 3

As in Fig. 1, but for IOP7. Note that the wind-direction scale is the same as in Fig. 1, but differs from that in Fig. 2

Fig. 4

As in Fig. 1, but for IOP8. Note that the wind-direction scale differs from that in the previous figures

3 The Model and the Numerical Simulations

Numerical simulations are performed with the Micro-Swift-Spray (MSS hereafter) modelling system (Tinarelli et al. 2007) composed of the diagnostic mass-consistent flow model MicroSwift and the Lagrangian stochastic particle model MicroSpray (Tinarelli et al. 1994, 2012) for modelling the dispersion. The MicroSwift model provides the required mean and turbulent flow fields based on local measurements or on the outputs of atmospheric models, and is an obstacle-resolving model accounting for the influence of the buildings on the flow, turbulence and dispersion.

In principle, Lagrangian particle modelling is a suitable approach for describing instantaneous puff releases because of its ability to describe in detail intermittent or generally non-stationary emissions. However, the Lagrangian particle model is an ensemble-average model and, as such, has limitations in resolving transient physical processes at the smallest time scales. In particular, when dealing with puff releases, as also verified in wind-tunnel experiments (Berbekar et al. 2015), releases in exactly the same controlled conditions take different paths due to the intrinsic stochastic nature of turbulence, particularly atmospheric turbulence. Ensemble-averaged models are unable to reproduce this intrinsic variability and the atmospheric fluctuations, and cannot resolve the entire spectrum of turbulence. While an experiment provides a single realization of the dispersion process, the output of an ensemble-average model corresponds to the average of an ensemble of realizations.

For Lagrangian particle models, even releasing a large number of particles does not resolve the effective intrinsic variability of the motion due to atmospheric turbulence; the simulation results are more refined from a statistical viewpoint, but still represent an ensemble average. A possible solution to better depict the real atmospheric variability would be to force the Lagrangian model with variable meteorological and turbulence fields at high resolution to explicitly resolve the larger turbulence scales and the resulting fluctuations. The time resolution of the driving fields should be of the order of the timesteps required for evaluating the concentrations, so that ensemble averages at small time intervals are obtained. Yet this approach (if physically sensible, since it has to deal then with the stochastic equations as solved into the Lagrangian model) would not allow evaluation of the concentration fluctuations, for which more complex solutions have been studied (see, for instance, Thomson 1990; Mortarini and Ferrero 2005; Cassiani et al. 2007; Dixon and Tomlin 2007).

Since the goal of the UDINEE project is an evaluation of models in an emergency response, our MSS simulations use a diagnostic 5-min-averaged meteorological field constructed from observations, which implies, in particular, that the output concentration field is a function of the snapshot of the turbulence field given as input to the model, at particular points in space, and with fixed time resolution. As such, the variability included in the various input parameters (e.g., velocity standard deviations, third or higher moments, Lagrangian time scales, mean wind speed and direction) corresponds to “the best approximation” achievable in this context. However, it cannot reproduce all the stochastic variations at frequencies higher than those resolved by the snapshot, and is not representative of the instantaneous observed concentration recorded every 0.5 s during the field measurements and available to the UDINEE participants.

Another more pragmatic consideration accounting for the objective of modelling in emergency conditions is the forecast of reliable concentration fields in a time frame as short as possible. The concentrations in Lagrangian particle models are usually computed in three-dimensional cells as

$$ C_{i} = \frac{{N_{i} }}{{N_{p} }}\frac{Q\Delta t}{\Delta x\Delta y\Delta z}, $$
(1)

where Q is the tracer emission rate (kg s−1), Δt is the timestep (s), Ni is the number of particles in the i-th cell at each Δt, Np is the total number of particles emitted at each Δt, and ΔxΔyΔz is the volume of the i-th cell. This implies that, if the precision required for each concentration estimate is Cp (for instance, Cp = 0.1 μg m−3), then a single particle should bring an equivalent associated concentration, and the number of particles to be emitted to have the pre-fixed concentration precision is

$$ N_{p} = \frac{1}{{C_{p} }}\frac{Q\Delta t}{\Delta x\Delta y\Delta z}. $$
(2)

This implies that either one needs to emit millions of particles, thus contradicting the necessity of a short computational time, or one has to consider very large elementary cells for the concentration estimates, but this contradicts the necessity of very local predictions.
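The trade-off expressed by Eqs. 1 and 2 can be made concrete with the short sketch below. It treats QΔt as the mass released in one timestep. The numerical values (1 kg released, cells of 5 m × 5 m × 2 m, a precision of 1 μg m−3) are hypothetical choices for illustration and are not taken from Table 1.

```python
def particles_for_precision(mass_kg, cell_volume_m3, precision_kg_m3):
    """Eq. 2: particles needed so that a single particle in a cell
    corresponds to the required concentration precision C_p."""
    return mass_kg / (precision_kg_m3 * cell_volume_m3)

def cell_concentration(n_in_cell, n_emitted, mass_kg, cell_volume_m3):
    """Eq. 1: concentration in a cell from its particle count."""
    return (n_in_cell / n_emitted) * mass_kg / cell_volume_m3

# Hypothetical values, for illustration only.
MICROGRAM = 1.0e-9  # kg
n_p = particles_for_precision(
    mass_kg=1.0,
    cell_volume_m3=5.0 * 5.0 * 2.0,
    precision_kg_m3=1.0 * MICROGRAM,
)
# Concentration carried by one particle in one cell, in kg m^-3.
single_particle = cell_concentration(1, n_p, 1.0, 5.0 * 5.0 * 2.0)
```

Under these assumed values the required particle number is of the order of 10^7, and enlarging the cell volume reduces it in direct proportion, at the cost of spatial detail.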

To fulfil the objectives of the UDINEE program with the MSS modelling system, we provide as output the 20-s averaged concentrations, which have then been interpolated to provide the requested outputs at 0.5-s intervals at the sampling stations. The configuration of the two components of the MSS modelling system is described hereafter.

3.1 MicroSwift Simulations

The same computational domain with dimensions 1600 m × 1400 m × 770 m is considered for each IOP simulation. The horizontal resolution of the meteorological grid is 5 m, as required for the UDINEE intercomparison, while the vertical meteorological grid has 57 points of higher resolution closer to the ground (about 2 m in the first 20 m), as suggested to the participants of the UDINEE exercise. This configuration allows a suitable reproduction of the obstacles seen by the modelling system as filled cells of the computational grid.

The input data are configured for MicroSwift simulations based on the standard meteorological pre-processor without any parameter optimization, as would be the case in emergency response. For the MW simulations, the incoming time-varying wind velocity and temperature at 10 m above ground level (a.g.l.) are extracted from observations at 40 m a.g.l. by the PWIDS15 sensor, which is located about 1 km upwind of the area of interest, and averaged over a time interval of 5 min. Wind-speed data from the PWIDS15 sensor are extrapolated to 10 m a.g.l. using the logarithmic profile for the neutrally-stratified surface layer,

$$ u\left( z \right) = \frac{{u_{*} }}{k}{ \ln }\left( {\frac{z}{{z_{0} }}} \right), $$
(3)

where u is the wind speed, \( u_{*} \) is the friction velocity, k is the von Kármán constant, z is the height above the zero-plane displacement and \( z_{0} \) is the roughness length. Hence, the mean wind speed at 10 m a.g.l. is calculated as

$$ u\left( {10} \right) = u\left( {40} \right)\frac{{{ \ln }\left( {{10}/{{z_{0} }}} \right)}}{{{ \ln }\left( {{40}/{{z_{0} }}} \right)}}. $$
(4)

A value of 0.8 m is assigned to the roughness length \( z_{0} \), which is smaller than the value of 2 m suggested by Zhou and Hanna (2007), to account for the location of the PWIDS15 sensor 1 km upwind of downtown Oklahoma City, where the buildings are lower.
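As a minimal sketch of Eq. 4, the function below rescales a wind speed between two heights with the neutral logarithmic law. The heights (40 m and 10 m) and the roughness length (0.8 m) are those given in the text; the function name and the 5 m s−1 reference speed are hypothetical.

```python
import math

def log_profile_extrapolate(u_ref, z_ref, z_target, z0):
    """Eq. 4: scale a wind speed from z_ref to z_target (neutral log law)."""
    if min(z_ref, z_target) <= z0:
        raise ValueError("heights must exceed the roughness length")
    return u_ref * math.log(z_target / z0) / math.log(z_ref / z0)

# 5 m/s at 40 m is a hypothetical reference value.
u10 = log_profile_extrapolate(u_ref=5.0, z_ref=40.0, z_target=10.0, z0=0.8)
```

With these heights and roughness length, the 10-m wind speed is roughly 65% of the 40-m value, whatever the reference speed.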

Stability conditions are considered neutral for each IOP for both day and night simulations. Hanna et al. (2007) suggested that the atmosphere in the JU2003 urban area is typically slightly unstable, so that a neutral atmosphere may be considered an appropriate approximation.

Vertical mean wind-speed profiles are reconstructed over all grid points of the MicroSwift domain using the power-law relationship,

$$ u\left( z \right) = u_{10} (z/10)^{0.25} , $$
(5)

where \( u_{10} \) is the wind speed at 10 m a.g.l. from (4), and the exponent is chosen to match neutral stability conditions in the complex urban environment of the downtown area under simulation. The wind direction is kept constant with height, and is set equal to the value measured by the PWIDS15 sensor. Vertical temperature profiles are considered to be dry adiabatic.
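The power-law reconstruction of Eq. 5 can be sketched in the same way. The exponent 0.25 and the 10-m reference height are from the text; the function name, the 3 m s−1 value and the list of heights are hypothetical.

```python
def power_law_profile(u10, heights_m, exponent=0.25, z_ref=10.0):
    """Eq. 5: wind speed at each height from the 10-m value."""
    return [u10 * (z / z_ref) ** exponent for z in heights_m]

# Hypothetical 10-m wind speed of 3 m/s, evaluated at four heights.
profile = power_law_profile(3.0, [2.0, 10.0, 40.0, 160.0])
```

With an exponent of 0.25, the wind speed doubles for every sixteen-fold increase in height.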

For all AW simulations, the wind speed and direction as averaged over the 80 available anemometers (at street level, roof level, in the suburbs, and nearby airports) are considered as representative for the incoming flow at 10 m a.g.l. Vertical profiles are extrapolated using the same method as in (5) for the MW simulations. Wind-velocity data for the AW simulations are averaged in time and kept constant during the simulation. All meteorological data cover a period of 90 min starting from the initial emission date and time described in Table 1 for each selected IOP.

The MSS system diagnoses the turbulence levels to be considered by the dispersion process as the sum of a background turbulence level and the turbulence inside the flow zones modified by the buildings. In the MicroSwift model, a scheme based on the local wind shear (deformation tensor) and the mixing length is used to derive the turbulence generated by the obstacles, namely the turbulent kinetic energy (TKE) and its dissipation rate ε, assuming equilibrium between the production and dissipation terms.

The production term P of the TKE is

$$ P = 0.5\nu_{t} \left( {\frac{{\partial U_{i} }}{{\partial x_{j} }} + \frac{{\partial U_{j} }}{{\partial x_{i} }}} \right)\left( {\frac{{\partial U_{j} }}{{\partial x_{i} }} + \frac{{\partial U_{i} }}{{\partial x_{j} }}} \right), $$
(6)

where Ui represents the ith component of the wind velocity generated by the MicroSwift model. The turbulent viscosity νt is evaluated using a mixing length Lm derived as a function of the minimum distance to buildings, and evaluated to obtain turbulence closure as

$$ \nu_{t} = L_{m}^{2} \sqrt {2\left( {\frac{{\partial U_{i} }}{{\partial x_{j} }} + \frac{{\partial U_{j} }}{{\partial x_{i} }}} \right)\left( {\frac{{\partial U_{j} }}{{\partial x_{i} }} + \frac{{\partial U_{i} }}{{\partial x_{j} }}} \right)} . $$
(7)

The local contributions to the velocity standard deviation due to the local turbulence \( \sigma_{{u_{i} }} \) are computed as

$$ \sigma_{{u_{i} }} = \frac{1}{2}\nu_{t} C_{0} \varepsilon^{0.25} , $$
(8)

with C0 = 2.3 (the Kolmogorov constant for the inertial subrange).
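Equations 6 and 7 can be evaluated at a single grid point as in the sketch below, which transcribes them as printed, with the repeated indices i and j summed over the three directions (an assumption about the intended notation). The function name and the simple-shear values are hypothetical, and Eq. 8 is not reproduced here.

```python
import math

def shear_production(grad_u, mixing_length):
    """Return (nu_t, P) from a 3x3 velocity-gradient tensor.

    grad_u[i][j] holds dU_i/dx_j in s^-1; mixing_length is L_m in m.
    """
    # (dU_i/dx_j + dU_j/dx_i) squared and summed over i and j.
    contraction = sum(
        (grad_u[i][j] + grad_u[j][i]) ** 2
        for i in range(3)
        for j in range(3)
    )
    nu_t = mixing_length ** 2 * math.sqrt(2.0 * contraction)  # Eq. 7 as printed
    production = 0.5 * nu_t * contraction                     # Eq. 6
    return nu_t, production

# Hypothetical simple shear: dU/dz = 0.5 s^-1 with L_m = 1 m.
grad = [[0.0, 0.0, 0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
nu_t, production = shear_production(grad, mixing_length=1.0)
```

Both quantities vanish where the flow is uniform, so in this scheme the obstacle-generated turbulence is confined to the sheared zones around the buildings.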

3.2 MicroSpray Simulations

The dispersion model calculates the contribution of the large-scale background turbulence using a scheme based on the boundary-layer parametrization of Hanna (1982). The local contribution given by the MicroSwift model is then added to the background contribution, calculating the total variances and the Lagrangian time scales as, respectively,

$$ \sigma_{{u_{i\,total} }}^{2} = \sigma_{{u_{i\,local} }}^{2} + \sigma_{{u_{i\,background} }}^{2} , $$
(9)
$$ T_{{L_{i} }} = 2\frac{{\sigma_{{u_{i\,total} }}^{2} }}{{C_{0} \varepsilon }}. $$
(10)
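A minimal numerical sketch of Eqs. 9 and 10 follows. The value C0 = 2.3 is the one given in the text; the function names, the standard deviations of 0.3 and 0.4 m s−1 and the dissipation rate of 0.01 m2 s−3 are hypothetical.

```python
def total_variance(sigma_local, sigma_background):
    """Eq. 9: the variances, not the standard deviations, are added."""
    return sigma_local ** 2 + sigma_background ** 2

def lagrangian_time_scale(sigma_total_sq, epsilon, c0=2.3):
    """Eq. 10: T_L = 2 sigma^2 / (C0 epsilon)."""
    if epsilon <= 0.0:
        raise ValueError("dissipation rate must be positive")
    return 2.0 * sigma_total_sq / (c0 * epsilon)

# Hypothetical local and background standard deviations (m/s)
# and dissipation rate (m^2 s^-3).
var_total = total_variance(0.3, 0.4)
t_l = lagrangian_time_scale(var_total, epsilon=0.01)
```

For these assumed values the total standard deviation is 0.5 m s−1 and the Lagrangian time scale is about 22 s.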

The MicroSpray simulations share the computational domain used by the MicroSwift model, with the same horizontal and vertical resolution for the grid cells used to calculate the concentration. To represent an instantaneous emission, a release duration of 1 s is simulated for each of the four emissions of an IOP, which are released in a small volume of 1 × 1 × 1 m3 centred around the given source position for each IOP. As indicated in Table 1, releases take place every 20 min. Particles are sampled every second in each grid cell to compute concentrations, which are then averaged every 20 s. In order to achieve a minimum threshold resolution of 1 μg m−3 for SF6 concentrations in the domain, a total number of about 10^6 particles are emitted.
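The 20-s averaging of the 1-s concentration samples can be sketched as below. The use of non-overlapping blocks and the dropping of a trailing incomplete block are assumptions of this sketch, and the sample series is hypothetical.

```python
def block_average(samples, block_len=20):
    """Average consecutive, non-overlapping blocks of 1-s samples.

    A trailing incomplete block is dropped (an assumption of this sketch).
    """
    n_blocks = len(samples) // block_len
    return [
        sum(samples[k * block_len:(k + 1) * block_len]) / block_len
        for k in range(n_blocks)
    ]

# Hypothetical 1-s series: a 10-s pulse of 4 units within the first 20 s.
series = [0.0] * 5 + [4.0] * 10 + [0.0] * 25
averaged = block_average(series)
```

In this example a 10-s pulse is reduced to half its peak value by the averaging, which illustrates why 20-s averaged outputs cannot be compared point by point with concentrations observed every 0.5 s.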

4 Results and Discussion

In the following discussion, we focus on the concentration values and dispersion patterns predicted when using the two different input wind velocities for the MW and AW simulations. First, the concentration fields averaged over the entire simulation period of the single IOP are compared for the two cases, describing their main characteristics and differences. A qualitative comparison between the predicted concentration time series and observations is then presented to analyze the specific trends of the two alternative simulations in time and space. Finally, a more quantitative assessment of the differences between the two runs is presented based on a statistical analysis, considering both agreement and bias indicators to provide an estimate of the possible variability in the model outputs with the resulting uncertainty in the predicted field.

4.1 The Simulation Results

Figures 5, 6, 7 and 8 show the ground-level concentration fields averaged over a period of 90 min after the first release for IOP3, IOP5, IOP7 and IOP8, respectively, for both the MW and AW simulations, representing the average spatial pattern of the impact at ground level resulting from the emissions simulated by the dispersion model. As expected, the IOP3 case shows similar affected areas for the MW and AW simulations, with the area of the MW case slightly shifted towards the eastern side of the domain. Ground-level concentrations are larger for the AW simulation farther from the source because of the lower wind speed affecting the transport of the emitted plumes. Those samplers reached by the plumes are essentially the same in both runs even if differences are observed, which are related to the different complex channelling phenomena taking place in the two cases.

Fig. 5

Mean concentration field for IOP3 for the MW (left) and AW (right) simulations averaged over 90 min after the start of the first release

Fig. 6

As in Fig. 5, but for IOP5

Fig. 7

As in Fig. 5, but for IOP7

Fig. 8

As in Fig. 5, but for IOP8

The IOP5 case reveals a different situation, with the concentration patterns of MW and AW simulations directed towards the western and eastern sides of the domain, respectively, following the different wind directions. The MW simulation also shows the presence of a strong recirculation zone (not present in the AW case) close to the emission point, moving a significant part of the pollutant substantially upwind of the source. The plumes in the two cases mostly reach different samplers, and only a few receptors are within both areas affected by the concentration.

The MW and AW simulations for IOP7 are again in accordance with the initial wind field supplied to the two model runs. Farther from the source, the AW simulation tends to produce smaller concentrations, and the impact area is shifted more towards the eastern side of the domain. Several of the samplers are located at the borders of the affected area in both configurations, where concentrations are three or four orders of magnitude smaller than the maxima. Capturing such small and outlying values represents a further modelling challenge.

Finally, both MW and AW simulations in the IOP8 case show a pattern towards the west with respect to the source position, in accordance with the given flow direction. The higher concentration values cover a more extended area for the MW simulation, thus giving a larger impact zone than in the AW case. The impact area for the MW simulation is shifted farther towards the west and the pollutant is channelled in the canyons oriented along the west–east axis, reaching all the available samplers. Two of the westernmost samplers are instead not exposed in the AW case, which shows a pattern mainly oriented and channelled along the north–south axis.

4.2 Qualitative Comparison with Observed Data

We compare the concentrations simulated by the MSS modelling system with the observations for all single puffs of each IOP, and present a qualitative analysis of the modelling results in view of the effect of the different input wind fields on the characterization of the real puffs. The results have to be examined in a qualitative way because, as discussed above, we are comparing ensemble-averaged model outputs to field measurements corresponding to a single realization. Here, the values of field data are to be interpreted as individual snapshots from an ensemble. A significant variability characterizes the measured data, not only in the field experiments, but also in controlled conditions. In reproducing the JU2003 experiment in the wind tunnel, Harms et al. (2011) estimated that releasing 200 puffs in the same conditions gives a mean arrival time with an uncertainty of ±5%. The uncertainty increases to ±15% with 50 releases and to ±60% with four releases, as is the case for the four puffs in each IOP of the field experiment. Therefore, the comparison with the observations presented in Figs. 9, 10, 11 and 12, and discussed hereafter, has to be considered as illustrative of one possible realization, keeping in mind such a degree of variability. In this sense, the analysis outcome is not representative of a paired comparison between predictions and observations.

Fig. 9

Comparison between the observed puff concentrations (black) and the predicted values for the MW (green) and AW (red) simulations during IOP3

Fig. 10

As in Fig. 9, but for IOP5

Fig. 11

As in Fig. 9, but for IOP7

Fig. 12

As in Fig. 9, but for IOP8

Concentration time series are shown for some samplers representative of the comparison between predictions and observations. Note that the scale of the concentration values may be rather different among the various sensors in the same IOP. The distances of the samplers from the source are reported in Table 3.

The input wind fields for the MW and AW simulations during IOP3 (Fig. 9) are characterized by similar wind directions, with wind speeds mostly higher for the MW case than for the AW case (Fig. 1), leading to similar patterns in the concentration fields (Fig. 5). At sampler L13, the closest to the release point, the results are, as expected, less sensitive to the differences in wind direction, producing very similar values for the MW and AW simulations. There, the predicted concentration is higher than that measured, probably because the pollutant is ‘trapped’ in the transversal canyon. This can be related to the presence of an elevated walkway that is not resolved in the simulation, and is instead modelled as a barrier extending down to the ground, thus modifying the structure of the local flow in a non-realistic way. At the next arc of samplers with respect to the release point, the concentration at sampler L11 from the AW simulation is lower than for the MW simulation, with a visibly different behaviour due to channelling effects in the transversal canyon. The outputs from the MW simulation better capture the concentration of the observed puffs. At the L08 sampler, the arrival time is similar for both AW and MW simulations, and both concentrations are higher than observed, particularly in the AW case due to the lower wind speed. Both cases show an earlier arrival time at sampler L19 than that observed, particularly for the AW simulation due to the wind direction; the concentration peak is captured, especially in the AW case, even though the concentration values are low. Farther downwind, both cases tend to anticipate the arrival of the puff at the L01 sampler, but the timing is better for samplers L05 and L17. At all three of these samplers, the AW simulation produces higher concentrations than both the MW simulation and the measurements, since the lower wind speed slows the passage of the puff, thus increasing the pollutant accumulation.

With small differences between the input wind velocities, the observed episodes are captured, but generally with an earlier arrival of the simulated puff at the sampler than the observed puff. Close to the release point, the AW and MW simulations provide, as expected, similar results, with the AW simulation tending to produce higher concentrations (lower wind speed, longer duration of the puff on the sampler), while the ranges of concentration for the MW simulation are in better agreement with the measurements. Far away from the source, due to a different interaction with the buildings, the agreement between simulated and observed concentrations tends to worsen, yet the transient events are correctly captured.

Large differences in the wind speeds and directions between the AW and MW simulations characterize the IOP5 case (Fig. 2), giving a different distribution of the concentration (Fig. 6). Examples of the comparison between predicted and observed concentrations at the samplers are given in Fig. 10. As expected, zero concentration is found at the eastern sampler L11 for the MW simulation, and only small non-zero values are recorded at sampler L13. The simulated puffs in the AW case capture those observed, in particular matching the small values measured at sampler L11. The simulated puffs arrive at sampler L19 for both the MW and AW cases, with a slightly earlier arrival time than that observed; the AW simulation gives concentrations higher than the observed values, in this case also due to a lower wind speed, which slows the puff so that it is sampled for longer. On the opposite side, no measurable concentration is found at the western samplers L02, L07 and L18 with the AW simulation. The simulated puffs in the MW case tend to arrive earlier than those observed at samplers L02 and L07, while the timing is more similar for sampler L18. The deviation of the puff by the buildings downwind results in higher simulated concentrations than those observed.

While all samplers measured the puffs, the observed episodes are captured alternately by the MW and AW simulations, depending on the position of the sampler with respect to the different wind directions. Close to the release, only the plume of the AW simulation arrives at the sampler. At the sampler placed at a medium distance from the release point and in an intermediate direction with respect to the wind directions in the MW and AW simulations, more similar results are obtained, as expected. Again, the AW simulation tends to produce concentrations higher than those measured (longer duration of the puff at the sampler). Generally, as the pattern in the MW case is more similar to that observed, it is expected to be more representative of the actual flow conditions.

As for the IOP3 simulations, similar wind directions characterize the IOP7 simulations, but in this case the wind speed is lower in the MW simulation than in the AW simulation (Fig. 3), and similar patterns describe the distribution of the concentration (Fig. 7). Examples of the comparison at the samplers are given in Fig. 11. In both the MW and AW simulations, the plumes miss the more external eastern samplers L21, L30 and L29, where instead some low concentration values are recorded in the field experiment, indicating that the simulated plume turns more towards the north-west than revealed in the observations. At all other sensors, the predictions show an arrival timing similar to that observed. The passage of the puffs is captured at the samplers closer to the release point, with higher values at the sampler L11, and smaller values at the samplers L28 and L31, which is possibly caused by a slightly different anti-clockwise shift of the simulated plume with respect to the real one. For the same reason, the predicted concentrations tend to be higher at the L24 sensor than those observed, while lower values than those measured are predicted at the samplers L23 and L26. In general, the quantities obtained from the MW and AW simulations are in agreement for the puff arrival time, the duration, and mostly also for the predicted concentration values. The simulated puffs miss the eastern sensors, but the observed values are also relatively small here in the actual measured range (on the order of \( 10^{3} \) ppt).

As for IOP5, large differences in the wind direction and speed, the latter being lower for the MW simulation (Fig. 4), characterize IOP8, so that the concentration distributions show rather different patterns, especially farther from the source (Fig. 8), as also shown in Fig. 12. The predicted arrival time and duration of the puffs are similar to those observed at points closer to the release. Higher concentration values than those observed are found for the MW simulation at sampler L19 and for the AW simulation at sampler L08, as a consequence of the wind direction. The predictions in the AW case reproduce the observed behaviour at sampler L01, while the MW simulation shows lower concentrations than measured, and a later puff arrival. The passage of the puffs at sampler L02 is mostly well captured by the AW scenario, while the MW simulation shows a larger variability and higher concentration values than observed. The predictions of the AW simulation match the observations at samplers L05 and L17, implying that, in this case, the wind-field input used in the AW case is representative of the actual flow, in contrast to the result for IOP5.

Summarizing, when similar wind fields are provided as input, the AW and MW simulations produce comparable results, as in the IOP3 and IOP7 cases. As the difference in the wind velocity used as input does not lead to a high level of criticality in the representativeness of the simulation outputs, both input conditions may be used with a certain confidence.

Clearly, in the IOP5 and IOP8 cases, the two simulations produce rather different outputs because of the large difference in the input wind velocities. In particular, depending on the case and on the sampler considered, the AW and MW simulations alternately perform better in representing the observed puffs. Therefore, an unambiguous answer cannot be given, and the ‘uncertainty’ related to the variability of the possible model inputs is reflected in the variability of the model outputs. In order to evaluate and quantify the level of agreement or the bias related to the use of the alternative input fields, which can both be considered as actual reference conditions for the model simulations, a statistical analysis is presented next.

4.3 Assessment of the Uncertainty

A statistical analysis of the differences between the simulated concentrations using the two velocities as input to the model has been performed. We consider the pairs of values only when at least one of the two simulations produced a non-zero concentration. Pairs for which both runs predict zero concentrations are thus excluded, which makes the statistical evaluation more severe, since the cases where the runs agree in recording the absence of the puff at the receptor are not counted.
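
The pair-selection rule can be illustrated with a minimal sketch, assuming the concentrations from the two runs are available as arrays sampled at the same receptors and times. The function name `select_pairs` and the array values are hypothetical and are not part of the modelling system used here.

```python
import numpy as np

def select_pairs(c_mw, c_aw):
    """Keep only pairs where at least one run predicts a non-zero concentration."""
    c_mw = np.asarray(c_mw, dtype=float)
    c_aw = np.asarray(c_aw, dtype=float)
    keep = (c_mw != 0.0) | (c_aw != 0.0)  # drop pairs where both runs give zero
    return c_mw[keep], c_aw[keep]

# Illustrative values only (not data from the experiment)
c_mw, c_aw = select_pairs([0.0, 2.0, 0.0, 5.0], [0.0, 0.0, 3.0, 4.0])
```

In this example the first pair is discarded, while the pairs in which only one run predicts the puff are retained and contribute to the disagreement.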

For each IOP, we analyze the concentrations at all receptors together, as well as at the individual receptors, for selected metrics. In order to quantify the deviation between the two simulations, we calculate the fractional bias (FB), the normalized absolute difference (NAD), and the index of agreement (IA, Doran and Horst 1985),

$$ F\!B = 2\,\frac{\overline{C_{M}} - \overline{C_{A}}}{\overline{C_{M}} + \overline{C_{A}}}, $$ (11)
$$ N\!A\!D = \frac{\overline{\left| C_{M} - C_{A} \right|}}{\overline{C_{M}} + \overline{C_{A}}}, $$ (12)
$$ I\!A = 1 - \left[ \frac{\sum_{i = 1}^{N} \left( C'_{Mi} - C'_{Ai} \right)^{2}}{\sum_{i = 1}^{N} \left( \left| C'_{Mi} \right| + \left| C'_{Ai} \right| \right)^{2}} \right], $$ (13)

respectively, where \( C_{M} \) and \( C_{A} \) represent the concentrations calculated from the MW and AW runs, respectively, and \( C'_{Ki} = C_{Ki} - \overline{{C_{K} }} \) (\( K = M, A \)) are the deviations for each single \( i \)th predicted value. A perfect agreement between the two simulations would give IA = 1, NAD = 0 and FB = 0. A value of NAD = 1 implies that the two plumes never overlap, while the metric FB is based on the mean bias, making it possible that, even if the predictions of the two runs are completely out of phase, their FB value may still be zero. However, the metric FB allows an evaluation of which of the two simulations produces larger concentration values.
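
As an illustration of Eqs. 11–13, the following is a minimal sketch assuming the paired concentrations are held in NumPy arrays of equal length, with at least one non-zero value so that the sum of the means is positive. The function names and the example values are hypothetical, and returning NaN for IA when neither run varies is an assumption of this sketch, not a convention stated in the text.

```python
import numpy as np

def fractional_bias(c_m, c_a):
    """FB (Eq. 11): twice the difference of the means over their sum."""
    c_m, c_a = np.asarray(c_m, dtype=float), np.asarray(c_a, dtype=float)
    return 2.0 * (c_m.mean() - c_a.mean()) / (c_m.mean() + c_a.mean())

def normalized_absolute_difference(c_m, c_a):
    """NAD (Eq. 12): mean absolute difference over the sum of the means."""
    c_m, c_a = np.asarray(c_m, dtype=float), np.asarray(c_a, dtype=float)
    return np.abs(c_m - c_a).mean() / (c_m.mean() + c_a.mean())

def index_of_agreement(c_m, c_a):
    """IA (Eq. 13), using the deviations of each run from its own mean."""
    c_m, c_a = np.asarray(c_m, dtype=float), np.asarray(c_a, dtype=float)
    dev_m = c_m - c_m.mean()
    dev_a = c_a - c_a.mean()
    denom = np.sum((np.abs(dev_m) + np.abs(dev_a)) ** 2)
    if denom == 0.0:
        return float("nan")  # assumption: IA is left undefined when neither run varies
    return 1.0 - np.sum((dev_m - dev_a) ** 2) / denom

# Illustrative values only (not data from the experiment): two puffs
# of equal magnitude that never overlap in time.
c_m, c_a = [4.0, 0.0], [0.0, 4.0]
fb = fractional_bias(c_m, c_a)
nad = normalized_absolute_difference(c_m, c_a)
ia = index_of_agreement(c_m, c_a)
```

For these out-of-phase values, FB is 0 while NAD is 1 and IA is 0, which shows why FB alone cannot reveal a lack of overlap between the two plumes.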

Considering all receptors together (Table 2), for each IOP, the concentration for the AW simulation is on average larger than for the MW case, as revealed by the negative FB values.

Table 2 Statistics for each IOP considering pairs at all receptors together

Clearly, the IA value is higher for the runs with the least difference between the two input wind directions (IOP3 and IOP7). While for IOP3 the agreement is good, the IA value for IOP7 is relatively low because, for both the MW and AW cases, many of the receptors in IOP7 are displaced with respect to the main wind direction; the puffs therefore barely reach the samplers, as revealed by the small concentration values, which also leads to small IA values.

The NAD values confirm the good agreement between the AW and MW simulations in IOP3 and IOP7, with the limitations just discussed. The NAD values also represent an appropriate index to quantify the disagreement in the case of different wind fields used as input, which is greater for IOP5 than for IOP8, as is also apparent in the patterns of the concentration distribution (Figs. 6, 8).

The values of the metrics calculated for each single sampler and reported in Table 3 confirm the discussion in Sect. 3, helping direct identification of the samplers, and related areas, where the two plumes from the MW and AW simulations do not overlap at all (IA = 0, FB = ±2 and NAD = 1), or where they show similar concentrations.

Table 3 Statistics for the single IOP considering pairs at single receptors

The values of the indexes quantifying the deviation between the simulations driven by alternative velocity inputs can be compared to the typical values obtained or expected when assessing simulated data against observations, and to the acceptance criteria. As a reference, the acceptance criteria for urban modelling proposed by Hanna and Chang (2012) are |FB| < 0.67 and NAD < 0.5 (calculated as threshold-based normalized absolute difference). For instance, when evaluating Lagrangian models against observed concentrations for the JU2003 field experiment, Hanna et al. (2011) estimated |FB| values ranging from 0.4 to 1.3, while Brown et al. (2013) found |FB| values ranging from 0.1 to 0.7. Hence, the bias caused by the variability, and thus by the ‘uncertainty’, in the possible wind velocity input initializing the simulations can be of similar magnitude to the bias between predictions and observations.
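
The comparison with the reference criteria quoted above can be sketched as a simple check. The function name and the example values are hypothetical, and the check is only indicative, since the NAD criterion of Hanna and Chang (2012) refers to the threshold-based form of the metric.

```python
def meets_reference_criteria(fb, nad, fb_limit=0.67, nad_limit=0.5):
    """Return True when |FB| < fb_limit and NAD < nad_limit."""
    return abs(fb) < fb_limit and nad < nad_limit

# Illustrative values only (not taken from the tables)
similar_inputs = meets_reference_criteria(fb=-0.3, nad=0.4)
different_inputs = meets_reference_criteria(fb=-0.3, nad=0.7)
```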

In order to evaluate whether one of the two inputs used for the MW and AW simulations is systematically providing a better agreement with the observations, Fig. 13 reports scatter plots of the predicted versus observed dosages, which are estimated by calculating the time-integrated concentrations for each puff and sampler from the concentration time series as the sum of the products of the values and the timesteps of 0.5 s and 20 s for the observations and predictions, respectively. Clearly, when the two wind directions are similar (IOP3 and IOP7), the scatter and the agreement between the simulations and the observations are comparable. In IOP3, both the MW and AW simulations miss the same sampler in one instance, while the puffs miss the samplers 12 times for the MW case and 14 times for the AW case in IOP7. In the cases where the input wind velocity differs, the puffs miss the samplers 11 times for IOP5 and 19 times for IOP8 in the AW simulations, and three times for each IOP in the MW simulations. This aspect has to be considered when interpreting the scatter plot, which does not include the zero values because of the logarithmic scale. The performance of the model simulation in matching the observations is better using the wind velocity input for the MW simulation than the AW simulation, which gives results less representative of the local flow field. Given a single wind velocity as input, the MicroSwift module reproduces the spatial variability of the flow field for both MW and AW cases. Hence, this result suggests that it is important to provide a time-varying input, as for the MW simulations, to capture properly the non-stationarity of the atmospheric processes, particularly within such complex geometry. In principle, a flow field based on the analysis of data from an observational network may be expected to be more representative than a single measurement. 
In this case, a simple average was applied to determine the wind velocity for the AW simulation, with results that do not always reproduce the actual conditions appropriately. We add that, when dealing with emergency responses, time plays a key role, and compiling and processing datasets may not be feasible. However, it would be improper to generalize the result of our analysis, since a final statement requires a larger number of simulations representing a wider spectrum of possible cases.
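
The dosage estimate described above, the sum of the products of the concentration values and the timestep, can be sketched as follows, assuming a uniformly sampled time series for each puff and sampler. The function name and the example values are hypothetical; only the timesteps of 0.5 s and 20 s are taken from the text.

```python
import numpy as np

def dosage(concentrations, dt):
    """Time-integrated concentration: sum of (value x timestep), uniform timestep dt in s."""
    return float(np.sum(np.asarray(concentrations, dtype=float)) * dt)

# Illustrative values only (not data from the experiment)
observed = dosage([0.0, 10.0, 20.0, 10.0], dt=0.5)  # observations sampled every 0.5 s
predicted = dosage([0.0, 1.0], dt=20.0)             # predictions output every 20 s
```

The resulting dosage has units of concentration multiplied by seconds, so that series with different sampling intervals can be compared on the same footing.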

Fig. 13 Scatter plots of the dosage estimates for MW (red) and AW (blue) wind-field inputs. Left: IOP3 (crosses) and IOP7 (circles); right: IOP5 (crosses) and IOP8 (circles)

Table 4 reports, as an example, the NAD values calculated between the measured and simulated dosages, compared with the NAD values between the two predicted dosages from the MW and AW simulations. The statistics for the cases where the two wind-velocity inputs differ (IOP5 and IOP8) confirm that the deviation between the simulations can be even larger than the bias calculated with respect to the measured values.

Table 4 Values of NAD for the dosages (Fig. 13) calculated as predictions from the MW and AW simulations versus observations (Obs.) and AW versus MW simulations

5 Conclusions

The present work has been performed in the framework of the UDINEE project, addressing the capabilities of numerical dispersion models to reproduce puff releases in an emergency-response context with reference to the JU2003 field experiment. We have carried out simulations of IOPs 3, 5, 7 and 8 to assess the effect of varying the wind-speed profile used as model input on the correct reproduction of the concentration pattern.

We have considered two alternative wind velocities to initialize the model runs, including those measured at the PWIDS15 sensor for the MW simulation as used in the intercomparison exercise, and those averaged for the full IOP considering all meteorological sensors in the experimental domain for the AW simulation. Of the four IOPs selected for simulation, two had similar wind directions for both simulations, and lower (IOP3) or higher (IOP7) wind-speed inputs to the AW simulation than those used in the MW case. The other two had different wind directions as well as wind speeds, being lower (IOP5) or higher (IOP8) for the AW simulation than those for the MW case. This approach enabled the consideration of a variety of possible situations. The patterns of the concentration distributions for each IOP have been compared and discussed, and a qualitative analysis comparing predictions against the observational data has also been proposed. Based on a statistical analysis, we have estimated the possible bias in the model results related to using different, but analogously valid, flow data as inputs for driving the dispersion simulations.

As expected, the suitability and reliability of the wind-velocity profile for input to the modelling system determines the success of the simulation in capturing the puff dispersion, the concentration distribution and the affected area. In particular, we verified that, in a built environment, even small differences in the input flow may lead to relatively large deviations between the puff patterns among different simulations and with respect to the observed puffs. However, one is likely to have only one wind-velocity dataset available for driving model simulations in practice, which may often be unrepresentative of the variation in space and time of the actual flow field, especially when dealing with complex urban areas. Moreover, in the real atmosphere, due to the intrinsic variability caused by turbulence, puffs released in the same condition can take many different trajectories. This occurs even in conditions such as wind-tunnel experiments, where the generated turbulent flow is known and controlled. Overall, when using an appropriate input wind-velocity field, the model simulations are able to correctly describe the plume dispersion and to predict the distribution and range of the concentrations, even in such complex built environments and for transient releases.

Having an estimate of the possible deviation and resulting bias related to the approximation of the flow data supplied as input to the model simulation can, therefore, provide information on the variability and uncertainty of the simulated concentration. We find that the bias between two concentration fields related to the input flow has a similar magnitude to the typical bias between model predictions and observations. This issue certainly has a crucial impact when dealing with typical emergency-response conditions, where information on the atmospheric conditions may be poor, inadequate or even unavailable. It would thus be important to account for the bias in the concentration pattern and affected area to ensure a conservative approach when interpreting the results of the simulations in responding to an emergency. This, in principle, could be done by providing the predicted concentration values together with the estimated ‘error’. In our case, after considering all simulations together and averaging, the bias expressed as the NAD value is about 0.7 for the predicted concentrations (see Table 2) and 0.5 for the predicted dosages (Table 4). These values can be considered plausible, since they represent cases with small (IOP3 and IOP7) and large (IOP5 and IOP8) variability in the input flow data and the related uncertainty. However, to propose a more established approach for addressing this problem, a methodology should be developed and validated based on a thorough analysis of several cases in similar conditions.