1 Introduction

Volcanic ash is one of the most common products of explosive volcanic eruptions and can be a widespread, disruptive hazard with significant impacts on human life and the economy across a range of temporal and spatial scales. For example, in 2010, the eruption of Eyjafjallajökull in Iceland caused the closure of both European and North Atlantic airspace, with estimated economic losses of US$5 billion (Bonadonna et al. 2012); in 2002, the eruption of Reventador deposited 3–5 mm of ash and closed the international airport in Quito, Ecuador, for 8 days (Guffani et al. 2009); and in 1996, diffuse volcanic ash from Mount Ruapehu, New Zealand, may have contributed to the significant increase in mortality observed in Hamilton, located around 166 km from the volcano (Newnham et al. 2010).

Volcanic ash transport and dispersal models (VATDMs) are used to simulate the transport of volcanic ash in the atmosphere and/or ash deposition at ground level, either offline for research purposes or in real time, using meteorological conditions provided by numerical weather prediction models (NWPMs) (Folch 2012). VATDMs can be based on three different approaches for specifying the flow field: Eulerian, Lagrangian or hybrid. Such models can not only aid our understanding of such events but also help predict the passage of ash clouds during an eruption, allowing populations to be warned and impacts potentially mitigated.

Currently, a set of default eruption source parameters [such as the eruption duration, the mass fraction of small particles (diameter <63 μm), the plume height and the mass eruption rate], based on eleven eruption types, is often used for VATDMs to simulate volcanic ash clouds (Mastin et al. 2009). While these may be appropriate in many situations, the sensitivity of model outputs to the eruption source parameters chosen is often not clearly stated; it is unclear which parameters are the most important, and it is also unknown whether the set of eruption source parameters is complete, for ensuring accurate modelling results.

This study simulates ash clouds based on the documented eruption source parameters (where available) and compares these with satellite data using a statistical verification method based on the Lagrangian particle dispersion model FLEXPART (Stohl et al. 1998) coupled with the Weather Research and Forecasting (WRF) model (Michalakes et al. 2001). A case study approach is taken using the well-documented 16–17 June 1996 eruption of Mount Ruapehu, New Zealand. Having established model performance, the paper then examines the sensitivities of the FLEXPART model outputs to the variation of a set of eruption source parameters, specifically, the particle size distribution, the plume height and the plume ratio (defined as the ratio of the thickness of the laterally spreading ash cloud at the plume top to the height of the plume).

The family of Lagrangian VATDMs is widely used for predicting ash clouds for civil aviation safety purposes (Folch 2012). In addition to the FLEXPART model, previous studies have used the Japan Meteorological Agency (JMA) model (Iwasaki et al. 1998), the Modèle Lagrangien de Dispersion de Particules d’ordre zéro (MLDP0) model (D’Amours and Malo 2004), the Numerical Atmospheric dispersion Modelling Environment (NAME) model (Jones et al. 2007) and the PUFF model (Searcy et al. 1998). However, the FLEXPART model, unlike many Lagrangian VATDMs, is open-access software. A range of different types of volcanic eruptions has been successfully modelled using the FLEXPART model in different parts of the world, including the 2010 Eyjafjallajökull eruption in Iceland [with output validated using a variety of observational data sources, including satellite, lidar, aircraft and in situ measurements (e.g. Stohl et al. 2011; Miffre et al. 2012; Perrone et al. 2012)]. While most simulations have been carried out for volcanoes in the Northern Hemisphere, volcanoes in the Southern Hemisphere have also been studied using this model [e.g. the 2011 Puyehue eruption (Theys et al. 2013)].

The 16–17 June 1996 eruption of Mount Ruapehu formed part of the largest eruption (the 1995–1996 eruption) in New Zealand’s recent volcanic history. The June 1996 eruption event was considered to be a volcanic explosivity index (VEI) 3 “moderate” event (Newhall and Self 1982; Mastin et al. 2009). During this period, a large anticyclone stagnated across New Zealand’s North Island (see Fig. 1). The prevailing winds resulted in the transport of the ash cloud over Hamilton as well as Auckland, New Zealand’s largest urban centre located some 300 km away. Media reports indicate that concentrations of ash in the air were sufficiently high to be visible in the distance according to a ground report. The event forced the closure of most airports in the North Island for several days, including the Auckland International Airport, halting most international flights in and out of New Zealand (Johnston et al. 2000). The respiration of volcanic ash [a known hazard to human health (Horwell and Baxter 2006)] by residents of Hamilton (a city located 166 km from the volcano) may have contributed to the significant increase in admissions to hospital in the weeks following the event (Newnham et al. 2010).

Fig. 1 Map of New Zealand showing the geopotential heights based on National Centres for Environmental Prediction (NCEP) 0.5° × 0.5° Climate Forecast System Reanalysis (CFSR) data at 1000 hPa (a) and 700 hPa (b) at 0000 UTC 18 June 1996

The 16–17 June 1996 eruption event of Mount Ruapehu has previously been studied using the ASHFALL (Hurst and Turner 1999) and TEPHRA (Bonadonna et al. 2005) models. The ASHFALL model is derived from an early version of the HAZMAP model (Macedonio et al. 2005) which can be used to simulate the ground load deposit for a given wind profile; the TEPHRA model incorporates several modifications to the original version of the HAZMAP model, including a different treatment of the source term, a particle size-dependent diffusion law, and a parallelization of the code (Folch 2012). Blocks (>64 mm), lapilli (2–64 mm) and coarse ash (<2 mm) were deposited from the rising stage of the eruption column, while the horizontal dispersion was characterised by coarse and fine ash (Bonadonna et al. 2005). The main factor identified which affected the modelled ash fall distribution from this eruption event was the accuracy of the wind direction forecast, while the quantity of ash fall downwind was found to be dependent on the volume of the eruption (Hurst and Turner 1999; Turner and Hurst 2001). Scollo et al. (2008) suggested that, for this event, the total erupted mass had a significant impact on the predicted ash fall, while the particle density had negligible impact on the model outputs. However, these studies focused specifically on the deposition of tephra rather than on the dispersion of the ash cloud. In addition, the ASHFALL and TEPHRA models that are used for tephra deposit modelling are not suitable for fine ash simulation, unlike the FLEXPART model (Folch 2012).

2 Geological settings of the Mount Ruapehu 16–17 June 1996 eruption

The 1995–1996 eruptive sequence at Mount Ruapehu (see Fig. 2 for the geographical location) consisted of two distinct periods of eruptive activity: the first from 17 September 1995 until early November 1995 and the second from 16 June 1996 until late July 1996 (Bryan and Sherburn 1999). The sequence began with a series of small-to-moderate phreatomagmatic explosions in September 1995, followed by a series of “dry” explosive magmatic eruptions from mid-October 1995, after the volcano’s Crater Lake had been removed. Activity then paused until the third major ash eruption of the sequence at 1900 UTC 16 June 1996 (Turner and Hurst 2001).

Fig. 2 Geographical locations of Mount Ruapehu and the nine selected meteorological stations

The 16–17 June 1996 activity resulted in an andesitic subplinian eruption (Bryan and Sherburn 1999; Hurst and Turner 1999; Cronin et al. 2003), with two major eruption pulses occurring between 2030 UTC 16 June and 0100 UTC 17 June and between 0300 UTC and 0500 UTC 17 June (Turner and Hurst 2001). These eruption pulses produced a bent-over plume in a strong wind field of 15–35 m s−1 (Prata and Grant 2001; Bonadonna et al. 2005). The wind direction changed gradually from south-west to south (SW–S) over the course of the eruption, resulting in a large amount of ash deposition in the north-easterly to northerly (NE–N) direction from the source to greater than 300 km from the volcano, producing an estimated 5 million m3 of tephra fallout over the period of 16–17 June 1996 (Cronin et al. 2003). The details of this eruption scenario are summarised in Table 1.

Table 1 Documented eruption information for the 16–17 June 1996 Mount Ruapehu eruption event

3 Meteorological modelling and verification

The WRF model was configured with 347 × 359 horizontal grid points at 3-km grid spacing. This case was initialised from the previous 6 h of free forecasts (i.e. no updated boundary layer information and no observations incorporated) from WRF (the free forecasts were themselves initialised with NCEP 0.5° × 0.5° CFSR data). Following Zhang et al. (2014), the following physical parameterization schemes were used in the WRF model to represent the evolving meteorological conditions over New Zealand during the 16–17 June 1996 Mount Ruapehu eruption event: the Kessler scheme (Kessler 1969) for microphysical processes, the Dudhia scheme (Dudhia 1989) for short-wave radiation, the Rapid Radiative Transfer Model scheme (Mlawer et al. 1997) for long-wave radiation and the Yonsei University scheme (Noh et al. 2003) for planetary boundary layer parameterization.

The WRF model outputs were compared with all of the available surface meteorological observations from 1900 UTC 16 June to 0800 UTC 17 June at selected stations, namely Whangarei, Auckland, Tauranga, Hamilton, Gisborne, Taupo and New Plymouth (see Fig. 2 for the geographical locations). Pearson’s correlation coefficient was used in this study to compare WRF model outputs with observations. The 2-m temperature and water mixing ratio, and the 10-m wind speed and wind direction from the WRF model outputs were compared with surface observations at the closest grid point.
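The station comparison described above can be illustrated with a minimal sketch of Pearson's correlation coefficient. The temperature values below are hypothetical and are not taken from the study; the function name `pearson_r` is likewise an illustrative choice. The mean bias is computed alongside the correlation because a high correlation does not by itself rule out a systematic offset between model and observations.

```python
import math


def pearson_r(model, obs):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(model) != len(obs) or len(model) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_m = sum(model) / len(model)
    mean_o = sum(obs) / len(obs)
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_m * var_o)


# Hypothetical hourly 2-m temperatures (deg C) at one station, illustration only.
wrf_t2 = [9.1, 8.7, 8.2, 8.0, 8.4, 9.3]
obs_t2 = [7.9, 7.6, 7.0, 6.8, 7.1, 8.2]

r = pearson_r(wrf_t2, obs_t2)
mean_bias = sum(m - o for m, o in zip(wrf_t2, obs_t2)) / len(obs_t2)
print(f"r = {r:.3f}, mean bias = {mean_bias:.2f} deg C")
```

Note that wind direction is a circular quantity, so a plain Pearson coefficient treats 359° and 1° as far apart; this sketch does not attempt to correct for that.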

Table 2 gives the Pearson’s correlation coefficients between simulations and observations, and the modelled and observed meteorological variables are presented in Fig. 3. In general, the modelled temperature showed good agreement at all of the meteorological stations. For the water mixing ratio, strong correlations were found at all stations except Gisborne. Good correlations were found for the wind speed at all stations except Hamilton and Taupo, and for the wind direction at all stations except Hamilton, New Plymouth and Taupo. In summary, the WRF simulations were most reliable for the temperature and less reliable for the water mixing ratio, wind speed and wind direction. It is notable, however, that the model significantly and consistently overestimated the temperature and wind speed at the selected stations (see Fig. 3a, c).

Table 2 Pearson’s correlation coefficients between observed and WRF-simulated values of meteorological variables at the selected stations for the period from 1900 UTC 16 June to 0800 UTC 17 June 1996

Fig. 3 Scatter plots of the observed and WRF-predicted temperature (a), water mixing ratio (b), wind speed (c) and wind direction (d) at the selected surface stations for the period from 1900 UTC 16 June to 0800 UTC 17 June 1996

In addition, the WRF model outputs were compared with radiosonde meteorological data at Whenuapai, New Plymouth, Paraparaumu and Gisborne (see Fig. 2 for the geographical locations). The temperature data were obtained from Whenuapai and Paraparaumu, and the wind data were derived from New Plymouth and Gisborne. No radiosonde humidity data were collected. In general, the temperature showed good agreement with radiosonde meteorological data at both stations. However, a deviation was seen from the surface to near 3 km in the upper air at 1200 UTC 17 June at Whenuapai (Fig. 4b). Wind direction simulations were less accurate than temperature simulations (Fig. 4d); wind speed simulations at Gisborne and New Plymouth were smaller and larger, respectively, than those of the observations (Fig. 4c).

Fig. 4 Comparisons between the radiosonde and modelled temperature at 0000 UTC (a) and 1200 UTC (b) 17 June 1996 at the Whenuapai and Paraparaumu stations and the wind speed (c) and wind direction (d) at 0000 UTC 17 June 1996 at the Gisborne and New Plymouth stations

At the time of the eruption, a high-pressure system dominated the Tasman Sea to the west of New Zealand’s North Island (see Fig. 1). Advection by SW–S winds resulted in the ash cloud being transported towards the NE–N from its volcanic source and dispersing out to sea. The ejected column was bent over as a result of the strong wind aloft, compared with the vertical velocity imparted to the volcanic ash as a result of the eruptive process. Figure 5 shows the wind field at the 400 and 700 hPa pressure levels, simulated separately using the WRF model for the 16–17 June 1996 Mount Ruapehu events. The modelled wind speeds immediately above the volcano at this time ranged from 15 to 35 m s−1, consistent with the observed wind speeds. In addition, a wind shear can be seen between the 400 and 700 hPa pressure levels (see Fig. 5c, d); wind blew towards the north-east at high levels, while it blew to the north at low levels.

Fig. 5 Simulated 400-hPa-pressure-level wind fields at 2100 UTC 16 June 1996 (a) and 0300 UTC 17 June 1996 (b) and 700-hPa-pressure-level wind fields at 2100 UTC 16 June 1996 (c) and 0300 UTC 17 June 1996 (d) (black star represents the location of the Ruapehu volcano)

4 Ash dispersal modelling

For the purpose of this study, all FLEXPART simulations were carried out using forward modelling. The horizontal resolution was the same as that used in the WRF model (i.e. a 3-km resolution grid). The input parameters and other specifications used in the FLEXPART model are summarised in Table 3. The number of particles needed in a FLEXPART simulation depends mainly on the size of the research domain, the resolution of the meteorological input and of the FLEXPART output, and the distribution of the sources (Brioude et al. 2013). In this study, it was assumed that 10,000 particles were released during each 1-h eruption period. It is notable that the particles are not intended to represent individual ash grains, but rather are computational Lagrangian elements that each represent a collection of grains with a specified size distribution. The particle size distribution, plume height and plume ratio were varied in the sensitivity experiments to investigate the uncertainties associated with these parameters with respect to volcanic ash cloud modelling. Uncertainties associated with the other eruption source parameters (i.e. eruption duration, particle density and mass eruption rate) were not analysed, and these parameters were treated as “true” values. A simple method in which each of the input factors is varied one at a time was utilised. This approach has previously been used for ash dispersal simulations (e.g. Hurst and Turner 1999; Bonadonna et al. 2002; Webley et al. 2009). The sensitivity experiments carried out for the 16–17 June 1996 Mount Ruapehu eruption event are summarised in Table 4.

Table 3 FLEXPART model input parameters and other specifications
Table 4 FLEXPART sensitivity studies for the 16–17 June 1996 Mount Ruapehu eruption
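The caption of Fig. 8 refers to 27 individual model runs, which is consistent with combining three plume heights, three plume ratios and three particle size distributions. The sketch below enumerates such a design. The dictionary layout and the run labels are illustrative assumptions, and the ordering is not claimed to match the run numbers in Table 4.

```python
from itertools import product

# Parameter values as stated in Sects. 4.4-4.6 of the text.
plume_heights_km = {"PH1": 6.5, "PH2": 7.5, "PH3": 8.5}
plume_ratios = {"PR1": 1 / 3, "PR2": 2 / 3, "PR3": 1.0}
size_distributions = ["PSD1", "PSD2", "PSD3"]

# One dictionary per run; the label format is hypothetical.
runs = [
    {
        "label": f"{ph}-{pr}-{psd}",
        "plume_height_km": height,
        "plume_ratio": ratio,
        "psd": psd,
    }
    for (ph, height), (pr, ratio), psd in product(
        plume_heights_km.items(), plume_ratios.items(), size_distributions
    )
]

print(len(runs), "runs; first:", runs[0]["label"], "last:", runs[-1]["label"])
```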

4.1 Eruption duration

Discontinuous eruptive activity between 1900 UTC 16 June and 0100 UTC 17 June (6-h duration) and between 0300 UTC 17 June and 0500 UTC 17 June (2-h duration) was adopted for the eruption duration for ash cloud transportation modelling (see Table 1).

4.2 Particle density

Particle density is typically measured in the laboratory for fragments down to 2 mm in diameter. The density of smaller fragments can be obtained using the simple parameterization from Bonadonna and Phillips (2003) in which the density of pumice is assumed to decrease with increasing size for particles of diameter below 2 mm becoming equal to the lithic density when the particle size is smaller than 0.0078 mm (Scollo et al. 2008). In this study, the particles in each size fraction in the model simulation were assigned a density taken from a uniform distribution with a range of 1100 kg m−3 (for pumice) to 2650 kg m−3 (for lithic fragments), as recommended for this eruption by Bonadonna et al. (2005). It was also assumed that the density values were distributed uniformly in the model for each specified size distribution.
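As a rough illustration of the density assignment described above, the following sketch draws densities from a uniform distribution between the pumice and lithic values quoted in the text. How the densities were actually passed to FLEXPART is not stated in the text, so the function name, the seeding and the per-element sampling are assumptions made only for this example.

```python
import random

PUMICE_DENSITY = 1100.0  # kg m^-3, lower bound quoted in the text
LITHIC_DENSITY = 2650.0  # kg m^-3, upper bound quoted in the text


def sample_densities(n, seed=0):
    """Draw n densities uniformly between the pumice and lithic values."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [rng.uniform(PUMICE_DENSITY, LITHIC_DENSITY) for _ in range(n)]


densities = sample_densities(10_000)
mean_density = sum(densities) / len(densities)
print(f"sample mean density: {mean_density:.0f} kg m^-3")
```

For a uniform distribution the expected mean is the midpoint of the range, here 1875 kg m^-3, which the sample mean should approach as the number of draws grows.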

4.3 Mass eruption rate

The mass eruption rate can be inferred from the column height in VATDMs. Two empirical equations have been established relating the observed column height above the volcano and the mass eruption rate by Sparks et al. (1997) and Mastin et al. (2009). However, the empirical relationships ignore the influence of the atmospheric conditions, such as the wind velocity. Thus, they may not be suitable for weak plume eruptions in windy conditions. Therefore, a semi-empirical relationship between the column height and the mass eruption rate from Woodhouse et al. (2013) that explicitly contains the atmospheric wind velocity was used in this study to calculate the mass eruption rate for this eruption event.
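For orientation, the empirical relationship of Mastin et al. (2009), H = 2.00 V^0.241 (with H the plume height above the vent in km and V the dense-rock-equivalent volumetric flow rate in m^3 s^-1), can be inverted as in the sketch below. This is not the Woodhouse et al. (2013) relationship used in this study: it ignores the wind, which is the limitation noted above. The dense-rock density of 2500 kg m^-3 and the 5-km example height are assumptions for illustration only.

```python
def mastin_volume_rate(height_km):
    """Invert H = 2.00 * V**0.241 for V (m^3 s^-1, dense-rock equivalent).

    height_km is the plume height above the vent, in km.
    """
    if height_km <= 0:
        raise ValueError("plume height above the vent must be positive")
    return (height_km / 2.00) ** (1.0 / 0.241)


def mastin_mass_eruption_rate(height_km, dre_density=2500.0):
    """Mass eruption rate (kg s^-1); dre_density is an assumed value."""
    return dre_density * mastin_volume_rate(height_km)


# Hypothetical plume height above the vent, for illustration only.
example_height_km = 5.0
mer = mastin_mass_eruption_rate(example_height_km)
print(f"H = {example_height_km} km above vent -> MER ~ {mer:.2e} kg/s")
```

Because the exponent is small, the inferred mass eruption rate is very sensitive to the plume height, so the uncertainty in plume height discussed in Sect. 4.5 translates into a much larger relative uncertainty in the eruption rate.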

4.4 Particle size distribution

The actual particle size distribution produced in an eruption event depends on the fragmentation mechanisms at play (Rust and Cashman 2011). With regard to the total particle size distribution, analysis of the Mount Ruapehu 16–17 June 1996 ash deposit by Bonadonna and Houghton (2005) using the Voronoi tessellation technique showed that the fraction of small particles (diameter < 63 μm) was between 4 and 20 %. In this study, three particle size distributions were considered in order to investigate the impact of particle size on ash cloud modelling, namely PSD1 (Webley et al. 2009), PSD2 and PSD3. PSD1 was defined as 10 % at 4 μm, 20 % at 8 μm, 40 % at 16 μm, 20 % at 31 μm and 10 % at 62.5 μm. PSD2 and PSD3 varied the proportions for different particle sizes based on PSD1. PSD2 was defined as 6.3 % at 4 μm, 12.4 % at 8 μm, 25 % at 16 μm, 50 % at 31 μm and 6.3 % at 62.5 μm. PSD3 was defined as 3.2 % at 4 μm, 6.4 % at 8 μm, 12.9 % at 16 μm, 25.8 % at 31 μm and 51.7 % at 62.5 μm (see Fig. 6). Particles of size larger than 62.5 μm were not modelled for the purpose of the long-range dispersion of ash particles.

Fig. 6 Particle size distributions of PSD1, PSD2 and PSD3
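The three distributions defined in Sect. 4.4 can be written down and checked as in the sketch below. It assumes that the quoted percentages are mass fractions; the weighted mean diameter is computed only as a convenient summary of how far each distribution is shifted towards coarser particles, and is not a quantity reported in the text.

```python
SIZES_UM = [4.0, 8.0, 16.0, 31.0, 62.5]  # particle diameters in micrometres

# Percentages per size bin, as listed in the text.
PSD = {
    "PSD1": [10.0, 20.0, 40.0, 20.0, 10.0],
    "PSD2": [6.3, 12.4, 25.0, 50.0, 6.3],
    "PSD3": [3.2, 6.4, 12.9, 25.8, 51.7],
}


def weighted_mean_diameter(percentages):
    """Mean diameter weighted by the percentage in each size bin."""
    total = sum(percentages)
    return sum(d * p for d, p in zip(SIZES_UM, percentages)) / total


for name, percentages in PSD.items():
    print(
        f"{name}: sum = {sum(percentages):.1f} %, "
        f"weighted mean diameter = {weighted_mean_diameter(percentages):.1f} um"
    )
```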

4.5 Plume height

The horizontal structure of an ash cloud is largely affected by the height to which ash particles are ejected from a volcano into the atmosphere (Tupper et al. 2009; Webley et al. 2009). Therefore, an accurate estimation of plume height is needed for reliable ash cloud modelling. In this study, plume heights were retrieved from Geostationary Operational Environmental Satellite (GOES) infrared images using the method of Prata and Grant (2001); the ash clouds are identified as the regions where negative band 4 minus band 5 brightness temperature differences are detected (see Sect. 5 for the details of the method). The nearby radiosonde temperature profiles (here the temperatures at 0000 UTC and 1200 UTC 17 June at the Whenuapai and Paraparaumu stations; see Fig. 4) were then used to find the height at which the profile temperature best matches the temperature of the top of the ash cloud. The satellite-retrieved plume heights at 0300 UTC 17 June were found to be between 6.5 and 7.5 km, whereas plume heights of between 7.5 and 8.5 km were obtained by Prata and Grant (2001) using satellite data from the AVHRR-2 instrument. To account for this discrepancy, plume heights of 6.5 km (PH1), 7.5 km (PH2) and 8.5 km (PH3) were used in this study.
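The matching of a cloud-top temperature to a radiosonde profile can be sketched as a linear interpolation between sounding levels. The profile values below are hypothetical, the function name is illustrative, and the approach assumes an opaque cloud top in thermal equilibrium with its surroundings; it is not claimed to reproduce the exact procedure used in the study.

```python
def height_for_temperature(profile, target_temp_c):
    """Return the lowest height (km) where the profile equals target_temp_c.

    profile is a list of (height_km, temperature_c) pairs ordered by
    increasing height. Returns None if the temperature is never reached.
    """
    for (z0, t0), (z1, t1) in zip(profile, profile[1:]):
        low, high = min(t0, t1), max(t0, t1)
        if t0 != t1 and low <= target_temp_c <= high:
            fraction = (target_temp_c - t0) / (t1 - t0)
            return z0 + fraction * (z1 - z0)
    return None


# Hypothetical sounding (height in km, temperature in deg C), illustration only.
sounding = [(0.0, 12.0), (3.0, -4.0), (6.0, -24.0), (9.0, -46.0)]
cloud_top_temp_c = -35.0  # hypothetical band 4 brightness temperature

print(height_for_temperature(sounding, cloud_top_temp_c))
```

If the profile contains an inversion, the same temperature can occur at more than one height; this sketch returns only the lowest match.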

4.6 Plume ratio

In this study, the mass inside the plume was considered to be distributed uniformly along the eruption column where the top and bottom of the particle source area are defined by the plume ratio (β) to specify the depth (∆Z) of the eruption column. If Z is the height from the summit of the volcano to the observed maximum plume height, then ∆Z equals βZ (see Fig. 7). A relatively weak plume can be significantly affected by the surrounding winds, and if the winds are sufficiently strong, the eruption column may assume a bent-over shape (Bursik 2001). Devenish et al. (2012) noted that ∆Z can be estimated to be equal to two-thirds of Z based on the May 2010 Eyjafjallajökull ash dispersion modelling. For the purpose of this study, plume ratios of 1/3 (PR1), 2/3 (PR2) and 1 (PR3) were considered for investigating the impact on ash cloud modelling.

Fig. 7 Model of a bent-over volcanic plume under windy conditions. Z is the height from the volcano’s summit to the maximum plume height; ∆Z is the depth of the eruption column equalling βZ (β is plume ratio)
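The relation ∆Z = βZ can be turned into the bottom and top of the particle release layer as in the sketch below. It assumes that the plume height is given above sea level and uses a summit elevation of about 2797 m for Mount Ruapehu; both are assumptions for illustration, since the text does not state the height reference or the elevation used in the model.

```python
def release_layer(plume_top_m, summit_m, beta):
    """Return (bottom_m, top_m) of the release layer for plume ratio beta.

    Z is the height from the summit to the plume top and the layer depth
    is beta * Z, measured downwards from the plume top.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("plume ratio must lie in (0, 1]")
    z = plume_top_m - summit_m
    if z <= 0:
        raise ValueError("plume top must be above the summit")
    return plume_top_m - beta * z, plume_top_m


SUMMIT_M = 2797.0  # assumed summit elevation of Mount Ruapehu (m a.s.l.)

for name, beta in [("PR1", 1 / 3), ("PR2", 2 / 3), ("PR3", 1.0)]:
    bottom, top = release_layer(7500.0, SUMMIT_M, beta)
    print(f"{name}: release between {bottom:.0f} m and {top:.0f} m")
```

With a plume ratio of 1, particles are released over the whole column from the summit to the plume top, so they sample the winds at all levels; smaller ratios confine the release to the upper part of the column.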

5 Satellite retrievals

GOES operates at the geosynchronous altitude of 35,800 km above the Earth’s surface. GOES has five spectral channels (see Table 5), and its scan mirror needs to scan only 10° from the subsatellite point to view the entire hemisphere (Kidder and VonderHaar 1995).

Table 5 GOES-9 band specifications

All satellite data were projected in “Lambert” coordinates and gridded into Cartesian coordinates for the areas of interest. The data were then interpolated using the nearest neighbour method to match the spatial resolution of the FLEXPART simulations. To coincide with satellite data time, FLEXPART produced 30-min averaged outputs at the nearest moment to the satellite time shown in Table 6.

Table 6 GOES-9 observational times after the initial eruption at 1900 UTC 16 June 1996

The difference between the band 4 and band 5 brightness temperatures was used to distinguish volcanic ash clouds from meteorological clouds; volcanic ash clouds display negative band 4 minus band 5 brightness temperature differences, while meteorological clouds have positive differences (Prata 1989). Ideally, the threshold for the brightness temperature difference should be zero; however, due to calibration uncertainties, the threshold can be chosen in the range of −0.5 to +0.5 °C (Prata and Grant 2001). In this study, the threshold for the brightness temperature difference between GOES bands 4 and 5 was chosen to be −0.3 °C. The times of the available satellite data used in this study are shown in Table 6. Following the method of Prata (1989), standard calibration techniques were used to convert the raw sensor counts from GOES-9 to radiance values, which were then converted to brightness temperatures by inverting the Planck function (Planck 1914).
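The thresholding step can be sketched as follows: a pixel is flagged as ash when the band 4 minus band 5 brightness temperature difference falls below the chosen threshold of −0.3. The small grids below are hypothetical brightness temperatures, and the function name `ash_mask` is an illustrative choice.

```python
def ash_mask(bt4, bt5, threshold=-0.3):
    """Flag pixels where (band 4 - band 5) is below the threshold.

    bt4 and bt5 are equally shaped 2-D lists of brightness temperatures
    in the same units (a difference is the same in deg C and K).
    """
    return [
        [(t4 - t5) < threshold for t4, t5 in zip(row4, row5)]
        for row4, row5 in zip(bt4, bt5)
    ]


# Hypothetical 2 x 3 brightness temperature grids (K), illustration only.
band4 = [[250.0, 260.0, 255.0], [270.0, 245.0, 262.0]]
band5 = [[251.0, 259.0, 255.2], [268.5, 246.5, 262.0]]

mask = ash_mask(band4, band5)
print(mask)
```

In this example the third pixel of the first row has a difference of about −0.2, which is negative but does not pass the −0.3 threshold, so it is not flagged as ash.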

6 Comparisons of satellite retrievals with FLEXPART simulations

The probability of detection (POD), false alarm ratio (FAR) and Critical Success Index (CSI) were used to compare the FLEXPART simulations with the satellite retrievals. The CSI, also known as the threat score, was applied to investigate the match between the simulated and observed ash clouds retrieved from satellite data in the area of interest (Stunder et al. 2007; Webley et al. 2009). The primary tool for calculating the POD, FAR and CSI is the 2 × 2 contingency table (Table 7) proposed by Schaefer (1990). It is notable that, in this study, POD, FAR and CSI are time-dependent functions.

Table 7 Four-cell contingency table for comparing FLEXPART simulations with satellite retrievals

POD is the ratio of the number of pixels at which ash is predicted to occur and is observed in the satellite image to the total number of pixels at which ash is detected in the satellite image. FAR is the ratio of the number of pixels at which ash is predicted to occur while no ash is observed in the satellite image to the total number of pixels at which ash is predicted to occur.

$${\text{POD}} = A/\left( {A + B} \right)$$
(1)
$${\text{FAR}} = C/\left( {A + C} \right)$$
(2)

CSI is defined by Eq. (3). The term (B/A) is interpreted as a penalty when the model misses ash where it is observed, and the term (C/A) is interpreted as a penalty for predicting ash where there is none observed.

$$1/{\text{CSI}} = 1 + \left( {B/A} \right) + \left( {C/A} \right)$$
(3)

For a perfect match, FAR is equal to zero, while both the POD and CSI equal one.
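The three scores can be computed from a pair of ash masks as in the sketch below. Following the definitions of POD and FAR above, A counts pixels where ash is both predicted and observed, B pixels where ash is observed but not predicted, and C pixels where ash is predicted but not observed; D (neither) is counted only for completeness. CSI is evaluated in the usual threat-score form A/(A + B + C) (Schaefer 1990). The masks are hypothetical and the function names are illustrative.

```python
def contingency(model_mask, satellite_mask):
    """Count hits (A), misses (B), false alarms (C) and correct negatives (D)."""
    a = b = c = d = 0
    for model_row, sat_row in zip(model_mask, satellite_mask):
        for predicted, observed in zip(model_row, sat_row):
            if predicted and observed:
                a += 1
            elif observed:
                b += 1
            elif predicted:
                c += 1
            else:
                d += 1
    return a, b, c, d


def verification_scores(a, b, c):
    """Return (POD, FAR, CSI); a score is NaN when its denominator is zero."""
    nan = float("nan")
    pod = a / (a + b) if (a + b) else nan
    far = c / (a + c) if (a + c) else nan
    csi = a / (a + b + c) if (a + b + c) else nan
    return pod, far, csi


# Hypothetical 2 x 3 masks (True = ash), illustration only.
model = [[True, True, False], [False, False, False]]
satellite = [[True, False, True], [False, False, False]]

a, b, c, d = contingency(model, satellite)
pod, far, csi = verification_scores(a, b, c)
print(f"A={a} B={b} C={c} D={d} POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}")
```

Because the scores are time dependent in this study, the same calculation would be repeated for each satellite observation time.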

7 Results and discussion

7.1 Sensitivity of the model to eruption source parameters

Figure 8 provides the results of POD, FAR and CSI for the sensitivity studies. Overall, the POD and CSI scores were both comparably high from the initial eruption at 1900 UTC 16 June up to and including 0600 UTC 17 June; the FLEXPART model performed well within this period, with relatively small FAR scores. However, the errors in the FLEXPART model increased over time (i.e. smaller POD and CSI scores and larger FAR scores compared with previous time periods), with the FLEXPART simulations increasingly predicting ash in areas where none was detected in the satellite data.

Fig. 8 POD (left panel), FAR (middle panel) and CSI (right panel) scores for sensitivity studies as a function of plume height (a–c), plume ratio (d–f) and particle size distribution (g–i), respectively, of the 27 individual model runs (Table 4)

The sensitivity study results indicate that, for all three plume heights (6.5, 7.5 and 8.5 km) and all three particle size distributions (PSD1, PSD2 and PSD3), runs with a plume ratio of 1 showed higher POD and CSI scores than runs with the other two plume ratios of 1/3 and 2/3. However, the plume ratio of 1 produced more positive errors than the ratio of 2/3, based on the FAR scores (i.e. the model showed greater ash cloud areas than the satellite retrievals). In general, the model results did not differ significantly between the various plume heights, but the plume height of 7.5 km gave more model runs with comparably large POD and CSI scores than the other two heights across the plume ratios and particle size distributions sampled. The three particle size distributions (PSD1, PSD2 and PSD3) showed only slight differences in POD, FAR and CSI across the various plume heights and plume ratios, indicating that particle size distributions with sizes no larger than 62.5 μm have little effect on the predicted extent of the ash clouds.

The ash cloud detected using the method illustrated in Sect. 5 and the simulated ash clouds at 0300 UTC 17 June with different plume ratios (i.e. model runs 10, 11 and 12) are shown in Fig. 9. Comparing our results qualitatively with the satellite retrievals, the plume ratio of 1 produced the most accurate ash cloud prediction but also produced the most false alarms. Due to the differences in wind directions between high and low levels in the atmosphere shown in Fig. 5c, d, the plume ratio of 1/3 produced the least accurate ash cloud simulation. This illustrates the effect that variations in the plume ratios can have on the modelled ash clouds, due to different winds at different heights.

Fig. 9 Brightness temperatures of band 4 (a) and brightness temperature (°C) differences of band 4–5 (b) detected at 0300 UTC 17 June 1996; FLEXPART modelled ash clouds (mg m−3) of model runs 10 (c), 11 (d) and 12 (e) with varying plume ratios shown in Table 4 at 0300 UTC 17 June 1996

7.2 Limitations and future work

This study has demonstrated the sensitivity of the model to the plume ratio. This suggests that in the event of an eruption, operational ash dispersion modellers should focus on obtaining accurate data on the actual plume ratio as well as modelling a range of plume ratio scenarios. Further work is required to determine whether this is limited to specific horizontal and vertical characteristics of the atmosphere. This study also demonstrates the effective use of statistical tests such as POD, FAR and CSI to quantitatively evaluate model performance. These tools may also be used effectively in the operational forecasting of volcanic ash clouds to identify when the input parameters are incorrect and guide the selection of more appropriate input parameters. Further work is needed to examine the stability of the statistical method.

Furthermore, during an eruption, satellite retrievals can help to provide and constrain some of the input parameters for ash cloud modelling. However, quantitative analysis from satellite data has a number of limitations associated with it. Firstly, for particles larger than 16 μm, the discrimination of water droplets from ash particles is difficult. Moreover, if the ash cloud is optically thick, the detection sensitivity can be very poor (Prata 1989; Wen and Rose 1994; Prata and Grant 2001). Therefore, quantitative analysis is only sensitive to ash particles of around 2–16 μm in size, which may account for some of the deviations between the satellite retrievals and the FLEXPART modelling results. Although satellite retrievals can help to provide some input parameters for ash cloud modelling, e.g. plume height, some uncertainties cannot be avoided in the retrieval process. Further improvements in ash cloud modelling could be achieved by using radar to detect the plume height (e.g. Devenish et al. 2012; Turner et al. 2014). However, it is notable that radar-derived plume heights also have a degree of uncertainty associated with them; the radar must be placed quite close to the volcano to get measurements that are free of any significant error (Arason et al. 2011); it is the distance and scanning strategy (i.e. number of scanning angles and their separation) that are important.

This study only considered linear distributions of mass inside the eruption column; future ash cloud modelling studies could consider other distributions (e.g. Poisson). Moreover, future sensitivity studies could consider other eruption source parameters, such as the mass eruption rate and the eruption duration. It is notable that this study only modelled fine ash for the transport of ash clouds in the atmosphere; hence, the particle size distributions used are not consistent with the total particle size distribution of the actual eruption event. Further tests using the eruption source parameters from this study on other eruption types, as well as on eruptions from other volcanoes, need to be carried out to see whether the findings from this eruption event are generally applicable to ash cloud modelling of other eruptions.

Relatively few evaluations of the NCEP CFSR data set have been conducted. As such, its performance is not well known. Therefore, the use of NCEP CFSR data for initiating the WRF model in this study may have limited the accuracy of the modelled wind fields. In addition, as with clouds and precipitation, winds are discontinuous and are usually formed by complex nonlinear processes and thus are very difficult to model well. Moreover, since uncertainties exist in the lower boundary conditions, such as the unavoidable errors in the estimates of topography, the forecasting of surface winds always includes relatively large biases compared with the associated observations.

WRF does not retain large-scale information very well as the lead time from model initialisation increases. This means that the model cannot run in a full cycle mode (i.e. the background is provided by a short-term WRF forecast from a previous cycle) and has to be re-initialised every 24 h or less (in our study, the model was only capable of providing relatively accurate forecasts for time periods of up to 11 h). One solution might be the newly developed scheme called “blending” (Wang et al. 2014), which uses a high-pass filter to incorporate large-scale analysis into a model at every cycle; results showed that this method can effectively avoid the collapse of a model after several hours. However, due to the lack of regional analysis data in New Zealand, this method has not been applied locally in either operational or research contexts.

Meteorological conditions during an eruption event, specifically wind fields, strongly influence the modelled ash clouds. Further work is required to determine whether the coupled model performs well under different meteorological conditions. However, the results indicate that the combined model performance is strongly dependent on the effectiveness of the meteorological model used. More tests need to be performed to see whether other meteorological data used to initialise the WRF model can help to improve the accuracy of the modelled winds, thus improving the quality of the modelled ash clouds.

8 Conclusions

Statistical analysis of the coupled FLEXPART-WRF system using POD, FAR and CSI has demonstrated reasonable agreement, over a limited modelling period, between the simulated ash cloud location and that from satellite retrievals for the 16–17 June 1996 Mount Ruapehu eruption event. The rich observational data set enabled detailed sensitivity experiments on eruption source parameters (plume height, plume ratio and particle size distribution) to be carried out using the FLEXPART-WRF coupled system.

The key findings of the sensitivity study for this eruption event are as follows:

  1. The particle size distribution has been shown to have little effect on ash cloud modelling, since particles larger than 62.5 μm were not modelled, and only the full cloud extent was investigated [similar results were found by Webley et al. (2009)];

  2. The plume height has only a small impact on the simulated ash cloud due to the homogeneity of wind directions at the three heights at which particles were released;

  3. The plume ratio has the greatest impact on ash cloud modelling due to the wind shear (specifically the changes of wind speed) between high and low levels (see Fig. 10).

Fig. 10 Skew-T soundings of temperature, dew point temperature and winds based on NCEP CFSR data for a point near Ruapehu volcano at 0300 UTC 16 June 1996

In summary, although the default set of eruption source parameters from Mastin et al. (2009) for ash dispersal modelling can be considered a good starting point, detailed eruption source parameters should be examined carefully for each modelled volcanic eruption, since these parameters [e.g. the plume ratio analysed in this study but not included by Mastin et al. (2009)] have the potential to make a significant impact on the modelled ash clouds. The results demonstrate that coupling WRF and FLEXPART provides an effective tool for modelling volcanic ash clouds in the short term (up to 11 h). After this point, errors in the meteorological model limited the performance. In geographical areas where model output from WRF is more accurate over longer time frames, it is likely that the combined modelling approach could result in better performance over longer timescales. Further work is required to determine the general sensitivity of the FLEXPART-WRF system to a range of different meteorological and geological settings.