Introduction

The International Monitoring System (IMS) is part of the verification regime of the Comprehensive Nuclear-Test-Ban Treaty Organization [1] and is designed to detect nuclear explosions wherever they occur on Earth. The IMS includes seismic, hydroacoustic, infrasound and radionuclide monitoring techniques at locations around the world. When complete, 80 of the IMS stations will have aerosol measurement systems sensitive enough to detect releases from nuclear explosions at great distances, and 40 of them will also have xenon measurement systems that detect four radioactive xenon isotopes (131mXe, 133Xe, 133mXe, and 135Xe). These isotopes are produced in nuclear explosions, nuclear power plants and medical isotope production facilities [2, 3].

Natural events, such as earthquakes, may trigger some of the IMS monitoring equipment, as may anthropogenic activities such as releases of radionuclides to the air from nuclear reactors or medical isotope production facilities. Analysts must evaluate the data and decide whether a detection could have been caused by a nuclear explosion. Fusing data from several technologies can help build confidence in a source-attribution conclusion.

Source attribution based on atmospheric transport modeling is an emerging science, and in some situations the estimated release time, location and magnitude may differ significantly from the true values, especially if the analysis must be performed with a small number of samples. The current work considers the performance of a relatively recent [4, 5] application of a Bayesian technique to the source attribution problem for radionuclides detected in air samples. Bayesian estimation has the potential to improve source attribution estimates over simplistic approaches, and the probabilistic treatment provides confidence statements about the results that are not attainable with deterministic methods.

Theory

Consider a vector of parameters, θ, that describes the properties of a time-transient point source of a contaminant released to the atmosphere: \( \theta = (x_{\text{s}}, y_{\text{s}}, z_{\text{s}}, q_{\text{s}}, t_{\text{on}}, t_{\text{off}}) \). Here, \( (x_{\text{s}}, y_{\text{s}}, z_{\text{s}}) \) is the spatial location of the source (longitude, latitude, elevation), \( q_{\text{s}} \) is the strength of the source (mass/time) and \( (t_{\text{on}}, t_{\text{off}}) \) are the start and stop times of the release. A number of techniques can be used to choose the best θ given one or more sample values at one or more detectors [4–8].
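For concreteness, the parameter vector can be represented as a simple data structure. The Python sketch below is purely illustrative; the class and field names are hypothetical and simply mirror the notation above.

```python
from dataclasses import dataclass

@dataclass
class SourceParameters:
    """Hypothetical container for the source parameter vector theta."""
    lon: float    # x_s, source longitude (degrees)
    lat: float    # y_s, source latitude (degrees)
    elev: float   # z_s, source elevation (m)
    q: float      # q_s, source strength (mass or activity per unit time)
    t_on: float   # release start time (h relative to a reference epoch)
    t_off: float  # release stop time (h relative to a reference epoch)
```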

Assuming that each component of θ is a random variable, Bayes' theorem describes the relationship between the following conditional probability density functions (PDFs):

$$ P(\theta \mid D, I) = \frac{P(\theta \mid I)\, P(D \mid \theta, I)}{P(D \mid I)} $$
(1)

In Eq. 1, \( P(\theta \mid I) \) is the prior distribution on the parameter vector, \( P(\theta \mid D, I) \) is the posterior distribution on the parameter vector, \( P(D \mid \theta, I) \) is the likelihood function on the data, \( P(D \mid I) \) is the evidence (a data-based normalizing constant), D is a set of measured data values and I denotes external information applicable to the values of the parameter vector.

A probabilistic formulation provides a framework to account for uncertainties in observed and modeled concentration data. Although many different source configurations may be possible, some will be more probable than others given the sampling data. The solution to the source determination problem is the posterior density function \( P(\theta \mid D, I) \), which represents the probability that the components of the source parameter vector θ take on specific values.

As formulated, the parameter vector θ uses six dimensions to describe a single release. The vector can easily be extended to describe multiple releases. Although multiple-release scenarios are not examined here, a Bayesian formulation could be used to evaluate whether more than one release contributed to the sample values [9, 10].

To answer a practical question such as what the distribution of release strength is, one must compute the marginal probability density function for \( q_{\text{s}} \). Depending on the nature of the external information brought to bear, this may be fairly simple or quite complex, requiring additional techniques such as Markov chain Monte Carlo (MCMC) methods [11, 12]. The results in this paper are based on an MCMC solution to a parametric Bayesian formulation.
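As an illustration of how such a solution can be organized, the sketch below implements a generic random-walk Metropolis sampler in Python and extracts the marginal distribution of \( q_{\text{s}} \) by simply collecting that component of the retained samples. The log-posterior function, starting point and proposal scales are placeholders, not the configuration used in this work.

```python
import numpy as np

def metropolis_sampler(log_post, theta0, prop_scale, n_steps, rng):
    """Random-walk Metropolis sampler over the source parameter vector.

    log_post   -- function returning the log of the unnormalized posterior
    theta0     -- initial vector (x_s, y_s, z_s, q_s, t_on, t_off)
    prop_scale -- per-component proposal standard deviations
    """
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.normal(scale=prop_scale)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

# Usage sketch: the marginal posterior of the release strength q_s
# (column 3) is obtained by histogramming that column of the chain.
# rng = np.random.default_rng(0)
# chain = metropolis_sampler(log_post, theta0, prop_scale, 50_000, rng)
# q_s_samples = chain[:, 3]
```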

The likelihood function quantifies the discrepancy between the measured concentrations D and a corresponding set of modeled concentrations, R. Let \( R_i \) denote the concentration value that measurement i would yield if the source were characterized correctly by the parameters in θ. In practice \( R_i \) is calculated using a mathematical model of atmospheric dispersion; the Hysplit atmospheric transport model [13, 14] is used for the calculations reported in this paper.

One way to formulate the discrepancy between the measured and modeled concentrations considers two sources of error: measurement error and transport model error [4]. Under the assumption that the errors are normally distributed, one obtains the following likelihood function:

$$ P(D \mid \theta, I) \propto \exp\left[ -\frac{1}{2} \sum_{i} \frac{\left(D_i - R_i(\theta)\right)^2}{\sigma_{D,i}^2 + \sigma_{M,i}^2} \right] $$
(2)

Depending on the approach used to evaluate \( P(\theta \mid D, I) \), the proportionality constant in Eq. 2 and the term \( P(D \mid I) \) in Eq. 1 can be subsumed into a normalization step without explicit evaluation.
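A minimal Python sketch of the log of Eq. 2 is shown below. The transport model is a placeholder standing in for a Hysplit run, and the model discrepancy standard deviation is expressed as a fraction of the modeled value (a choice discussed under "Model discrepancy variance" below).

```python
import numpy as np

def log_likelihood(theta, D, sigma_D, transport_model, f_model=0.35):
    """Log of Eq. 2 up to an additive constant.

    D               -- measured concentrations D_i
    sigma_D         -- measurement standard deviations sigma_{D,i}
    transport_model -- placeholder returning modeled concentrations R_i(theta)
    f_model         -- model discrepancy standard deviation as a fraction
                       of the modeled value (0.35 is discussed below)
    """
    R = transport_model(theta)
    sigma_M = f_model * R
    return -0.5 * np.sum((D - R) ** 2 / (sigma_D ** 2 + sigma_M ** 2))
```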

Experimental

Performance of the Bayesian technique is evaluated with two data sets. The first consists of synthetic sampling data for twelve different releases. The second consists of data associated with a nuclear explosion [15].

Synthetic sampling data

Synthetic 133Xe sampling data at ten locations were generated assuming a hypothetical medical isotope production facility was located in the central portion of the state of Missouri, USA. A release of \( 6.45 \times 10^{12} \) Bq of 133Xe was assumed to occur in a 3 h period starting at 0000 UTC on the first day of each month in 2013. This release is the average daily release for the five largest 99mTc producers [16]. The ten sampling locations were in the states of California, Colorado, North Dakota, Kansas, Texas, Illinois, Arkansas, Florida, Virginia, and Maine. The samplers ranged from 600 to 2500 km from the release point.

The Hysplit code [17] and archived meteorological data [18] were used to calculate synthetic sample values for three different models of samplers. The first sampler [19] is a new system that uses a 6 h sample collection period and can detect concentrations of about 0.05 mBq/m³. The second sampler, named SAUNA [20], uses a 12 h collection period and can detect concentrations of about 0.25 mBq/m³. The third sampler, named SPALAX [21], has a 24 h collection period and can detect concentrations of about 0.1 mBq/m³.

Model discrepancy variance

Analysis of the data collected by real samplers provides the data variance term in the denominator of Eq. 2. However, no equivalent method is available to estimate the model variance term. The standard deviation of the data is often approximately expressed as a fraction of the data value; for this analysis, it is likewise assumed that the standard deviation of the model discrepancy is a fraction of the modeled value. Unfortunately, the model variance term has a large effect on the spread of the posterior distributions for location, time and magnitude. Because the release information is known for the synthetic data, candidate fractions could be compared, and a model discrepancy standard deviation of about 0.35 times the modeled value provided better posterior expected location and magnitude estimates than other values. Therefore, the results in this paper use a value of 0.35 for all calculations. Additional work is needed to develop a strong basis for selecting the model discrepancy variance. The mode of the distribution of release magnitude is relatively insensitive to the model discrepancy term and is a better candidate for a point estimate of magnitude than the expected value of magnitude.

Possible source regions

The simplest description of a possible source region is the set of locations where a release at some time in a defined period (such as 5 days) before a sample was taken could have yielded the sample value. Possible source regions are illustrated in Fig. 1 for one of the synthetic data sets. The left pane shows the region where a release of \( 10^{17} \) Bq or less of 133Xe sometime in the 5 days before a 6 h duration sample was taken in Virginia could have resulted in the sample value. The right pane shows the combination of four possible source regions for four consecutive 6 h samples at the same location. The model domain is truncated to just larger than the continental USA borders, so the possible source regions are actually larger than shown here. Possible source regions are often continental in size, making release location estimates imprecise. It is tempting to use the intersection of multiple possible source regions to restrict the locations being considered for the release point. However, atmospheric transport models only approximate the movement of air, and when multiple samples are available from different locations the intersection of all the possible source regions may be the null set. A good estimation approach provides the best possible solution even when the data are discordant.

Fig. 1 Possible source region based on one 6-h sample value (left pane) and common possible source regions for four consecutive 6-h sample values (right pane)
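As an illustration of how a possible source region can be computed, the sketch below flags the grid cells from which a bounded release could explain a single sample value, given sensitivities (modeled concentration at the sampler per unit release) from backward transport runs. The array layout and the \( 10^{17} \) Bq bound mirror the description above; the function itself is an illustration, not the procedure used to generate Fig. 1.

```python
import numpy as np

def possible_source_region(sensitivity, sample_value, q_max=1e17):
    """Boolean map of grid cells that could explain one sample value.

    sensitivity  -- array (n_times, n_lat, n_lon): modeled concentration at
                    the sampler per unit release from each cell and release
                    time, obtained from backward atmospheric transport runs
    sample_value -- measured concentration for the sample
    q_max        -- largest release considered plausible (Bq)

    A cell is in the possible source region if a release of at most q_max
    at some time could yield the sample value, i.e. if the sensitivity
    exceeds sample_value / q_max at any candidate release time.
    """
    return np.any(sensitivity >= sample_value / q_max, axis=0)
```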

Results and discussion

Analysis using synthetic data

In general, the analysis cases for the synthetic data estimate the time, magnitude and location of the release. The Bayesian formulation still applies if some of the release parameters are known (they have a specified value with probability one). Four different cases were evaluated for each release: (1) only samples with concentrations above the detection limits are used, (2) all samples, including those below detection limits, are used, (3) samples with concentrations above the detection limits are used and the release time is constrained to a 6 h window centered at the release time, and (4) only samples with concentrations above the detection limits are used and the release location is specified.

Four metrics were used to evaluate estimator performance. The first metric was the distance (km) between the release point and the most likely (mode) release location. The second metric was based on the size (km²) of the region encompassing the central 90 % of the posterior distribution on location. The third metric was the offset in time (h) between the start of the release and the mode of the posterior time distribution. The fourth metric was the mode of the posterior distribution on release magnitude, expressed as a ratio to the true release magnitude.
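For concreteness, the sketch below computes versions of the first, third and fourth metrics from a chain of posterior samples. The haversine distance and the histogram-based mode are standard choices rather than the exact procedures used here, and the second metric would additionally require gridding the location samples to delineate the central 90 % region.

```python
import numpy as np

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance (km) between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

def mode_of(samples, bins=100):
    """Histogram-based mode of a 1-D set of posterior samples."""
    counts, edges = np.histogram(samples, bins=bins)
    i = np.argmax(counts)
    return 0.5 * (edges[i] + edges[i + 1])

def point_metrics(chain, true_lon, true_lat, true_t_on, true_q):
    """Metrics 1, 3 and 4 for a chain with columns (x_s, y_s, z_s, q_s, t_on, t_off).

    Marginal modes are used for the location; a joint 2-D mode would differ
    slightly, but the idea is the same.
    """
    loc_err_km = haversine_km(mode_of(chain[:, 0]), mode_of(chain[:, 1]),
                              true_lon, true_lat)
    time_offset_h = mode_of(chain[:, 4]) - true_t_on
    magnitude_ratio = mode_of(chain[:, 3]) / true_q
    return loc_err_km, time_offset_h, magnitude_ratio
```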

As expected, the number of samples with detectable concentrations differs with the type of sampling equipment. The number of samples with concentrations above detection limits by month and sample collection period are provided in Table 1. Even though 12 releases were modeled, the performance measures are based on 11 releases because only one 6 h sample for the January release had a detectable concentration.

Table 1 Number of synthetic samples with detectable concentrations by release month and sample collection length

Average values of the performance metrics for different analysis scenarios are provided in Table 2. In general, analyses that are not constrained in time or by a reasonable magnitude limit tend to place the mode of the release time distribution well before (often 1–3 days) the real release time. As a consequence, the estimated magnitude is too high and the estimated location is offset from the real release location.

Table 2 Bayesian estimator performance for different analysis scenarios using average values of the metrics

Cases using a 6 h collection time perform better than cases with longer collection times. This may be partly due to the general observation that cases with more samples tend to produce better estimates than cases with fewer samples. In addition, samples with a 6 h collection time are averaged over a smaller volume of air than samples with 12 h or 24 h collection times.

In general, including samples with concentrations below the detection limit significantly improves the estimation accuracy. Estimation of the release time improved dramatically, from being wrong by several days on average to being wrong by a few hours. This improvement depends on the sampling network being dense enough that the possible source regions for the samples with concentrations below detection limits overlap in part with the possible source regions for the samples with detections.
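One standard way to include non-detections, shown in the sketch below, is to treat each below-limit sample as a censored observation: its Gaussian term in Eq. 2 is replaced by the probability that the modeled concentration plus noise falls below the detection limit. This is a common formulation and not necessarily the exact one used in this work.

```python
import numpy as np
from scipy.stats import norm

def log_likelihood_censored(theta, D, det_limit, detected, sigma_D,
                            transport_model, f_model=0.35):
    """Eq. 2 extended with censored terms for samples below detection limits.

    detected  -- boolean mask: True where D_i is a measured value, False
                 where only 'below det_limit_i' is known
    """
    R = transport_model(theta)
    sigma = np.sqrt(sigma_D ** 2 + (f_model * R) ** 2)
    # Detected samples keep their Gaussian terms from Eq. 2.
    ll = -0.5 * np.sum((D[detected] - R[detected]) ** 2
                       / sigma[detected] ** 2)
    # Non-detections contribute log P(concentration < detection limit).
    ll += np.sum(norm.logcdf((det_limit[~detected] - R[~detected])
                             / sigma[~detected]))
    return ll
```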

Knowing the time of the release or the location of the release significantly improves estimation accuracy over unconstrained cases. In addition, cases where the main body of the plume moved over one or more samplers have much better estimation accuracy than cases where the majority of the plume bypasses the samplers. This may be partly due to the difficulty of accurately resolving low concentrations in the atmospheric transport model.

The sample cases with the smallest plausible source regions tend to have the most accurate estimates for location, magnitude and time. This observation suggests that it may be possible to develop ad hoc rules for the overall confidence that should be placed in a release estimate based on the size of the plausible source region.

Although not embodied in a performance metric, the resolution of the atmospheric transport model has a strong effect on the accuracy of the estimated release parameters. This analysis used the particle tracking mode of the Hysplit code. Initial estimation cases used a low number of particles and yielded poor results. The cases were rerun using \( 2 \times 10^{7} \) particles per run for generating the synthetic data and \( 2.5 \times 10^{6} \) particles for each analysis run. Exploratory cases indicated that using more particles would have improved the analysis run results, but the total number of CPU hours became prohibitive.

Analysis using real data

The Democratic People’s Republic of Korea (DPRK) conducted its third announced nuclear test at 02:58 UTC on February 12, 2013 at the Punggye-ri nuclear test site. This test was detected by seismic stations operated by the IMS. All of the IMS radionuclide stations in the local region were operating at the time, and all samples for the next few weeks were consistent with historical samples. Beginning on April 7, 2013, several samples with unusual combinations of 131mXe and 133Xe were detected at IMS stations in Takasaki, Japan and Ussuriysk, Russia. These data have been analyzed [15] and shown to be consistent with delayed releases from an underground nuclear explosion. All of the IMS samples used in this analysis came from SAUNA [20] samplers with a 12 h collection time.

The Bayesian estimation approach is applied to 133Xe data from IMS samplers in a two-step process under the assumption that all samples were caused by a short-duration ground-level release at a single location. First, the posterior distribution of location is examined for consistency with an assumed release at the Punggye-ri site. Second, the release location is specified [22] and the posterior distribution of release time and magnitude is calculated. The Hysplit code and meteorological data with 0.5° spatial and 3 h temporal resolution [23] are used in the analysis.

The posterior probability distribution of release location based on three 133Xe samples from Takasaki, Japan is shown in Fig. 2 for the period of 4 days before the first detection. Dark grey denotes the highest probability and light grey the lowest; contour steps are approximately 0.1. Atmospheric transport is a complex process, and in this period some near-surface air originating in central China could reach Japan as quickly as near-surface air originating in the DPRK. As a consequence, using only three sample values, there are two distinct regions where a release might have yielded the sample values measured in Japan.

Fig. 2 Posterior probability of release location using three samples collected in Takasaki, Japan. Darker grey regions have higher probability than light grey regions

Two other IMS samplers, in China and Mongolia, did not detect 131mXe or 133Xe at any point during this period. Similarly, a number of samples at IMS stations in western Russia and Japan did not detect these nuclides. When the samples without detections are added to the data set, the plausible release region in central China essentially disappears. This result is consistent with other analyses of these data [15] and illustrates the improvement in location estimation achieved by including samples in which the isotopes of interest were not detected.

The Bayesian posterior distribution for location suggests that a release at the Punggye-ri nuclear test site at some time could have resulted in the sampled values. A secondary analysis therefore fixed the location of the release event at the test site [22] and estimated the release time and magnitude. The posterior probability densities for release time and magnitude are shown in Fig. 3, along with three curves showing, as a function of release time, the release magnitude needed to produce each of the three sample values. The curves are based on backward (in time) atmospheric transport runs. As is often the case (because atmospheric models only approximate the movement of air), these three curves do not all coincide at any single time-magnitude pair. If they did coincide at exactly one point, one could interpret that time-magnitude pair as describing the event.

Fig. 3 Posterior probability of release time and magnitude for 133Xe using three samples from Takasaki, Japan

The Bayesian approach provides probability distributions on the components of the release rather than single estimates. The expected values or the most likely values (modes) of the posterior distributions are sometimes used as point estimates. Based on these three samples, the most likely values for this case are a release of about \( 2 \times 10^{13} \) Bq of 133Xe around 0600 UTC on April 7. The expected values are a release of about \( 1.2 \times 10^{14} \) Bq of 133Xe around 0400 UTC on April 7. Analysis of the synthetic data sets suggests that the mode values describe the event more accurately than the expected values.
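Both point estimates are easy to extract from a chain of posterior samples; the short fragment below, reusing the hypothetical chain and the mode_of helper from the earlier sketches, shows the two estimators side by side for the magnitude component.

```python
# chain[:, 3] holds posterior samples of the release magnitude q_s.
# For a right-skewed posterior the mean is pulled above the mode by
# the long high-magnitude tail, which is why the two estimates differ.
expected_q = np.mean(chain[:, 3])      # posterior expected value
most_likely_q = mode_of(chain[:, 3])   # histogram-based mode
```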

The authors of the previous analysis [15] estimate a release of \( 7 \times 10^{11} \) Bq of 131mXe but do not provide a release estimate for 133Xe. They also argue for two distinct release episodes, a situation not explored in this analysis. The cumulative yield of 133Xe from a nuclear explosion is about 100 times that of 131mXe [24]; scaling their 131mXe estimate by this factor gives about \( 7 \times 10^{13} \) Bq of 133Xe, which lies between the mode and expected-value estimates obtained here. Thus, the present estimate for the release of 133Xe is consistent with the 131mXe release estimate obtained using different analysis techniques and a different atmospheric transport model.

Conclusions

The Bayesian estimation approach shows promise for general use in the source attribution problem. Results using synthetic data demonstrate that samplers with 6 h collection times perform better on a suite of performance metrics than samplers with 12 h or 24 h collection times. A modeling case using real 133Xe data associated with a nuclear explosion produces location, timing and release magnitude results consistent with detailed analyses by other authors.

The Bayesian approach is unique in that samples with concentrations below the detection limit of the equipment can be explicitly included in the analysis. The utility of including these additional samples is demonstrated for both the synthetic data and the real 133Xe data collected after a nuclear explosion. With 10 sampling locations in the continental United States, including the additional samples reduced the size of the plausible source region by an order of magnitude and greatly improved estimation of the time and magnitude of the release.

If the posterior probability distribution of the source location is diffuse in space, then there is low confidence that the real release location has been identified well enough for practical use. Release time and magnitude are tightly coupled with release location, so low confidence in any of these three items induces low confidence in the other items. Based on a dozen cases using synthetic data, the size of the estimated plausible source region can be used as a general indicator of overall confidence in the estimation results.