1 Introduction

Several phenomena created by nuclear explosions can be used to remotely monitor for their occurrence (Maceira et al., 2017). The International Monitoring System (IMS) (CTBTO PrepCom, 2019) is composed of seismic, hydroacoustic, infrasound, and radionuclide (RN) networks that monitor the earth for events that would violate the Comprehensive Nuclear-Test-Ban Treaty (1996).

Industrial activities, such as medical isotope production facilities, nuclear research reactors, and nuclear power reactors, also release radioactive materials to the atmosphere. Quantifying the impact of these anthropogenic backgrounds of 140Ba, 131I, and 133Xe on IMS radionuclide stations is a major goal of this work.

The historical design-basis studies in working paper WP.224 (IMS Expert Group, 1995) considered 50-, 75-, and 100-station networks for 140Ba and also a 100-station network for 133Xe, with a caveat that there was insufficient time to evaluate 50- and 75-station 133Xe networks. The CTBT calls for 80 RN stations, but one of the 80 was not assigned coordinates. Also, noble gas samplers are planned for 40 stations, although the possibility exists to install them at additional RN locations after the treaty goes into force. Therefore, the noble gas results provided here consider both 39- and 79-station networks, using the coordinates defined in the text of the Treaty.

The WP.224 aerosol detection design goal is > 90% probability of detecting a 1 kiloton TNT equivalent atmospheric nuclear test within 10 days (Schulze et al., 2000) using the isotope 140Ba. While nuclear fission of actinides creates many radioactive isotopes, 140Ba is considered a top signature because of its high fission yield, favorable decay scheme, and moderate half-life which facilitates detection at global scales.

1.1 Isotopes Released from Historical Nuclear Tests

Radioactive xenon is an important indicator used in detecting underground nuclear explosions, as noble gases are by far the most likely to leak through geologic or engineered containment. Dubasov (2010) recorded xenon detections for over 40% of underground nuclear explosive tests conducted at Semipalatinsk, for example. By comparison, many aerosol isotopes are generated in a nuclear explosion, but only a few have been observed at the surface from underground nuclear explosions.

Isotopes useful for monitoring should have half-lives long enough to allow the isotope to travel 1000 km or further downwind before it decays below detectable levels, but not so long that it does not register many decays in a measurement period. Likewise, the fission yield should be as high as possible to increase detection probabilities. As will be shown in Sec. 2.1, in the case of 133Xe, the maximum inventory is reached about 3 days after the fission occurs if 133I is held together with the xenon.

Both 133Xe and 135Xe were frequently observed leaking from U.S. underground nuclear explosive tests (Schoengold et al., 1996), owing to their relatively large fission yields and inert nature. However, 135Xe is much less likely to be detected far from the release location because of its relatively short half-life. The number of times selected isotopes were observed leaking from U.S. underground explosive tests (Schoengold et al., 1996) is tabulated in Table 1 along with their half-lives and fission yields. The IMS noble gas network frequently detects 133Xe but rarely detects other xenon isotopes despite their release by nuclear power plants and medical isotope production facilities.

Table 1 Frequency of detection by isotope for the 824 underground nuclear explosive tests conducted in Nevada (Schoengold et al., 1996)

Next-generation noble gas systems have recently been deployed (Ringbom et al., 2017) in the IMS or are nearing the completion of pre-deployment tests (Chernov et al., 2021; Hayes et al., 2013; TBE, 2020; Topin et al., 2020). These systems will have better detection thresholds for all xenon isotopes and may result in more detections from background sources.

Molybdenum, barium, and lanthanum are considered ‘refractory,’ i.e., non-volatile. However, barium and lanthanum isotopes have short-lived xenon precursors, 139Xe (T1/2 = 39.7 s) and 140Xe (T1/2 = 13.6 s). The 140Ba and 140La entries in Table 1 may therefore be observed with a greater frequency than 99Mo because of very prompt leaks of 139Xe and 140Xe rather than direct emission of the refractory particles.

Table 1 lists the frequency of detections at U.S. underground nuclear explosive tests, but this frequency does not account for the relative sensitivity of the measurement systems employed. Systems in the IMS today have noble gas sensitivities in the 0.2–0.5 millibecquerel per cubic meter (mBq/m3) range, while aerosol systems have sensitivities around 10 µBq/m3. This is because the aerosol minimum detectable concentration (MDC) formula (Miley et al., 2019) depends on the inverse square root of the sampled volume, which for an aerosol system is as much as a thousand times larger than for a xenon system, while still using less power than that needed to remove tens of cubic centimeters of xenon from tens of cubic meters of air. Thus, it is possible that the reported ratio of xenon to aerosol detections in Schoengold et al. (1996) is skewed toward aerosol detections by a factor of 20 or more. In any case, for underground test leakage, the three iodine isotopes 131I, 133I, and 135I were detected far more often than the aerosol isotopes 99Mo and 140Ba that are favored for a direct release of fission products to the atmosphere.
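The factor of 20 quoted above can be reproduced from the two sensitivity figures alone. The short sketch below is illustrative only; the constant and function names are hypothetical and the only inputs are the sensitivities stated in the text.

```python
# Sensitivities quoted in the text, expressed in a common unit (µBq/m^3).
XENON_MDC_RANGE_UBQ = (200.0, 500.0)  # 0.2-0.5 mBq/m^3 for noble gas systems
AEROSOL_MDC_UBQ = 10.0                # about 10 µBq/m^3 for aerosol systems


def sensitivity_ratio(xenon_mdc_ubq, aerosol_mdc_ubq=AEROSOL_MDC_UBQ):
    """Ratio of the xenon MDC to the aerosol MDC (dimensionless)."""
    return xenon_mdc_ubq / aerosol_mdc_ubq


ratio_low, ratio_high = (sensitivity_ratio(x) for x in XENON_MDC_RANGE_UBQ)
print(f"xenon/aerosol MDC ratio: {ratio_low:.0f} to {ratio_high:.0f}")
```

The lower end of the range corresponds to the "factor of 20" in the text; the upper end shows why "or more" applies.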

1.2 Types of Background Radionuclide Signals

Like many measurement systems, IMS radionuclide measurement systems must contend with background signals. These come in two varieties that are dealt with quite differently. First, there is natural radioactivity in Earth’s atmosphere. For aerosol systems, radon decay products such as 212Pb, 212Bi, and 208Tl collected on the sample provide a background of Compton scattered gamma ray signals across a wide spectral energy range, obscuring the gamma ray signals below 2615 keV. These background signals increase the MDC of 140Ba, 131I, and all other aerosol isotopes of interest, because the gamma ray signals from the isotopes of interest must significantly exceed fluctuations in the background signals to be registered. Because these decay products originate from radon upwelling in continents, their signals are stronger at interior continental locations than at coastal locations, which in turn are stronger than at island locations. The daily 212Pb concentrations at RN79 (Oahu, Hawaii, USA) and RN70 (Sacramento, California, USA) shown in Fig. 1 illustrate the large variation in 212Pb background concentrations. As seen in Fig. 1, there are also significant seasonal variations at many locations. The 140Ba MDC values for the network vary from 3.1 to 23.2 µBq/m3 as shown in Table 9 in the appendix, and corresponding MDC fluctuations would occur for all other isotopes. Aerosol removal due to rain and radon variation due to barometric pressure changes add a further element of variability to these signals. In xenon measurement systems, variations in radon can likewise impact the spectral results, but filtration, chemical separation, and energy spectrum analysis are employed to greatly reduce this effect.

Fig. 1

Daily 212Pb concentrations for 9 years at RN79 (Oahu, Hawaii, USA) (left panel) and RN70 (Sacramento, California, USA) (right panel). The same vertical scale is used on both graphs to emphasize the large variation of values at different locations

A second type of background occurs when activities unrelated to nuclear explosions release the same isotopes of interest. Nuclear power plants (Kalinowski & Tatlisu, 2020; Kalinowski & Tuma, 2009), medical isotope production facilities (Bowyer et al., 2013; Saey, 2009; Saey et al., 2010a, 2010b; Stocki et al., 2008; Wotawa et al., 2010), and nuclear research facilities (Hoffman & Berg, 2018) release the same xenon isotopes to the environment as a nuclear explosion. Each of these sources of fission products has an associated leakage rate and has been studied in relation to IMS signals (Achim et al., 2016; Gueibe et al., 2017; Schoeppner & Plastino, 2014). Figure 2 shows the monitoring locations in the IMS, the locations of 181 active nuclear power production facilities, and the locations of the 11 largest fission-based producers of medical isotopes. The facile assumption that nearby IMS stations suffer the most from anthropogenic backgrounds is generally true, but quantifying the impact on both nearby and distant stations is challenging and is a major goal of this work.

Fig. 2

Map of IMS radionuclide monitoring locations, nuclear power reactors, and medical isotope production facilities

2 Methods and Data

To study the impact of background radioactivity on a network of RN sensors, the network design, sensor sensitivity, and the intended source term must be considered. In this study, we take as a given the 79 locations for radionuclide monitoring in the IMS and the sensitivity of currently deployed IMS systems. These are evaluated against a wide range of source strengths chosen to explore the entire range of network response. Rather than relying only on a simple formula for detection sensitivity, the authors also employ a frequency-based approach for some backgrounds, such that an anomaly or action threshold is set at the 95th percentile of frequently seen backgrounds. In the future, as the natural backgrounds and many anthropogenic backgrounds become sufficiently well-known and, on average, predictable, studies could be done to predict the performance of different networks, different sensors, and different detection criteria.

2.1 Source Terms for Network Detection Analyses

The original IMS design document, WP.224 (IMS Expert Group, 1995), identified the magnitude of the aerosol source term as M = \(2\times 10^{15}\) Bq of 140Ba and presented the rationale that this activity corresponds to lofting 90% of the fission products from a 1 kiloton fission explosion in the atmosphere. As seen in Table 1, 131I is far more likely to be released from a nuclear explosion contained underground than other aerosol species. Although the network of aerosol samplers was originally conceived to detect radioactive releases from an atmospheric nuclear explosive test, this study will also consider the utility of 131I leakage from underground nuclear explosive tests. Like 140Ba, this isotope has a high fission yield (3%), a favorable decay scheme for gamma ray spectroscopy, and a useful half-life of 8.02 days. There have been a number of 131I detections in the IMS, presumably from the production and use of medical isotopes, so it is conceivable that backgrounds could hamper the use of the isotope for detecting underground nuclear explosive tests. Other fission and activation products (e.g., legacy 137Cs and cosmogenic 24Na) are frequently detected by the IMS, but because of the importance of iodine releases from U.S. underground nuclear explosive tests, the authors will use 131I to explore the impact of aerosol backgrounds.

The xenon source term in WP.224 is differentiated between a 133Xe source magnitude of \(10^{15}\) Bq for ‘evasive atmospheric tests’, where rain eliminates aerosols, but 90% of instantaneous xenon isotopes are lofted, \(10^{15}\) Bq of instantaneous release for underwater nuclear explosions, and \(10^{14}\) Bq released over 12 h, or 10% of the xenon, for a 1 kiloton underground nuclear explosion. The release timing is important, because the radioactive precursor to xenon is iodine, which is less likely to escape from underwater or underground explosions. The amount of 133Xe as a function of time past a nuclear explosion with a 1 kiloton yield for cumulative xenon and iodine isotopes and fractionated 133Xe is shown in Fig. 3. Isotopic inventories were generated using a combination of the MCNP6 code (Goorley et al., 2012) for estimation of neutron fluxes and the ORIGEN2.2 code (Croff, 1980) to calculate the resulting irradiation material balances from those fluxes. From the fractionated 133Xe curve (lowest curve in the plots), it is evident that leakage of 133Xe from a nuclear explosion during the earliest hours would be strongly suppressed due to the lack of 133I ingrowth during later containment. The average from the cumulative 133Xe curve during the earliest 12 h, as in WP.224, is \(6\times 10^{14}\) Bq before loss of containment. The WP.224 estimate implies a 17% leakage of 133Xe over this time.

Fig. 3

Comparison of some 235U fission products from a one kiloton equivalent nuclear explosion: the initial independent quantity of 133Xe, compared with the ingrowth and decay of 133I cumulating into 133Xe. Also, 131I is shown. Note that 133mXe, not shown, also decays into 133Xe. The log-time left panel illustrates the rapid growth of 133Xe due to 133I decay that is lost when the xenon is released (fractionated) at early time
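Two numbers in this discussion can be checked with a few lines of arithmetic: the roughly 3-day time to maximum 133Xe inventory when 133I is retained, and the 17% leakage implied by WP.224. The sketch below is a simplified two-member decay chain, not the MCNP6/ORIGEN2.2 calculation used for Fig. 3. It assumes half-lives of about 20.8 h for 133I and 5.24 d for 133Xe (values not given in the text), ignores the independent 133Xe yield and 133mXe, and uses hypothetical function names.

```python
import math

# Assumed half-lives (not stated in the text).
T_HALF_I133_H = 20.8          # 133I, hours
T_HALF_XE133_H = 5.24 * 24.0  # 133Xe, hours


def decay_constant(half_life_h):
    """Decay constant in 1/h for a half-life given in hours."""
    return math.log(2.0) / half_life_h


def time_of_peak_daughter(parent_half_life_h, daughter_half_life_h):
    """Time (h) at which the daughter inventory peaks in a parent->daughter
    chain that starts with pure parent (two-member Bateman solution)."""
    lam_p = decay_constant(parent_half_life_h)
    lam_d = decay_constant(daughter_half_life_h)
    return math.log(lam_p / lam_d) / (lam_p - lam_d)


t_peak_days = time_of_peak_daughter(T_HALF_I133_H, T_HALF_XE133_H) / 24.0
print(f"133Xe inventory peaks about {t_peak_days:.1f} days after fission")

# Leakage fraction implied by WP.224: 1e14 Bq released out of the
# 6e14 Bq average cumulative inventory over the first 12 h.
leak_fraction = 1.0e14 / 6.0e14
print(f"implied leakage: {100.0 * leak_fraction:.0f}%")
```

Under these assumptions the peak falls a little before 3 days, consistent with the statement in Sec. 1.1.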

To fully understand the range of responses of the 79-station network, it will be tested with a range of source strengths wide enough to determine where the network sensitivity begins, and where it maximizes. The source strengths used in this study run from \(10^{9}\) to \(10^{16}\) Bq for the three isotopes mentioned above, 140Ba, 131I, and 133Xe, such that this activity range corresponds to nuclear explosion yields in the atmosphere from 100 g to 1 kiloton.
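The correspondence between activity and yield stated above amounts to a linear scaling anchored at the top of the range. The following sketch assumes that simple proportionality (an assumption made here for illustration, with hypothetical names); it is not a fission-yield calculation.

```python
KILOTON_IN_GRAMS = 1.0e9     # 1 kiloton of TNT equivalent, in grams
ACTIVITY_AT_1KT_BQ = 1.0e16  # top of the activity range, paired with 1 kiloton in the text


def equivalent_yield_grams(activity_bq):
    """Yield (grams TNT equivalent), assuming activity scales linearly with yield."""
    return KILOTON_IN_GRAMS * activity_bq / ACTIVITY_AT_1KT_BQ


for exponent in range(9, 17):
    activity = 10.0 ** exponent
    print(f"1e{exponent} Bq -> {equivalent_yield_grams(activity):.3g} g")
```

With this scaling the lower end of the range maps to 100 g and each decade of activity corresponds to a decade of yield.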

2.2 Minimum Detectable Concentrations and the Influence of 212Pb

The original IMS design document (IMS Expert Group, 1995) calls for an MDC of 10 µBq/m3 for 140Ba, achieved in a 3-day sampling period including collection and measurement. The MDC for aerosols depends on several factors related to the decay of background sources and of the analyte of interest, including the efficiency of the detector and the branching fraction of the isotope in a region of interest. The MDC equation (Miley et al., 2019) includes the square root of the observed count of background signals that occur in the region of interest for an isotope. This can be thought of as related to the uncertainty or fluctuation of the background signals. Many such MDC formulations could be created, but most use a statistical approach similar to that described by Currie (1968), in which a 5% false positive rate and a 5% false negative rate are chosen. In general, the MDCs for different isotopes can differ greatly (a couple of orders of magnitude or more), and their responses to different background concentrations can differ as well.
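To make the role of the background count concrete, the sketch below uses the widely quoted Currie (1968) detection limit for paired observations at 5% false-positive and 5% false-negative rates, L_D = 2.71 + 4.65·sqrt(B) counts, and converts it to a concentration. This is not the Miley et al. (2019) equation: decay corrections are omitted, and the function names and all numerical inputs (efficiency, branching fraction, counting time, sampled volume, background counts) are hypothetical placeholders.

```python
import math


def currie_detection_limit_counts(background_counts):
    """Currie detection limit in counts (paired blank, 5%/5% error rates)."""
    return 2.71 + 4.65 * math.sqrt(background_counts)


def simple_mdc_bq_per_m3(background_counts, efficiency, branching_fraction,
                         count_time_s, volume_m3):
    """Simplified MDC in Bq/m^3; decay during sampling and counting is ignored."""
    detectable_counts = currie_detection_limit_counts(background_counts)
    return detectable_counts / (efficiency * branching_fraction
                                * count_time_s * volume_m3)


# Hypothetical inputs, chosen only to show the scaling with background.
common = dict(efficiency=0.05, branching_fraction=0.25,
              count_time_s=86400.0, volume_m3=20000.0)
mdc_low_bkg = simple_mdc_bq_per_m3(100.0, **common)
mdc_high_bkg = simple_mdc_bq_per_m3(400.0, **common)
print(f"MDC ratio for 4x background: {mdc_high_bkg / mdc_low_bkg:.2f}")
```

Quadrupling the background count raises the MDC by slightly less than a factor of two, which is the square-root behaviour described in the text.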

The concentration of 212Pb is usually thought to be the main driver of background signals in IMS aerosol systems and varies widely from day to day and location to location. The MDCs of many analytes are calculated for each sample, but most of these are strongly impacted by 212Pb. The IMS has gradually expanded since 2000 to include 72 of the 80 planned IMS stations. Sample measurement results include the MDC of 140Ba and 131I achieved in each measurement. A summary of the relevant historical MDC data for each station is provided in Appendix 1 for approximately 200,000 measured samples since the beginning of 2012, i.e., not including Fukushima-related measurements. The authors selected data from Reviewed Radionuclide Reports (RRR) that pass air flow-rate quality checks and acquisition time (counting) quality checks, and that have a 212Pb concentration of no more than 400,000 µBq/m3. Information on the 212Pb concentrations is included here because high 212Pb concentrations can reduce the sensitivity of a detector (Werzi, 2010).

Setting target MDC values is useful when determining the ability of a sampling network to monitor for specific isotopes. However, after the target level was set and sampling systems were developed, the collected data showed that no sampler design can achieve the target MDC for 140Ba of 10 µBq/m3 in areas where the background levels of 212Pb are extremely high. Thus, the definition of the target MDC level was revised to apply to the situation where no 212Pb is present in the samples. A computational technique was then created to extract the performance the station would give with no 212Pb. This approach uses groups of MDC values for an isotope (e.g., 131I or 140Ba) and depends on the assumption that background counts originate from 212Pb. The squared MDCs of the group of samples are fitted with a constant value, representing the radioactivity intrinsic to the system, and a term linear in the 212Pb concentration. The following equation uses the 212Pb concentrations, Pbi, and the MDC values, MDCi, from a group of samples (indexed by i):

$${MDC}_{i}^{2}=a+b\cdot {Pb}_{i}$$
(1)

where a and b are fitting constants obtained from a linear regression model.
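A minimal sketch of the regression in Eq. (1) is given below, using ordinary least squares on the squared MDC values. The function name and the synthetic, noise-free data are hypothetical and are not IMS measurements; the source does not specify which regression software was used.

```python
import math


def fit_mdc_squared(pb_values, mdc_values):
    """Ordinary least-squares fit of MDC_i^2 = a + b * Pb_i; returns (a, b)."""
    n = len(pb_values)
    y = [m * m for m in mdc_values]
    mean_x = sum(pb_values) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in pb_values)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(pb_values, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic demonstration data generated from a known (made-up) a and b.
pb_demo = [0.0, 1000.0, 5000.0, 20000.0, 80000.0]          # µBq/m^3
mdc_demo = [math.sqrt(50.0 + 0.03 * pb) for pb in pb_demo]  # µBq/m^3

a_fit, b_fit = fit_mdc_squared(pb_demo, mdc_demo)
mdc_zero_pb = math.sqrt(a_fit)  # MDC the system would achieve with no 212Pb
print(f"a = {a_fit:.3f}, b = {b_fit:.5f}, MDC at zero 212Pb = {mdc_zero_pb:.2f}")
```

The square root of the fitted intercept is the quantity compared against the revised target MDC.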

Radionuclide Aerosol Sampler/Analyzer (RASA) systems (Miley et al., 1998) are deployed at 20 IMS radionuclide sampling stations. For aerosol sampler/analyzer systems, the concentrations and MDC have units of µBq/m3. Using 3722 samples collected in 2019 at 11 RASA systems operated by the United States for the IMS, one obtains the following functional fits for 140Ba and 131I given the 212Pb concentration, Pb, in each sample:

$${MDC}_{Ba140}=\sqrt{72.32+0.02546\cdot Pb}$$
(1a)
$${MDC}_{I131}=\sqrt{8.86+0.03318\cdot Pb}$$
(1b)

When the 212Pb concentration is zero, MDCBa140 = 8.50 and MDCI131 = 2.98 µBq/m3 for these RASA systems.
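Equations (1a) and (1b) can be evaluated directly for any 212Pb concentration. The helper below is a small illustration with hypothetical names; the coefficients are those reported above for the 2019 RASA data, with all concentrations in µBq/m3.

```python
import math

# (a, b) coefficients from Eqs. (1a) and (1b).
RASA_FITS = {
    "Ba140": (72.32, 0.02546),
    "I131": (8.86, 0.03318),
}


def rasa_mdc(isotope, pb212_concentration):
    """MDC in µBq/m^3 for a given 212Pb concentration in µBq/m^3."""
    a, b = RASA_FITS[isotope]
    return math.sqrt(a + b * pb212_concentration)


for isotope in RASA_FITS:
    print(isotope, f"{rasa_mdc(isotope, 0.0):.2f} µBq/m^3 at zero 212Pb")
```

Passing a nonzero 212Pb concentration shows how quickly the MDC degrades at stations with high radon-progeny backgrounds.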

The square of the sample MDC is plotted against the sample 212Pb concentration in Fig. 4, showing that a linear fit is reasonable, and supports the assumption that 212Pb and its decay products dominate background contributions to the MDC. But because the fitted line does not go through the origin, one can be certain that 212Pb does not represent the entirety of the background signals.

Fig. 4

The square of measurement MDCs for 131I (left panel) and 140Ba (right panel) for 3722 samples collected at 11 different RASA samplers in 2019 for a range of 212Pb concentrations. The functional fits to adjust the MDC for varying levels of 212Pb concentrations are shown by the dotted grey lines

As shown in Fig. 1, the 212Pb concentrations vary from day to day and also show seasonal fluctuations; thus, the 140Ba and 131I MDCs also have daily and seasonal variations. An example of the range of MDCs is provided in Table 2 for 11 stations using RASA equipment. A station on Wake Island in the Pacific Ocean (RN77) has the smallest range. This makes sense, because the 220Rn that produces the 212Pb is released by the radioactive decay of uranium and thorium in surface rock, and a small mid-ocean island is far from continental sources. Of these 11 stations, the widest 212Pb concentration range is found at RN74, which is located at Ashland, Kansas, near the center of North America.

Table 2 Sample MDC statistics for the 140Ba MDC in µBq/m3 for 2012 through 2020 at 11 stations operated by the United States for the International Monitoring System

After adjustment for background 212Pb, the diverse group of IMS aerosol systems currently meet the target 10 µBq/m3 MDC level for 140Ba, although the lower 212Pb background levels in some locations result in better (lower) MDCs. Data on the historical sample MDC values for 140Ba and 131I is provided in Table 9 in the Appendix for all IMS stations.

Three types of noble gas samplers are currently deployed in the IMS. The SAUNA (Ringbom et al., 2003) has a 0.2 mBq/m3 MDC for 133Xe using 12-h samples. The SPALAX (Fontaine et al., 2004) has a 0.15 mBq/m3 MDC for 133Xe using 24-h samples. The ARIX (Dubasov et al., 2005) has a 0.5 mBq/m3 MDC for 133Xe using 12-h samples. Information provided in Appendix 1 matches the sampler type with different sampling locations.

2.3 Creating an Anomaly Level Using Historical Detections, and Applying to 131I

Above, the limitation that uninteresting radioactivity imposes on monitoring for atmospheric signatures was discussed for aerosol systems. The second perplexing source of monitoring interference is the appearance of the specific signatures of interest arising from uninteresting processes. An example of this can be seen in the 2019 133Xe data from one station in Fig. 13 in the Appendix. This system collects and measures two samples a day with an MDC of about 0.2 mBq/m3. The normalized integral of signals is also shown, such that by inspection it is clear that about 20%, or 150, of the reported signals are above the MDC of the system. If one assumes that there were no nuclear explosions in 2019, this distribution of background signals represents the noise above which a 133Xe signal would have to rise to garner interest.

The authors chose the 95th percentile of the distribution as the anomaly level. In other words, a signal greater than 95% of the background might trigger additional study. There are other reasons to trigger such a study, for example if any other explosion-related signals are seen. These might include other isotopes of xenon, aerosols, or even vibrations in the Earth, oceans, or atmosphere. In this instance, however, only the detection of one isotope is considered. The choice of 95% is not completely arbitrary, as the statistical approach of Currie (1968) chooses this level of statistical background fluctuation to set the MDC.

At this point, an action threshold can be constructed as the larger of the MDC and the 95th percentile. For stations that observe the analyte often, the action threshold would be the 95th percentile, and for others, any signal exceeding the MDC would garner monitoring interest. While Fig. 13 in the Appendix represented 133Xe at one station for one year, the situation is quite different for 131I. The frequency of historical 131I detections in the entire IMS in the years 2012 through 2020 is shown in Fig. 5 as a function of concentration. About half of the detections have concentrations above 3.52 µBq/m3, which is not too different from the MDC for that isotope. These 1234 detections occurred in 194,162 total samples.
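The action-threshold rule can be written compactly. The sketch below is one possible reading, with hypothetical function names: it assumes the percentile is taken over a station's historical background concentrations using linear interpolation between order statistics, a choice the text does not specify, and it falls back to the MDC when a station has no history.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def action_threshold(mdc, background_concentrations, q=95.0):
    """Larger of the MDC and the q-th percentile of historical backgrounds."""
    if not background_concentrations:
        return mdc
    return max(mdc, percentile(background_concentrations, q))


# Made-up examples: a quiet station and a station with frequent background.
quiet_station = action_threshold(3.0, [1.0, 2.0])
busy_station = action_threshold(3.0, [float(c) for c in range(1, 101)])
print(quiet_station, busy_station)
```

For the quiet station the MDC governs; for the busy station the historical 95th percentile does.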

Fig. 5

The frequency of 131I detections as a function of concentration. Data are global and start in 2012, which is long enough after the Fukushima event that related 131I will have decayed. The upper end of the horizontal axis (above 20 µBq/m3) is not to scale

Significant levels of 212Pb are present in many of the samples with 131I detections and cause substantial variation in the daily 131I MDC. Table 3 illustrates this with the 212Pb concentrations for the 9 samples with detected 131I collected at RN70 (Sacramento, California, USA), for which the 131I MDC varies from 5 to 9 µBq/m3. Over half of the samples contain reviewer-confirmed 131I concentrations below the calculated MDC for that sample. Only one sample is substantially above the MDC for the relevant day at the station.

Table 3 Influence of 212Pb concentrations on the MDCs for 140Ba and 131I for samples with detections of 131I at RN70 (Sacramento, California, USA)

There were one or more 131I detections at 45 IMS aerosol stations during 2012–2020. The average and 95th percentile of 131I concentrations at the IMS stations with 10 or more detections are shown in Table 4. Below this number, the 95th percentile becomes difficult to accurately estimate and is quite similar to the MDC. In Table 4, the 95th percentile ranges from about 13 times the MDC to about the same as the MDC. For half the entries in Table 4, the MDC is at most doubled by the 131I background. Thus, only 5 IMS aerosol stations have a particularly significant background effect for 131I. The number of 131I detections for all IMS stations is provided in Appendix 1.

Table 4 The number of 131I detections and the average and 95th percentile of the 131I concentrations of the detections for all IMS stations with 10 or more detections from January 2012 through February 2021

2.4 Atmospheric Transport Model

All atmospheric transport results in this paper used the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Stein et al., 2015), which incorporates both advective and diffusive processes. The transport runs were performed using the Linux version of HYSPLIT. Default model parameters (Draxler et al., 2020) were used. For example, vertical turbulence was simulated using the Clayson and Kantha (2008) model and horizontal turbulence is proportional to the vertical turbulence. The boundary layer stability was computed from heat and momentum fluxes read from the meteorological data files. The top of the atmospheric model domain was set to 10,000 m above ground level.

Wet and dry deposition mechanisms were deactivated for the xenon transport runs because xenon is a noble gas and its concentration in the air does not depend on rainout or deposition processes. Wet and dry deposition were included for both 140Ba and 131I using the assumption that the transport was in particulate form. Iodine can transport as organic, elemental, and particulate species, and can change forms during transport. In addition, the speciation depends on the local humidity (Fitzgerald, 1975; Winkler, 1973). Modeling the different iodine species is difficult, and while it is necessary when dealing with concentrations high enough to impact human health (Eslinger et al., 2014b), this work only used the particulate iodine form. This assumption may underestimate the detection probabilities for 131I.

The transport runs used to determine network performance used archived meteorological data for 2019 on a 1° spacing and 3-h time step (GDAS1, 2020). Each run modeled plume movement for 10 days and saved concentration data on a global 0.5° grid. One of the techniques developed to reduce the computational burden in a source-term analysis with far fewer samplers than possible release points is to use atmospheric transport runs done backwards in time (Hourdin & Talagrand, 2006; Hourdin et al., 2006; Rao, 2007; Seibert & Frank, 2004; Stohl et al., 2002). Although time-reversed runs do not exactly match forward-time runs (Eslinger & Schrom, 2019), these runs were performed in the reversed-time direction for computational convenience.

All releases from hypothetical nuclear explosions are assumed to be 3 h in length and the releases are transported through the atmosphere for 10 days after the release. This work models 8 releases per day for an entire year at a large number (258,839) of unequally spaced locations. The locations were selected on a global grid with constant 0.5° spacing in latitude and longitude. Thus, the detection statistics are based on \(7.56\times 10^{8}\) possible releases.
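The size of the release ensemble follows directly from the release length, the number of days in the modeled year, and the number of release locations. The short check below uses only those figures; the variable names are hypothetical.

```python
RELEASE_LENGTH_H = 3
RELEASES_PER_DAY = 24 // RELEASE_LENGTH_H  # back-to-back 3-h releases
DAYS_IN_2019 = 365
N_LOCATIONS = 258_839

total_releases = RELEASES_PER_DAY * DAYS_IN_2019 * N_LOCATIONS
print(f"{RELEASES_PER_DAY} releases/day, {total_releases:,} releases in total")
```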

2.5 Network Performance Model

As mentioned above, the detection design goal for the IMS radionuclide (aerosol) network performance was to achieve a 90% probability of detecting an atmospheric test of 1 kiloton TNT equivalent explosion within 10 days (IMS Expert Group, 1995; Werzi, 2009). In this work, a network detection metric is used to assess the probability of detection for a release at an unknown time and unknown location. As noted by Kalinowski (2001) and others, radionuclide samples and atmospheric transport modeling can be used for several purposes in addition to just detecting a release, such as locating the release point. Thus, we define three network metrics. The first metric is the probability a release is detected. The second and third metrics are the expected number of detecting samples and the expected number of detecting stations. In general, obtaining more samples with detections at more stations helps in determining the timing and location of the release event.

Consider the situation where there are Ns radioisotope samplers at different locations around the globe. In an abstract sense, each sampler has a probability of detecting a specific future release of a radioactive isotope. This detection probability depends on the location of the release, the magnitude of the release, the specific radioisotope, the sampling duration, the detection sensitivity of the equipment, and future atmospheric circulation patterns. One way to compare system performance for different network configurations is to define a detection metric for the entire global network. The same type of metric can be applied for smaller regions.

Suppose that the surface area of the globe is partitioned into a large number, Nr, of regions, each with surface area Ai, for i = 1,…, Nr. In general, the surface areas of the regions may not be equal. Then, let P(M,Rs) denote the probability that a radioisotope release of a given magnitude, M, occurring in a specific region Rs at a series of times tj, for j = 1, …, Nt, is detected by one or more samplers. This can be expressed in the following mathematical form:

$$P\left(M,{R}_{s}\right)=\frac{1}{{N}_{t}}{\sum }_{j=1}^{{N}_{t}}I\left(One\; or\; more\; samplers\; detected \;release \;j\; from {R}_{s}\right)$$
(2)

where I(·) is the indicator function, which takes the value 1 when true and 0 when false. The network detection metric, D(M), defined for a release of magnitude M, given a network of Nk sampling locations, and a surface area partitioned into Nr regions, is:

$$D(M)=\frac{{\sum }_{i=1}^{{N}_{r}}{A}_{i}\times P\left(M,{R}_{i}\right)}{\sum_{i=1}^{{N}_{r}}{A}_{i}}$$
(3)

Below, we use the notation \(D(15)_{39}\) to denote a release magnitude of \(10^{15}\) Bq and a network size of 39 stations. If the surface areas of the regions are all of the same size, such as used by Schoeppner and Plastino (2014), then Eq. 3 simplifies to an average value:

$$D(M)=\frac{1}{{N}_{r}}{\sum }_{i=1}^{{N}_{r}}P\left(M,{R}_{i}\right)$$
(4)

The network metric satisfies \(0\le D(M)\le 1\). A value of 0 indicates that a release of magnitude M will not be detected by any sampling station, no matter when or where the release occurs. A value of 1 indicates that a release of magnitude M will always be detected by at least one sampling station, no matter when or where the release occurs.
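Equations (2) to (4) translate into a few lines of code. The sketch below is illustrative rather than the authors' implementation: the function names are hypothetical, the detection flags are made up, and the relative area of a fixed-angular-size grid cell is assumed to be proportional to the cosine of its central latitude.

```python
import math


def region_probability(detected_flags):
    """Eq. (2): fraction of release times in a region detected by >= 1 sampler."""
    return sum(1 for flag in detected_flags if flag) / len(detected_flags)


def network_metric(areas, probabilities):
    """Eq. (3): area-weighted mean of the regional detection probabilities."""
    weighted = sum(a * p for a, p in zip(areas, probabilities))
    return weighted / sum(areas)


def relative_cell_area(latitude_deg):
    """Assumed relative area of a constant-angular-size cell at this latitude."""
    return math.cos(math.radians(latitude_deg))


# Made-up example: three regions, four release times each.
flags_by_region = [
    [True, True, True, True],      # always detected
    [True, False, True, False],    # detected half the time
    [False, False, False, False],  # never detected
]
latitudes = [0.0, 60.0, 60.0]

probs = [region_probability(flags) for flags in flags_by_region]
areas = [relative_cell_area(lat) for lat in latitudes]

d_weighted = network_metric(areas, probs)           # Eq. (3)
d_equal = network_metric([1.0] * len(probs), probs)  # Eq. (4), equal areas
print(f"D (area-weighted) = {d_weighted:.3f}, D (equal areas) = {d_equal:.3f}")
```

In this toy case the well-covered equatorial region carries more weight than the two high-latitude regions, so the area-weighted value exceeds the simple average.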

Determination of the release location from radionuclide samples becomes more important for small underground nuclear explosions than for large atmospheric nuclear explosive tests. Seismologists need a minimum of three sampling stations to pinpoint the epicenter of an earthquake. Similarly, samples at multiple locations help in a source location analysis using airborne radionuclides. The detection metric, as formulated here, declares a detection of a release event if the concentration in one or more samples at one or more locations exceeds the detection threshold denoted by the MDC in response to that release.

We define N(M), the expected number of detecting stations over all release events, as:

$$N(M)=\frac{1}{{N}_{r}\sum_{i=1}^{{N}_{r}}{A}_{i}}{\sum }_{i=1}^{{N}_{r}}{A}_{i}\left(\sum_{j=1}^{k}{I}_{j}\left(M,{R}_{i}\right)\right)$$
(5)

where \({I}_{j}\left(M,{R}_{i}\right)\) is 1 if station j detects a release of magnitude M in the region Ri in one or more samples and takes a value of 0 otherwise.

We define S(M), the expected number of detected samples over all release events, as:

$$S(M)=\frac{1}{{N}_{r}\sum_{i=1}^{{N}_{r}}{A}_{i}}{\sum }_{i=1}^{{N}_{r}}{A}_{i}\left({C}_{i}(M)\right)$$
(6)

where \({C}_{i}\left(M\right)\) counts the number of samples, across all stations, that detect release i of magnitude M.

3 Results

Network performance results for 140Ba are provided in Sect. 3.1, including a comparison with performance predictions made in 1995. Section 3.2 contains network performance results for 131I. Finally, network performance results for 133Xe are provided in Sect. 3.3.

3.1 Network Detection Performance for 140Ba

Historical estimates of the detection performance of an aerosol network for 140Ba, for networks with 50, 75, and 100 stations and a range of transport times, are shown in Fig. 6 for a design basis release of \(10^{15}\) Bq of 140Ba to the atmosphere and an MDC of 10 µBq/m3. Figure 6 is derived from Fig. 3-1 of WP.224 (IMS Expert Group, 1995). The detection probability, D(M), calculated using Eq. (3), is also shown for a 39-station network (black circle) and a 79-station network (black triangle), using 10 days of transport time past the release event and historical data for detection limits. The current performance estimate for 140Ba matches nicely with the historical estimates of the design-basis performance.

Fig. 6

Historical estimates of detection probabilities for different network designs, each assuming a release of \(10^{15}\) Bq of 140Ba. Our calculated performance for a 39-station network (black circle) and a 79-station network (black triangle) is also shown, using 10 days of transport time after the release event and historical data for detection limits. The curves are derived from Fig. 3-1 of WP.224

The performance of 39- and 79-station networks for the detection of 140Ba is shown in Table 5 for a large range of release magnitudes. The 39-station network uses the 39 stations with existing or planned noble gas systems (CTBTO PrepCom, 2020). The historical average 140Ba MDC was used for 72 of the stations. The stations without operating systems were assigned the MDCs from other stations currently operating on the same continent. Details of the assignments of the MDCs are provided in Appendix 1. The performance statistics in Table 5 are averaged over a year for the entire globe. The same statistics can be evaluated for shorter time periods, such as a day, to give an idea of the temporal variation in detection capabilities. The ± values represent an approximate 95% uncertainty range on daily D(M), N(M), and S(M) values.

Table 5 Detection performance for 140Ba for different levels of release for two network sizes

The network coverage varies in space as well as in time. The network detection probabilities for 39-station and 79-station networks are shown in Fig. 7, assuming a release of \(10^{15}\) Bq of 140Ba anywhere on the globe. Detection limits for each station were derived from historical measurements as described in Appendix 1. The average number of stations that would detect a release of \(10^{15}\) Bq of 140Ba anywhere on the globe for 39- and 79-station networks is provided in Fig. 8. This result was expected when the IMS was designed (IMS Expert Group, 1995) but can now be evaluated through extensive transport simulations.

Fig. 7

Network detection probabilities for 39-station (upper panel) and 79-station (lower panel) networks assuming a release of \(10^{15}\) Bq of 140Ba anywhere on the globe. Detection limits for each station were derived from historical measurements. This activity level roughly corresponds to 100 tons of fission yield

Fig. 8

Average number of stations that would detect a release of \(10^{15}\) Bq of 140Ba anywhere on the globe for 39-station (upper panel) and 79-station (lower panel) networks. Detection limits for each station were derived from historical measurements

The overall network performance of the 79-station IMS RN network for a release of \(10^{15}\) Bq of 140Ba is remarkably close to the historical estimate. Even with the unexpected (in 1995) impact of 212Pb on the 140Ba detection limit, the average MDC across 79 stations is 9.92 µBq/m3, while the design basis was 10 µBq/m3. As shown in Figs. 7 and 8, coverage in the equatorial regions is poorer than at higher latitudes. This result was expected when the system was designed (IMS Expert Group, 1995) but can now be evaluated through extensive transport simulations. Increasing the number of noble gas stations from 39 to 79, as is allowed in the treaty after entry into force, improves D(M) and, more notably, nearly doubles N(M) and S(M). Having more detecting stations and samples is important for improving the accuracy of the estimated release location from the sampling data, as shown in Eslinger and Schrom (2016). The maps of the number of detecting stations calculated for this analysis are similar to those produced by Werzi (2009) for a few months.

3.2 Network Detection Performance for 131I

The performance of 39- and 79-station networks for the detection of 131I is shown in Table 6. The 39-station network uses the 39 stations with existing or planned noble gas systems (CTBTO PrepCom, 2020). The historical average 131I MDC was used for 72 of the stations. The stations without operating systems were assigned the MDCs from other stations currently operating on the same continent. Details of the assignments of the MDCs are provided in Appendix 1. The performance statistics in Table 6 are averaged over a year for the entire globe. The ± values represent an approximate 95% uncertainty range on daily D(M), N(M), and S(M) values.

Table 6 Detection performance for 131I for different levels of release for two network sizes

The network detection probabilities for a 79-station network are shown in Fig. 9, assuming a release of \(10^{11}\) Bq of 131I anywhere on the globe. Detection limits for each station were derived from historical measurements as described in Appendix 1. The decrease in the detection probability when the station detection limits are raised to the 95th percentile of historical detections at the station is also shown in Fig. 9.

Fig. 9

Detection probabilities for a 79-station network for a release of \(10^{11}\) Bq of 131I anywhere on the globe (upper panel). The decrease in the detection probability when the station detection limits are raised to an anomaly level equal to the 95th percentile of historical detections at that station is shown in the lower panel. For example, the coverage in a region with a bright red contour has an absolute decrease in D(M) of around 0.1, which could correspond to as much as a 40% decrease in the detections in that local region

The network-level results in Table 6 and the spatial coverage in Fig. 9 show that the coverage of the IMS radionuclide network is poor for a nominal release of \(10^{11}\) Bq of 131I, especially in the equatorial regions. As shown in Table 6, the percent decreases in detection probabilities due to background sources of 131I are nonnegligible, but they are more pronounced for a 39-station network than for a 79-station network. Also, next generation aerosol samplers currently under development (Miley et al., 2019) may have lower MDCs and shorter sampling periods. These changes would improve the detection capabilities, including providing more samples with concentrations above the detection limits.

Medical isotope production facilities that release some 131I to the atmosphere are located in Argentina (South America), in China, and in the Russian Federation. The fourth region with elevated background levels of 131I is in Panama, Central America. There is no known medical isotope production near this station, but exhalation of 131I by multiple patients treated with 131I for medical procedures (Gründel et al., 2008) near the sampler might release enough 131I to result in occasional detections (Miley et al., 2021).

3.3 Network Detection Performance for 133Xe

As of the start of 2021, only 25 locations had xenon systems certified for operation in the IMS, and these fall into three system types (Dubasov et al., 2005; Fontaine et al., 2004; Ringbom et al., 2003) with different performance characteristics. Many of the certified systems have processed thousands of samples over several years and thus have a well-established background history, such as that in Fig. 13 of the Appendix. This history indicates that the presence of the analyte of interest, 133Xe, is the dominant monitoring challenge for many locations.

The network performance estimates for 133Xe use modeled concentrations from nuclear power plants and medical isotope production facilities to determine the 95th percentile anomaly level. As shown in Appendix 1, the 95th percentile of modeled 133Xe concentrations can be close to the measured values for some stations. The results are similar to those of Achim et al. (2016) and Schoeppner and Plastino (2014). In addition, Gueibe et al. (2017) compared modeled and measured concentrations in the IMS for four xenon isotopes. An action threshold is then created using the larger of the MDC for the system presumed to operate at that station and the 95th percentile of the estimated background.
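The action threshold described above can be sketched in a few lines of Python. The percentile method (linear interpolation between order statistics) is an assumption, since the text does not specify one, and the function names and example values are hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile by linear interpolation (assumed method), q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    fraction = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def action_threshold(
    mdc: float, background: Sequence[float], q: float = 95.0
) -> float:
    """Larger of the equipment MDC and the q-th percentile of background."""
    return max(mdc, percentile(background, q))


# Illustrative background series in arbitrary concentration units.
modeled_background = [float(i) for i in range(21)]  # 0.0, 1.0, ..., 20.0
print(action_threshold(0.2, modeled_background))   # 19.0 (background dominates)
print(action_threshold(25.0, modeled_background))  # 25.0 (MDC dominates)
```

With this construction the threshold never falls below the equipment MDC, so stations with negligible background are unaffected.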

The performance of 39- and 79-station networks for the detection of 133Xe is shown in Table 7. Published MDCs are used for currently deployed sampling equipment. Details of the assignments of the MDCs are provided in Appendix 1. The columns in Table 7 with a Δ in the header show the percent decrease in performance when the MDC is set to the maximum of the equipment level and the 95th percentile of the modeled background concentrations. The performance statistics in Table 7 are averaged over a year for the entire globe. The ± values represent an approximate 95% uncertainty range on daily D(M), N(M), and S(M) values. Increasing the MDC to an anomaly threshold based on the background levels causes significant reductions in detection performance, especially for lower release magnitudes. Adding sampling locations improves the overall performance but does not mitigate effects of background 133Xe.
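As a small illustration of how a Δ column relates to the underlying metric, the sketch below computes a percent decrease relative to the equipment-only value. The function name and the input values are hypothetical and are not entries from Table 7.

```python
def percent_decrease(baseline: float, degraded: float) -> float:
    """Percent decrease of a metric relative to its baseline value."""
    if baseline <= 0.0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - degraded) / baseline


# Hypothetical example: an absolute drop of 0.1 from a baseline D(M) of 0.25
# is a relative decrease of about 40%.
print(round(percent_decrease(0.25, 0.15), 1))
```

The same absolute change therefore looks much larger in percent terms where the baseline detection probability is already low.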

Table 7 Detection performance for 133Xe for different levels of release for two network sizes

The estimated performance of a 100-station network given in Fig. 3-3 of WP.224 was about 0.2 for a release of \(10^{15}\) Bq, assuming a detection limit of 1 mBq/m3. Our calculated detection probability for the smaller 39-station network for a release of \(10^{15}\) Bq is 0.7, much better than the historical performance estimate. We calculated a D(M) of 0.9 for a 100-station network formed by adding another 21 stations to the existing radionuclide stations while assuming a \(10^{15}\) Bq release of 133Xe and an MDC of 1.0 mBq/m3 for every sampler.

The network detection probabilities for a 39-station network are shown in Fig. 10, assuming a release of \(10^{14}\) Bq of 133Xe anywhere on the globe. The \(10^{14}\) Bq release level was chosen mostly because two of the tests conducted by the DPRK (Murphy et al., 2013; Ringbom et al., 2009, 2014) may have had releases on the order of \(10^{14}\) Bq of 133Xe.

Fig. 10

Detection probabilities for a 39-station network for a \(10^{14}\) Bq release of 133Xe anywhere on the globe (upper panel). Change in the detection probability when the station detection limits are raised to an anomaly level equal to the 95th percentile of modeled background (lower panel)

Detection limits for each station were derived from equipment specifications and modeled backgrounds from nuclear power plants and medical isotope production facilities as described in Appendix 1. The decrease in the detection probability when the station detection limits are raised to the 95th percentile of modeled backgrounds is also shown in Fig. 10. Using the same assumptions, the network detection probabilities for a 79-station network are shown in Fig. 11.

Fig. 11

Detection probabilities for a 79-station network for a \(10^{14}\) Bq release of 133Xe anywhere on the globe (upper panel). Change in the detection probability when the station detection limits are raised to an anomaly level equal to the 95th percentile of modeled background (lower panel)

The network-level results in Table 7 and the spatial coverage in Fig. 10 show that the coverage of the IMS radionuclide network is poor for a nominal release of \(10^{14}\) Bq of 133Xe, especially in the equatorial regions. Following entry into force of the treaty, the addition of more IMS RN station locations would improve the coverage. As shown in Table 7, the percent decreases in detection probabilities due to background sources of 133Xe are significant over large regions of the globe.

New generation noble gas samplers under development (Haas et al., 2017; TBE, 2020; Topin et al., 2020) or already deployed in the IMS (Ringbom et al., 2017) have lower detection levels and shorter sample collection periods than current systems. These new systems will improve the detection capabilities, especially in the number of detected samples.

4 Discussion

The results in this paper are based on atmospheric transport simulations that used 10 days of transport after the release events. These transport runs were performed in the reversed-time direction for computational convenience. Detection probabilities would likely increase, especially for larger magnitude releases, if longer transport times were used.

Using the values in Appendix 1, the average MDC of 131I is 3.34 µBq/m3, which is lower (better) than the average MDC of 140Ba at 9.92 µBq/m3. Thus, 131I has a somewhat higher probability of being detected at low release magnitudes than 140Ba. The average MDC for 133Xe is much poorer at 0.231 mBq/m3 (or 231 µBq/m3); thus, 133Xe has a lower probability of being detected for smaller magnitude releases. However, the release of 133Xe from a small underground nuclear explosion may be larger than the releases of 131I or 140Ba. In addition, atmospheric transport of 133Xe is not affected by atmospheric loss processes, such as dry or wet deposition, that reduce the 131I and 140Ba concentrations.

The network performance for different release magnitudes of 140Ba, 131I, and 133Xe is discussed in Sect. 4.1. The stations most impacted by background are discussed in Sect. 4.2. In Sect. 4.3, a hypothetical release scenario is discussed that would combine data from the aerosol and noble gas networks. Finally, the implications of other background thresholds are discussed in Sect. 4.4.

4.1 Network Performance for Different Release Magnitudes of 140Ba, 131I, and 133Xe

A summary of network performance for different release magnitudes of 140Ba, 131I, and 133Xe is provided in Fig. 12 using current equipment detection characteristics that are influenced by background. Although results for iodine, barium, and xenon are shown on the same graph, the data in Table 1 suggest that their release magnitudes may be quite different in a real event.

Fig. 12

Summary of network performance for different magnitude releases of 140Ba, 131I, and 133Xe using current equipment characteristics and measured or estimated background anomaly levels. The two curves for 133Xe are for 39-station and 79-station networks. The detection probabilities, D(M), are shown in the top panel. The number of stations detecting the release, N(M), is shown in the middle panel. The number of samples detecting the release, S(M), is shown in the bottom panel. All results are for 10 days of atmospheric transport following the release event

The results shown in Fig. 12 cover 7 orders of magnitude of releases, from \(10^{9}\) to \(10^{16}\) Bq. Using the rough scale that 1 kiloton of fission equals \(10^{23}\) fission atoms, and applying fission yields and a half-life of around a week, this scale can be roughly translated to nuclear explosion yields in the atmosphere from 100 g to 1 kiloton. The low end of the magnitude range is included to show that a sparse network of current sensitivity has little chance of detecting an extremely small release, say, \(10^{10}\) Bq or a 1 kg equivalent explosion, unless the release were to occur close to a measurement system and directly upwind of it. At a \(10^{13}\) Bq release, which corresponds roughly to a one-ton equivalent of fission in the atmosphere, the detection probabilities for aerosols are quite good (~ 75%) and for xenon are still considerable (~ 45%).
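The rough yield-to-activity scale used above can be checked with a short Python sketch. The fission yield of 0.06 atoms per fission is an illustrative assumption standing in for "applying fission yields"; the half-life of 7 days stands in for "around a week"; the function and constant names are hypothetical. Only the resulting orders of magnitude are meaningful.

```python
import math

FISSIONS_PER_KILOTON = 1e23    # rough scale quoted in the text
ASSUMED_FISSION_YIELD = 0.06   # assumption: a few percent per fission
ASSUMED_HALF_LIFE_DAYS = 7.0   # "around a week", as in the text


def activity_bq(
    yield_kt: float,
    fission_yield: float = ASSUMED_FISSION_YIELD,
    half_life_days: float = ASSUMED_HALF_LIFE_DAYS,
) -> float:
    """Approximate activity (Bq) of one isotope created by a given yield."""
    decay_constant = math.log(2) / (half_life_days * 86400.0)  # per second
    atoms = yield_kt * FISSIONS_PER_KILOTON * fission_yield
    return atoms * decay_constant


def order_of_magnitude(value: float) -> int:
    """Nearest integer power of ten."""
    return round(math.log10(value))


for label, kilotons in [
    ("100 g", 1e-7),
    ("1 kg", 1e-6),
    ("1 ton", 1e-3),
    ("1 kiloton", 1.0),
]:
    print(label, "-> about 10^%d Bq" % order_of_magnitude(activity_bq(kilotons)))
```

Under these assumptions, 1 kg, 1 ton, and 1 kiloton map to roughly \(10^{10}\), \(10^{13}\), and \(10^{16}\) Bq, matching the correspondences quoted in the paragraph, and 100 g maps to roughly \(10^{9}\) Bq.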

Estimates of releases from the Fukushima nuclear power plants in 2011 for 133Xe (Eslinger et al., 2014a) and 131I (Koo et al., 2014), of around \(10^{19}\) Bq and \(10^{17}\) Bq, respectively, are larger than the upper end of the range in Fig. 12, perhaps corresponding to a megaton-equivalent release of xenon and a 10-kiloton-equivalent release of iodine. These releases were detected all across the northern hemisphere (Biegalski et al., 2012). Of more interest, from a treaty monitoring perspective, is the ability to detect a test with a small yield, such as the tests conducted by the DPRK (Ringbom et al., 2009, 2014). The 2006 test yield was probably on the order of 1 kiloton (Murphy et al., 2013), while the yield of the 2013 test was a little larger in the 2.0–4.8 kiloton range (Murphy et al., 2013). Measured 133Xe and 131mXe for the 2013 test suggest a release for both tests on the order of \(10^{14}\) Bq of 133Xe. No aerosol samples detected 131I, so the 131I release magnitude can be bounded but not estimated.

There are fewer industrial sources of 131I than 133Xe, although occasionally there are accidental releases, such as in Hungary in 2011 (Tichý et al., 2017), not at production facilities. Thus, as seen in Fig. 9, the background interference is regional in scope, with stations in China and the Russian Federation being most impacted. The estimated background for 133Xe is global in scope, as seen in Fig. 11.

In the data in Table 6 for 131I, and Table 7 for 133Xe, the ∆D(M), ∆N(M), and ∆S(M) columns show that background interference degrades network performance more for small design basis magnitudes than for larger magnitudes. The tables also show that denser networks mitigate, to some extent, the degradation in detection performance from background emitters. However, network performance estimates need to rest on realistic release magnitudes. For example, analysis of data following the DPRK nuclear explosive test in 2013 yielded 133Xe concentrations that would have been indistinguishable from background without the concurrent detection of 131mXe (Ringbom et al., 2014).

4.2 Stations Most Impacted by Background

The data given in Tables 9 and 10 show the background levels of 131I and 133Xe that impact each station. The data there can be used to rank the stations that are impacted the most by background. Table 4 shows the stations most impacted by background levels of 131I. Table 8 shows the 12 stations most impacted by the background levels of 133Xe. The stations are ranked by the 95th percentile anomaly level, but the median concentrations are also given. The median value is at or below the MDC for 7 of the 12 stations.

Table 8 The 12 stations most impacted by the background concentrations of 133Xe

The number of stations reporting and the number of detecting samples are also important metrics and were not considered in WP.224. Along with all the other assumptions made to compute the results shown here, a tacit assumption is that all stations are in good working order and do not suffer power outages or other issues when needed. Thus, it could be considered crucial to have more than one station detecting. The lack of a second detecting station does not weaken any detection results obtained, but the network is more robust when multiple stations detect. Similarly, the number of detecting samples adds confidence in the result. In addition, multiple detections in time and space can be used with atmospheric transport modeling to limit the size of the region in which the signal originated (Eslinger & Schrom, 2016; Eslinger et al., 2019).

4.3 Hypothetical Release Scenario Combining Aerosol and Noble Gas Networks

It is probably unwise to make an assumption about how radioactive material will be released from an underground nuclear test: the release pathway through the geologic containment, if any, may include a complex of wet or dry fractures, while the engineered containment may include filled tunnels and sealed doorways. These could combine in many ways to produce the various results found in Schoengold et al. (1996), including frequently no measured release at all. But it can be enlightening to create a hypothetical release scenario that elucidates how the xenon and aerosol networks could work together.

Let us hypothesize a 1 kiloton equivalent nuclear explosion with a release of xenon similar to that described in Ringbom et al. (2009). In this scenario, about 1% of the 133Xe would be released, or \(10^{14}\) Bq, after 3 days of decay chain ingrowth. The reader can choose from Fig. 3 some combination of ingrowth time and containment suppression that achieves a \(10^{14}\) Bq release. Ely et al. (2021) estimate for U.S. underground nuclear explosive tests that an average 131I leakage would be about one part in \(10^{5}\) or less, but without a timing estimate for when the release occurs. Using the approximation that \(10^{23}\) radioactive atoms are created in a 1 kiloton test, a release fraction of \(10^{-5}\) or slightly higher, the cumulative yield of 131I from Table 1, and a decay constant on the order of \(10^{-6}\) s\(^{-1}\), one obtains an approximate \(10^{11}\) Bq release of 131I. In this scenario, there is only a factor of 1000 separating the activity of xenon and iodine.
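The order-of-magnitude arithmetic in this scenario can be written out explicitly. In the sketch below, the symbols are introduced here for illustration only: \(N_f\) is the number of atoms created, \(f\) the release fraction, \(Y\) the cumulative fission yield of 131I from Table 1 (left symbolic because the table is not reproduced in this section), and \(\lambda\) the decay constant:

$$A_{^{131}\mathrm{I}}\approx N_f\,f\,Y\,\lambda \approx 10^{23}\times 10^{-5}\times Y\times 10^{-6}\ \mathrm{s}^{-1}=Y\times 10^{12}\ \mathrm{Bq}$$

Assuming \(Y\) is of the order of a few percent, this gives a few times \(10^{10}\) Bq, which approaches the quoted \(10^{11}\) Bq when the release fraction is somewhat higher than \(10^{-5}\). The ratio of the two releases is then \(10^{14}/10^{11}=10^{3}\), the factor of 1000 noted above.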

If an explosion were to release \(10^{14}\) Bq of 133Xe and \(10^{11}\) Bq of 131I, then from Fig. 12, assuming 79-station networks for all samplers, the detection probability, D(14), is a very strong 0.76 for 133Xe and D(11) is a poor, but not hopeless, 0.25 for 131I. The number of samples with detections, S(11), is 0.52 for 131I, but S(14) is a robust 6.4 for 133Xe. By comparison, the DPRK nuclear explosive test in 2013 resulted in 133Xe detections at two stations, with a total of five 133Xe detections (Ringbom et al., 2014), which adds some confidence that these modeling results are valid. Further, because D(11) for 131I is only 25%, the lack of IMS measurement results cannot rule out the hypothetical release scenario.

Because the 79 noble gas systems have a much better chance of detecting a leaking test than the 79 aerosol systems, it might be tempting to wonder if the aerosol network is needed. If, however, there were half as many xenon systems, say 39 vs. 79, and thus half as many detecting xenon samples, adding one or two iodine detections would materially increase the confidence in the detection. If the MDC for 131I were improved by an order of magnitude as suggested in Miley et al. (2019), then for regions that do not have a serious 131I background issue, D(M) could improve to 0.54 and S(M) could improve to 2.0. It has not been mentioned to this point that the IMS also contains a network of 16 laboratories for the confirmatory remeasurement of aerosol samples. Several of these are equipped with ultra-low background detectors, and it has been shown that these can obtain an order of magnitude sensitivity increase for 131I (Aalseth et al., 2009). This hypothetical scenario implies that sending the aerosol samples collected near the detecting xenon samples for ultra-low background measurement at IMS laboratories might yield additional evidence for or against the hypothesis that the source was a nuclear explosion.

While the scenario posited here is no more likely than others, it shows that the aerosol network plus its laboratories could substantially contribute to detecting small release underground nuclear test events now, and with station improvements, could make those contributions in near real-time. Another observation, without prejudice for the release mechanism, is that iodine, barium, and presumably other aerosols constitute a very sensitive detection means for small magnitude sources in the atmosphere that are orders of magnitude smaller than the detection measurement range of current xenon technology.

4.4 Other Possible Background Thresholds

If background estimates could be made sufficiently accurate, the 95th percentile could be adjusted temporally to lower the action threshold significantly. Approaches to accomplish this might include:

  • Additional noble gas background measurement campaigns at IMS locations currently without a noble gas sampler. This would provide sampled data very useful for verifying and improving the modeled global concentrations. The change in network detection probabilities, ∆D(14), shown in Fig. 11 could be used to help select measurement locations with significant background concentrations.

  • Besides IMS locations, the addition of local monitoring data, for example from local networks, safety systems, or stack monitors could greatly influence background calculations and potentially improve background estimates.

Another avenue of mitigation of the background radioactivity impact is to make the aerosol and xenon systems more supportive of each other. One order of magnitude additional iodine sensitivity in the aerosol network could allow 131I detections to occur in support of xenon detections, raising the confidence and location capability of the combined network substantially. There are six stations on both the most impacted aerosol and xenon lists in Tables 4 and 8. These stations and the regions around them, RN01, RN20, RN21, RN54, RN59, and RN61, might be prime candidates for a noble gas background measurement campaign to understand the sources and find ways to improve the action threshold. Such a field campaign could further help develop the concepts of fusion between aerosol and xenon networks.