Introduction

The quantitative assessment of risks after exposure to ionizing radiation is currently based on knowledge gained from epidemiological studies of radiation-exposed populations, complemented by data from animal experiments, as well as fundamental in vitro cellular and biophysical studies. While all these sources of information are important, direct assessment of the risk for cancer and non-cancer diseases can only be obtained from epidemiological studies. Since the first detonations of nuclear bombs over Hiroshima and Nagasaki, Japan, a number of populations have been exposed to radiation from nuclear bomb testing. Studies on those populations, in addition to studies on the Japanese atomic bomb survivors, are considered an important source of knowledge about the health effects of low to medium doses (Gilbert et al. 2002). These populations comprise test participants as well as members of the public. Until the limited test ban treaty of 1963, atmospheric nuclear tests were conducted by the then nuclear-capable states (USA, USSR, the UK, and France). While the USA, the USSR, and the UK halted their atmospheric testing activities after the signing of the treaty, France continued atmospheric tests until 1974. China conducted atmospheric tests at the Lop Nor test site from 1964 to 1980. In recent years, other countries have also conducted nuclear tests, namely Pakistan, India, and North Korea, but all of these were underground tests (UNSCEAR 2000; Grosche 2013).

Among the groups of the general population exposed to ionizing radiation following nuclear weapons testing were those living near the Nevada Test Site, USA, the population of the Marshall Islands and of French Polynesia in the Pacific Ocean, and the population living near the Semipalatinsk Nuclear Test Site (SNTS), Kazakhstan. The SNTS was the largest test site of the Soviet Union and had the largest number of residents near the test site who could be potentially exposed to radioactive fallout.

In 2002, studies on residents of the areas surrounding the SNTS were listed as potentially informative for obtaining new evidence on radiation-induced health risk (Akleyev et al. 2002). Consequently, the Ministry of Health of the Republic of Kazakhstan set up a registry to continue the long-lasting efforts to collect and collate information on the health status of the population exposed as a result of the testing (Apsalikov et al. 2016). The aim of the present paper is to give detailed information on this registry, which is held by the Scientific Research Institute for Radiation Medicine and Ecology (SRIRME) in Semey (previously: Semipalatinsk), Kazakhstan, and to emphasize its potential value for future radiation health research. To date, only a few studies have been conducted that were either completely (Bauer et al. 2005; Grosche et al. 2011) or partially (Land et al. 2008, 2015) based on information obtained from a database at SRIRME (Katayama et al. 2006).

The Registry

Background

The former Soviet Union’s Semipalatinsk nuclear test site (SNTS) is located in the present East Kazakhstan oblast (region) of Kazakhstan. The test site is named after the city of Semipalatinsk (in Kazakh: Semey) and is situated approximately 150 km to the west of the city. The SNTS covers an area of 18,500 km² (7143 square miles).

The SNTS was the location where the USSR conducted its first nuclear bomb test on August 29, 1949. The device tested was a replica of the first U.S. nuclear device, Trinity, based on information leaked from the Manhattan Project in the U.S. During the 40 years following that test, 456 nuclear detonations were carried out, including 111 atmospheric tests (86 events in air and 25 surface events) between 1949 and 1962 (Mikhailov 1996; Nugent et al. 2000). The total yield of the nuclear devices tested in atmospheric detonations at the SNTS is reported to be 6.58 megatons of TNT equivalent, which corresponds to approximately 2/3 of the total estimated yield of Soviet atmospheric bomb tests (Dubasov et al. 1994).

After the ratification of the limited test ban treaty in 1963, the detonations at SNTS were restricted to underground shafts and tunnels. With a few exceptions, little or no offsite environmental contamination resulted. The last detonation conducted at the SNTS was on 19 October 1989.

The SNTS had three major testing areas: Ground Zero, where the atmospheric bomb tests were performed; the Degelen mountains, where more than 200 underground nuclear explosions occurred; and the Balapan area, where 123 underground tests were conducted. During the period of nuclear testing, access to the site was strictly controlled by the Soviet armed forces, and no civilian access or use of the area was permitted.

Atmospheric nuclear tests conducted from 1949 until the limited test ban treaty of 1963 caused considerable radiation exposures in the proximity of the test site (Bauer et al. 2013). The most damaging tests, in terms of exposure of the local population, were those conducted on August 29, 1949 [with a yield of 22 kilotons (kt) TNT equivalent], September 24, 1951 (38 kt), August 12, 1953 (400 kt), and August 24, 1956 (27 kt). The other detonations led to exposure primarily within the confines of the testing ground and not beyond the site (Grosche et al. 2016), though traces were detected outside (see Fig. 1).

Fig. 1 Trajectories of the radioactive clouds related to the most important nuclear tests, and location of some of the settlements of interest (Altai Krai = Altai region) (Gordeev et al. 2002)

The population living closest to the SNTS was exposed to relatively high levels of radiation. The most evolved dosimetry system to date estimates external whole-body exposures of up to about 300 mGy. Further details on dosimetry are given below. Settlements affected by the 1949 test were located north-east of the SNTS (e.g., Dolon and Cheremushka), but traces from this test have also been documented in the Altai region in Russia (Shoikhet et al. 2002). The tests of 1951 and 1953 mainly affected settlements south and south-east of the SNTS (e.g., Kainar, Karaul, Kaskabulak, Sarzhal, and Znamenka). The 1956 test affected settlements to the east, including the city of Ust-Kamenogorsk (Apsalikov et al. 2016). Figure 1 shows the trajectories leading to significant exposures of the general population (Gordeev et al. 2002).

The August 1956 test was the most important in terms of its role in starting health surveillance of the potentially affected population. To systematically monitor the radiation situation and the health status of residents of the contaminated areas, ‘Dispensary No. 3’ was established in Ust-Kamenogorsk, but a short time later was relocated to Semipalatinsk and, in 1957, it was established as ‘Dispensary No. 4’ (Apsalikov et al. 2016). Dispensary No. 4 was a classified medical institution performing health follow-up and treatment, collecting vital and health data on the exposed population in the former Semipalatinsk oblast (which is now a part of the East Kazakhstan oblast). More recently, it has also started to collect and store biological material for possible future analyses of long-term health consequences.

In the early 1960s, Dispensary No. 4 was ordered by Soviet authorities to conduct a long-term study on health effects in the exposed population of the Semipalatinsk oblast under supervision of the former Institute of Biophysics in Moscow, Russia. The target population, participants’ recruitment and data sources are described below.

The Scientific Research Institute for Radiation Medicine and Ecology (SRIRME), founded in 1991, is the successor of Dispensary No. 4 and inherited its formerly classified health data archives. SRIRME also has a biological laboratory, a small clinic and a mobile medical unit. The analysis of the health data has been continued, and further data collections are ongoing (see also http://de.slideshare.net/IRPslideshare/inst-radiology-med).

In 2013–2016, the European Union funded the SEMI-NUC study (Kesminiene 2016) to assess the feasibility of establishing a long-term, prospective cohort to study the health effects of low and moderate radiation exposures that resulted from the testing of nuclear weapons at the SNTS. The project brought together scientists from Europe, Kazakhstan, Japan and the USA with a common interest in the health effects from exposure to low doses of ionizing radiation. One of the aims of the project was to test the feasibility of creating a larger cohort than the one used in previous analyses (Bauer et al. 2005; Grosche et al. 2011) for future long-term studies. This aim was initially based on information collected in a Kazakh–Japanese activity (Ogiu et al. 2008; Yoshinaga et al. 2018). However, during the course of the European-based SEMI-NUC project, the comprehensive database held by SRIRME was discovered and its potential use for future epidemiological research was evaluated. Mechanisms of follow-up of the populations covered by the registry and the application of the U.S./Russian joint dosimetric methodology (Simon et al. 2006) to reconstruct doses for the populations concerned were also evaluated. Based on the results of the SEMI-NUC project, an overview of the Registry, its content and how it works is presented here.

Setting up the Registry

The State Scientific Automated Medical Registry of the Kazakh population exposed to ionizing radiation, which in this text is called the “Registry”, was established as a part of SRIRME’s scientific program and approved by the Kazakh Ministry of Health in 2003. The catchment area of the Registry is the East Kazakhstan oblast (see Fig. 2); it also includes the potentially exposed districts of the Pavlodar and Karaganda oblasts further to the west of the SNTS. The Kazakh governmental commission assigned each administrative district of the above-mentioned oblasts to one of four radiation risk zones based on information about two major detonations in 1949 and 1953: zone of extreme radiation risk; zone of maximum radiation risk; zone of increased radiation risk; zone of minimal radiation risk (Bauer et al. 2013; Gusev et al. 1997) (see Fig. 2).

Fig. 2 Map of different radiation risk levels according to Kazakh legislation; East Kazakhstan oblast: districts 1–15 plus two in the east (Katon-Karagay and Kurchum)

The main tasks when setting up the Registry included:

  • Registration of the population adjacent to the SNTS attributed to the radiation risk groups, based on the knowledge about the radiological situation in previous years;

  • Organization of medical care and of measures to reduce social stress following the radiation exposure for the population in the affected areas of Kazakhstan at the national and regional level;

  • Medical health monitoring of registered individuals by conducting medical check-ups either at SRIRME or with its mobile unit;

  • Continuous follow-up of all individuals included in the Registry and integration of information on their health status from other sources into the database.

In August 2002, a cooperation between the Radiation Effects Research Foundation (RERF), Hiroshima, Japan, and SRIRME enabled methodological, scientific and practical assistance in creating and establishing the computer-based Registry (Katayama et al. 2006). The database for the Registry was developed using a database schema similar to that used at RERF, and drawing on the experience gained in setting up the so-called Kazakh “historical cohort” (Bauer et al. 2005; Grosche et al. 2011). Since 2007, the Registry has been continuously updated with demographic and medical information (Bauer et al. 2013; Vakulchuk et al. 2014). The overall number of individuals included in the Registry as of 1 November 2017 was 351,506; Table 1 breaks this number down by date of birth and follow-up status. Among the registered persons, 49.7% were male and 50.3% were female.

Table 1 Individuals in the Register, broken down by date of birth and vital status (as of 01 November 2017)

Target population and sources of information

The Registry’s target population comprises individuals who are or were living in areas contaminated by radioactive fallout from the Soviet nuclear bomb testing at the SNTS, as well as their children and grandchildren.

The status of being exposed is documented in a certificate confirming the right to compensation benefits. If no such certificate is available, a document confirming residence in the affected territory is used instead.

Sources of information for the Registry are household books, administrative data archives in Semey city and in regional centers, and death certificates from the Civil Acts’ Registration Office (ZAGS) archives. Additionally, the Registry keeps medical records from patients’ medical charts and case histories, screening examinations and other specific examinations, for both inpatients (patients admitted to a hospital) and outpatients (patients receiving ambulatory medical care, which can take place outside the premises of a hospital, e.g., by a mobile unit). Further sources of information to verify or complete information on the time and place of residence in the study area include: interviews of farmstead residents; personal documents confirming residence in the study area; logs for registration of emigrated and immigrated persons; and information from governmental agencies (regional centers for pension payment, archives of the national registry office) and city and regional health care institutions.

In 2007, the Republic of Kazakhstan introduced an individual identification number (IIN) for every Kazakh citizen. Since 2013, SRIRME has been allowed to use this identifier, which enables expansion of the follow-up to the entire population of Kazakhstan. SRIRME is therefore able to trace individuals who left the original catchment area, i.e., East Kazakhstan, throughout the rest of the country. The number of individuals with a known IIN in the Registry is 96,477. It is anticipated that the use of the IIN will reduce the percentage of individuals with the follow-up status of ‘migrated’ or ‘unknown’ (see Table 1), both for retrospective and for prospective studies.

Since March 2012, national law has obliged all Kazakh regional diagnostic centers and the Republican cancer registry to send information on diagnoses of exposed people and their offspring to SRIRME. The fact that a person was affected by the atomic bomb testing is indicated in official documents.

Content of the Registry

Each individual included in the Registry is assigned a unique identification number that allows access to all available information. The following individual information is available: passport information, nationality, radiation route (places of residence during the nuclear test period), vital status, cause of death for deceased individuals, education, occupation, number of official documents confirming the status of being exposed, and results of the comprehensive medical examinations conducted by the SRIRME staff. Further, the Registry contains information about the descendants of exposed persons: their children, grandchildren and great-grandchildren. For the descendants, the same information is collected as for the exposed persons. An overview of the information kept in the Registry is given in Table 2. Among the 351,506 individuals in the Registry, 125,337 (35.6%) are deceased. Among these, 51,061 (40.7%) died from diseases of the circulatory system, 28,775 (23.0%) from malignant neoplasms, and 14,130 (11.3%) from diseases of the respiratory system. Additionally, the Registry contains information on lifestyle factors (smoking, alcohol) for the 8400 individuals of the three-generation study described below (Apsalikov 2016; Apsalikov et al. 2017). For other individuals, such information can be abstracted from paper records.
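The shares quoted above can be recomputed from the counts given in the text. The following is a minimal sketch for checking them; the variable and function names are illustrative only and do not correspond to any field of the Registry database. Computed this way, the deceased share is about 35.66%, so the quoted 35.6% appears to be truncated rather than rounded, while the three cause-specific shares agree with the text.

```python
# Counts quoted in the text (Registry status as of 1 November 2017).
total_registered = 351_506
deceased = 125_337
deaths_by_cause = {
    "diseases of the circulatory system": 51_061,
    "malignant neoplasms": 28_775,
    "diseases of the respiratory system": 14_130,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all registered individuals who are deceased.
deceased_share = share(deceased, total_registered)

# Share of each listed cause among the deceased.
cause_shares = {cause: share(n, deceased) for cause, n in deaths_by_cause.items()}

# Deaths not covered by the three listed cause groups.
other_causes = deceased - sum(deaths_by_cause.values())

print(deceased_share, cause_shares, other_causes, share(other_causes, deceased))
```

The remainder, roughly a quarter of all deaths, falls outside the three cause groups named in the text.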

Table 2 Information kept in the State Scientific Automated Medical Registry of the Kazakh population exposed to ionizing radiation

To study possible health effects among the descendants, the population in the Registry is divided into three groups according to their dates of birth (see Table 1). In 2014, with support from the EU funded DoReMi project (Salomaa et al. 2015), SRIRME began collecting information for 8400 individuals from the Registry setting up the basis for a three-generation study. The following data were abstracted: (a) registration data: individual’s ID, sex, nationality, date and place of birth; (b) medical information (where applicable): established diagnoses and their dates, presence of congenital malformations, and for deceased individuals date and cause of death; (c) dosimetry data: radiation route, radiation dose (based on doses assigned by Kazakh legislation, which can differ from the doses estimated by other methods); (d) information on lifestyle factors: smoking, alcohol; (e) availability of biological samples: blood, DNA, tumor and normal tissues (Apsalikov 2016; Apsalikov et al. 2017).
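For readers planning analyses, the five groups of abstracted data (a) to (e) can be pictured as one record per individual. The sketch below is purely illustrative: the class names, field names and types are assumptions made here and are not the actual schema of the SRIRME database, and the unit of the legally assigned dose is not specified in the text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Diagnosis:
    """One established diagnosis and the date it was made (hypothetical layout)."""
    name: str
    established_on: Optional[date] = None


@dataclass
class ThreeGenerationRecord:
    """Hypothetical per-person record mirroring items (a) to (e) in the text."""
    # (a) registration data
    individual_id: str
    sex: str
    nationality: str
    date_of_birth: date
    place_of_birth: str
    # (b) medical information, where applicable
    diagnoses: List[Diagnosis] = field(default_factory=list)
    congenital_malformations: Optional[bool] = None
    date_of_death: Optional[date] = None
    cause_of_death: Optional[str] = None
    # (c) dosimetry data: places of residence and the dose assigned by legislation
    radiation_route: List[str] = field(default_factory=list)
    assigned_dose: Optional[float] = None  # unit not stated in the text
    # (d) lifestyle factors
    smoking: Optional[bool] = None
    alcohol: Optional[bool] = None
    # (e) available biological samples, e.g. "blood", "DNA", "tumor tissue"
    samples: Set[str] = field(default_factory=set)

    def is_deceased(self) -> bool:
        """A record counts as deceased once a date of death is present."""
        return self.date_of_death is not None


# Entirely fictitious example record, for illustration only.
example = ThreeGenerationRecord(
    individual_id="X-0001",
    sex="F",
    nationality="Kazakh",
    date_of_birth=date(1950, 1, 1),
    place_of_birth="Dolon",
    radiation_route=["Dolon"],
    samples={"blood", "DNA"},
)
print(example.is_deceased(), sorted(example.samples))
```

Optional fields default to "unknown" (`None`) rather than "no", which keeps missing information distinguishable from a recorded negative answer.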

Exposure levels

During 1962–1990, radiation safety specialists of the SNTS and of Dispensary No. 4 calculated radiation doses for approximately 67,000 people living in the territories adjacent to the test site. The dose calculations were based on radiation parameters in 103 exposed settlements (Apsalikov et al. 2016). A first exposure mapping effort was conducted in 1991 using data from soil samples collected in the 1960s and subsequently measured for residual radionuclide activity.

The effective dose for each registered person was calculated according to a method developed at SRIRME (Kurakina et al. 2000). That method takes into account the four areas of radiation risk as shown in Fig. 2.

An extensive comparison of various methods for dose assessment was conducted during a conference held in 2005 (Stepanenko et al. 2006). The most evolved dosimetry system to date is based on a joint U.S.–Russian dose reconstruction methodology developed by combining the experience of dose reconstruction experts in the two countries (Beck et al. 2006; Gordeev et al. 2006a, b; Simon et al. 2006). Generally, the dose estimates based upon this system are lower than those kept by the Registry. For the first analysis of the so-called historical cohort (Bauer et al. 2005), dose estimates were based on the SRIRME system. The average cumulative dose for those living close to the SNTS was estimated to be about 630 mSv, with a maximum of slightly over 1.7 Sv for the study participants living in Cheremushka. The maximum individual dose, more than 4 Sv, was reported for the village of Dolon. These estimates are cumulative doses from 1949 to 1960 and include both internal and external exposure. Later, the maximum external dose for an adult exposed in Dolon was estimated as 1.3 Sv (Gordeev et al. 2002) or 0.63 Gy (Simon et al. 2006). When the joint U.S.–Russian methodology was applied to the members of the historical cohort, external doses ranged from 0 to 0.3 Gy, with the maximum for an adult exposed in Dolon (unpublished result of the SEMI-NUC project).

The health studies mentioned above used different methods for dose assessment. The joint U.S.–Russian dose reconstruction methodology was applied in the studies on thyroid nodules in an exposed and non-exposed population (Land et al. 2008, 2015). In that work, doses were reconstructed from fallout deposition patterns, residential histories, and data from individual interviews on diet, in particular, on consumption of fresh milk and other dairy foods during childhood. Individual external and internal doses to the thyroid averaged 0.04 Gy (range 0–0.65) and 0.31 Gy (0–9.6), respectively (Land et al. 2008).

The SEMI-NUC project (Kesminiene 2016) defined the steps that have to be taken to enable assessment of average settlement-specific and individual doses for future epidemiological studies. A previous inter-comparison exercise conducted for the Dolon settlement (Stepanenko et al. 2006) demonstrated that the majority (more than 95%) of the whole lifetime external dose received by the residents of the contaminated villages was received during the first year following the test. Therefore, to calculate the external cumulative dose, it is generally sufficient to take into account the residence history and lifestyle of the residents during the first year following the date of the relevant explosion. A procedure for reconstructing the average settlement dose, comprising 15 well-defined steps, was developed. Further steps to derive individual doses were also defined (Kesminiene 2016).
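The consequence of the 95% figure can be made explicit with a short bound. As a sketch, write $D_{1}$ for the external dose received during the first year after the relevant test and $D_{\mathrm{life}}$ for the whole-lifetime external dose; both symbols are introduced here for illustration only and are not notation taken from the cited reports.

```latex
\frac{D_{1}}{D_{\mathrm{life}}} > 0.95
\quad\Longrightarrow\quad
D_{1} \le D_{\mathrm{life}} < \frac{D_{1}}{0.95} \approx 1.053\, D_{1}
```

Under this assumption, using the first-year dose alone would understate the lifetime external dose by less than 5% of the lifetime value, which is small compared with the other uncertainties of dose reconstruction discussed in this paper.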

Long-term health effect studies

A number of studies have been conducted since 1962, focusing on different health outcomes and biological endpoints. A detailed overview of epidemiological studies is given elsewhere (Grosche et al. 2015). Here, only studies relevant to the context of this paper are mentioned. A cohort study of mortality rates covering the period 1960–1999 demonstrated statistically significantly elevated risks of all-cause mortality (risk ratio RR = 1.83) and of site-specific cancer mortality for cancers of the esophagus (RR = 3.29), stomach (RR = 2.29), lung (RR = 2.77), and female breast (RR = 1.85) (Bauer et al. 2005). The highest radiation-related mortality risks were recorded among persons aged 0–19 and 20–39 years at the time of exposure (Bauer et al. 2005).

In the same cohort, cardiovascular diseases were diagnosed in the exposed people significantly more often than in the unexposed control group living further to the east of the SNTS (RR = 2.27) and in the population of the Republic of Kazakhstan (RR = 2.25) (Grosche et al. 2011). Further investigations looked at hypertension and obesity (Markabayeva et al. 2015a, b). A statistically significant increase in the prevalence of cancer (RR = 2.08) was described for the offspring of exposed people; the authors speculate that this might be radiation related and call for further detailed studies (Glushkova et al. 2015).

A large cross-sectional study conducted in 1998 suggested a radiation-related risk for thyroid nodules (Land et al. 2008, 2015). The study population consisted of villagers in northeastern Kazakhstan exposed as adolescents to radioactive fallout from nuclear testing. Accordingly, the age of the 2994 study subjects at the time of the main fallout event for the village of residence varied between preconception and 15 years of age. Another study on thyroid disorders targeted those seeking medical examination in the various districts of the East Kazakhstan oblast (Grosche et al. 2017). Results from medical examinations of thyroid function showed no indication of a radiation effect. Over a period of 11 years (1999–2009), 1067 study participants received 1287 examinations. Among the study participants, 456 were exposed and 577 were not. For 34 participants, the exposure situation could not be confirmed. Study participants lived either in one of four settlements close to the SNTS or in a settlement further away from the SNTS. The study group comprised children and elderly people, with a focus on those aged 45–74. A self-selection bias of study participants could not be ruled out.
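As a quick plausibility check, the participant numbers reported for the 1999–2009 thyroid examinations can be reconciled in a few lines. This is a minimal sketch using only the counts quoted above; the variable names are illustrative, and the two derived figures (share of exposed participants, examinations per participant) are computed here and are not stated in the cited study.

```python
# Participant counts quoted in the text for the 1999-2009 thyroid examinations.
participants_total = 1067
exposed = 456
not_exposed = 577
exposure_unconfirmed = 34
examinations = 1287

# The three exposure categories should add up to the reported total.
categories_sum = exposed + not_exposed + exposure_unconfirmed
counts_consistent = categories_sum == participants_total

# Derived descriptive figures (computed here, not reported in the text).
exposed_share_pct = round(100 * exposed / participants_total, 1)
exams_per_participant = round(examinations / participants_total, 2)

print(counts_consistent, exposed_share_pct, exams_per_participant)
```

The three categories sum to the reported total, and the ratio of examinations to participants indicates that most participants were examined only once.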

In addition to the epidemiological studies, biological research has been conducted (Sigurdson et al. 2009; Takeichi et al. 2006). For example, cytogenetic studies carried out between 1962 and 1975 showed an elevated number of chromosomal aberrations, and DNA radiation markers (large-scale mitochondrial DNA deletions, unstable-type chromosome aberrations) were investigated in exposed individuals aged 10–60 years in comparison to a control group (Hamada et al. 2004; Tanaka et al. 2006). Possible genetic effects were studied by examining the sex ratio amongst children born to exposed parents (Mudie et al. 2007) and twin births among the same population (Mudie et al. 2010).

Availability of biological samples

A “Bank of biomaterials from the people exposed to ionizing radiation in the result of nuclear weapons tests at the Semipalatinsk Nuclear Test Site” was created to keep biological materials of both the exposed population and their offspring for future research. It includes individuals who lived or live in the vicinity of the SNTS. The bank contains blood samples, extracted DNA, tissue samples and teeth and is directly linked to the Registry. The first generation comprises parents who were directly exposed from nuclear bomb testing, the second generation comprises their children, who may have been exposed through residual contamination of the food chain, and the third generation comprises the unexposed grandchildren (also see Table 1). Sample selection started in 2000, and Table 3 summarizes the available number of samples per generation until 2016.

Table 3 Amount of available biological material per generation, 2000–2016

Conclusion for future studies

The SEMI-NUC project demonstrated that the SRIRME Registry is the most comprehensive data source for studies on possible health effects among the population affected by the nuclear tests at the Semipalatinsk Nuclear Test Site. The population provides a novel, mostly unexplored, and valuable resource for the assessment of the population risks associated with environmental exposures to ionizing radiation. An advantage is that the population is unselected and includes all age groups. The external dose ranges from 0 to 0.3 Gy, a range that makes the population relevant to current topics in radiation health and radiation protection research. Internal doses were estimated by the dosimetry system used by the Registry (Kurakina et al. 2000). The joint U.S.–Russian dose reconstruction methodology focused on thyroid doses (Gordeev et al. 2006b); other organ doses from internal exposures have not yet been estimated with it.

The population covered by the Registry would be the most suitable for a long-term, prospective follow-up study of the health effects from exposure to fallout in the Semipalatinsk area. There are not many populations worldwide that have been exposed to protracted low-dose radiation and that are suitable for studies on the morbidity of cardiovascular diseases (CVD). A cohort of individuals from the well-maintained and regularly updated Registry has the potential for prospective epidemiological research on the association between radiation and CVD. In addition to CVD, a study of cancer risks among those exposed in childhood and in utero is also feasible. At the moment, the total number of individuals exposed in utero can only be estimated, at 2% of all individuals, based on information from the so-called historical cohort.

Dose reconstruction for the population around the SNTS is challenging and has been criticized in the past for the high level of uncertainty of the doses used (UNSCEAR 2008). However, significant work has been conducted by dosimetry experts from Russia, Kazakhstan, Japan and the USA to improve these estimates and to evaluate the level of uncertainty (Stepanenko et al. 2006; Land et al. 2008; Drozdovitch et al. 2011).

The SEMI-NUC project made the following specific proposals for follow-up research (Kesminiene 2016):

  • Follow-up of previous ultrasound-detected thyroid nodules: The most informative approach would be a study of the natural history of ultrasound-detected thyroid nodules (Land et al. 2008, 2015). For each screening participant, questionnaire-based data are available on residential history, medical history and dietary habits, with special attention to individuals' consumption of milk and milk products. Individual thyroid doses due to external and internal exposures were reconstructed for each study participant. This study is feasible if agreements with the data custodians are reached.

  • Pilot study on cardiovascular diseases (CVD): A pilot study on CVD is recommended. It is feasible to study stroke and CVD outcomes in the Semipalatinsk population. This is of relevance since the population near the SNTS was a high stroke risk population, at least during the first years covered by the Registry. Though it remains unclear why the CVD risk in this population is so much higher than among a control group, previous findings support a study of environmental exposures in further investigations regarding the mechanisms associated with excess CVD risks (Apsalikov et al. 2015; Grosche et al. 2011).

  • Cohort study on cancer risk following early life exposure: A cohort study on cancer risk following early life exposure, including in utero exposure, is suggested. A review of the sources of information and of data availability on cancer incidence indicated that a future study of cancer incidence in relation to early life exposure is feasible. The exposed population gives a unique opportunity to evaluate the effects of different ages at exposure on radiation-associated risks of cancer and non-cancer diseases, e.g., for exposures before 20 years of age.

  • Prospective molecular epidemiology study of cancer risks following early life exposure: The currently available biobank data on exposed individuals are not sufficient to carry out a molecular epidemiology study; but expanding the collection of biological samples will make it possible to conduct a prospective study of cancer risks following early life exposure as described above. A sustained collaboration with the Semey cancer oncology centre and the clinic-diagnostic centre to collect blood and tumor tissue samples for members of the SRIRME registry would be beneficial. A cohort built from the exposed population provides a unique opportunity to assess the epigenetics and transgenerational programming of disease risks from environmental exposures, because information is available for the second and the third generations.

In addition, the mortality follow-up study on lifetime health effects that was started earlier (Bauer et al. 2005; Grosche et al. 2011) could be widened by expanding the cohort and extending the follow-up period.

Beyond radiation effects, the Registry could also serve as a good basis for studying psychological effects (Pivina et al. 2017a, b; Dyussenova et al. 2018). Between 2002 and 2012, interviews were conducted with 2330 individuals who lived in 39 settlements from different zones of radiation risk (Fig. 2). The interview included questions about fear of a loud sound or bright flash, insomnia, irritation, depression and apathy. Further studies could be conducted based on known information from the Registry.

In summary, the Registry held at SRIRME offers a unique and valuable resource to further study health effects from low-dose and low-dose-rate exposures. It is important to coordinate any ongoing or future research to avoid redundant work and to exploit synergies between different methodological approaches.