What causes childhood leukemia? This is a common question for families facing the disease and one that must be answered for any effort at preventing the most common cancer in children. Proven causes are few, and include ionizing radiation and congenital genetic disorders, with influence from genetic susceptibility modifiers. The study of these and many other candidate causes (chemical exposures, infections, fetal growth rates, diet) has not slowed an inexorable rise in leukemia rates in the last half century at 1.3 % per year, higher in specific ethnic subgroups [1]. Some aspect of modern societies is causing this increased incidence, most likely the ways we have changed our interactions with our environment—particularly infectious agents and exogenous chemicals.
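
To give a sense of scale for the 1.3 % annual rise quoted above, the short sketch below compounds that rate over time. The 50-year window is an assumption standing in for "the last half century", and the function name is purely illustrative; a constant year-on-year rate is a simplification of the registry data in [1].

```python
def cumulative_fold_change(annual_rate: float, years: int) -> float:
    """Fold change in incidence after compounding a constant annual rate."""
    return (1.0 + annual_rate) ** years


ANNUAL_RISE = 0.013  # 1.3 % per year, as quoted in the text
YEARS = 50           # assumption: "half century" taken as exactly 50 years

fold = cumulative_fold_change(ANNUAL_RISE, YEARS)
print(f"Cumulative fold change over {YEARS} years: {fold:.2f}")
```

Under these assumptions the rate compounds to roughly a 1.9-fold increase, i.e. incidence would come close to doubling over the period.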

Leukemia is a rare disease, and risk factors are typically assessed in case–control study designs. These studies are drifting towards extinction, however, primarily because barriers in recruitment make it increasingly difficult to assemble suitable control populations. The use of cohort designs, made possible by regional or national registries and databases, is a relevant and welcome practice. One gains advantages in sample size and population representativeness, but can be hampered by the lack of biological sampling and uniform pathologic classification. Several new investigations using cohort designs in this and recent issues of the European Journal of Epidemiology, as well as a new meta-analysis, address leukemogenesis in childhood. We discuss infection, followed by traffic fumes and parental age at birth, in reference to leukemia epidemiology.

The natural history of leukemia suggests that both prenatal and postnatal events are critical. The prenatal origin of both acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) was proven with the discovery that the clonotypic translocations and aneuploidies present in leukemia already exist at birth [2, 3]. These events are found at birth at a higher rate than leukemia occurs in the population at large, indicating that secondary alterations are necessary for leukemic transformation [4]. These secondary events are postnatal, and include deletions in transcription factors necessary for normal progenitor cell differentiation, causing an arrest in development. Mechanistically, these secondary events appear to arise from illegitimate “off target” activity of normal cellular enzymes that provide for antibody diversity and affinity [5, 6]. Formation of high-affinity antibodies is a normal event in resolving infections, but overstimulation of this system via infection, or an overly vigorous response to infection, may lead to increased risk of disease. The role of infection as a stimulant for these secondary postnatal events is therefore quite convincing from the viewpoint of the molecular forensics within the leukemia cell. Still up for debate is whether the identity of the infections matters, or simply the timing and frequency of exposure. Also, the initiating events (translocations and aneuploidies, often prenatal) do not display any of these hallmarks of infection causation and precede the development of adaptive immunity, emphasizing that leukemia has a multifactorial origin.

The Kinlen population mixing hypothesis posits that recent and large migrations of new populations into rural areas can facilitate the transmission of a rare leukemia-transforming virus, causing a transient increase in leukemia incidence in the recipient population [7]. The Greaves hypothesis is more general and speaks to leukemia incidence in the population as a whole: exposure to common infections early and often educates and modulates the immune system, leading to good health. Leukemia, on the other hand, can result from an immune system that is relatively sheltered early in life but then overstimulated when a barrage of infections causes a cytokine storm, causing mutations in pre-initiated cells [8]. Lupatsch et al. [9] examined a version of the population mixing hypothesis using the Swiss National Cohort, and found subtle evidence that increased population mixing suppressed leukemia incidence in urban populations. Kinlen argues that the level of in-migration assessed in this study, at least in rural areas, is not extreme enough to trigger the herd infectivity of a transforming virus under the conditions defined by his hypothesis [10]. One might argue that the Lupatsch study is more fitting as an assessment of the Greaves hypothesis, since it measures leukemia incidence averaged over hundreds of urban and rural municipalities rather than the specific scenario of extreme population mixing that Kinlen typically examines. Still, a suggestive but not significant increase in leukemia incidence among younger patients was demonstrated in rural populations with the highest levels of population mixing [9]. This is possibly a demonstration of Kinlen’s hypothesis, but diluted by the amalgamation of hundreds of districts into the comparison.
It is certainly interesting then that within the same Swiss cohort population, Spycher and colleagues subsequently reported on the existence of leukemia-specific clustering in an examination of the time–space clustering of births and diagnoses of all childhood cancer cases [11]. This significant clustering occurred by birth address (<2 km) and birthdate (<2 years), and suggested that cryptic outbreaks shortly after birth may have caused the clusters. This is particularly interesting when considered with the results from two other medical record-based studies that demonstrated increased medical care for infections during the first months after birth among children growing up to be cases, compared to controls [12, 13]. Specific infectious agents were not tested in any of these studies; clearly this field will move forward when the application of a space–time clustering assessment is combined with the availability of biological samples to identify specific infections. The best resources for such future studies include population-based cohort resources such as the International Childhood Cancer Cohort Consortium (I4C) in which repeated pre-diagnostic biological samples are available.

The intense interest in an infectious etiology for ALL is not shared by its less common counterpart, acute myeloid leukemia (AML). AML in children is thought instead to be a consequence of chemical exposures, analogous to the adult disease, but there is very little evidence for this, mainly because of the small number of cases of this rare diagnosis (about 15 % of the incidence rate of ALL in children). Despite this, there are some examples showing that infection may also play a role in AML. For example, a history of physician-diagnosed infections was strongly associated with AML in a Taiwanese study [12]. Also, breastfeeding, commonly considered to have immunomodulatory properties, was protective for AML [14]. In a new study, Crump et al. [15] examined perinatal variables and AML risk in a large Swedish cohort. The study did not restrict by age of the patient, but consists primarily of children because of the follow-up time available within the cohort. This is appropriate because, unlike ALL, the AML subtypes that occur in children occur throughout the lifespan. The study reassuringly confirmed known risk factors for AML, including fetal growth and maternal socioeconomic status, long observed in both case–control and cohort studies. Surprisingly, the study yielded a very strong season-of-birth association, implicating infectious exposures. Children born in winter had a 1.8-fold higher risk of AML compared to those born in summer, suggesting that either a maternal infection during pregnancy or, more likely, an early-life infection impacts risk [15]. As this was the first study to examine and demonstrate seasonality for AML within a large cohort, the result must be subjected to replication in other settings. Other seasonal factors apart from infection should also be considered, as pollution levels mirror the observed seasonality of leukemia [16], and AML in adults is well known to be associated with chemical and combustion exposures.
Patterns of indoor and outdoor exposure certainly vary in a country that straddles the polar and temperate climate zones, and additional aspects of seasonally variable chemical exposures might be considered for further study.

In another new study, Spycher et al. [17] examine childhood cancers in relation to highway proximity, again using the Swiss National Cohort. For all childhood cancers combined, time-to-event analyses suggested only a slight, nonsignificant increase in risk with proximity to highways. However, when focusing on individual cancers, leukemia, particularly in children diagnosed at a young age, was significantly associated with proximity to highways. This is a well-controlled study with careful consideration of other suspected risk factors, and the specificity of the result for leukemia is evidence for a true relationship and not a bias. While the sample included 532 leukemias, subtype-specific results were reported for ALL only. The hazard ratio for “all leukemia” in those <5 years of age was 2.02 and for ALL was 1.70, suggesting that the result for AML was stronger than for ALL. It is worth noting that most AMLs under 5 years of age have rearrangements of chromosome 11q23, the site of the KMT2A gene [18]; such rearrangements have long been linked to exposure to chemicals that inhibit topoisomerase II, an enzyme that decatenates DNA via double-strand breaks and can cause chromosomal rearrangements [19]. Topoisomerase II inhibitors have varying efficacy and are common in our diet (fruits, teas, soy) and in our pharmacopeia, particularly among certain chemotherapy drugs. The association shown here between leukemia and highway distance was remarkably strong and confined to those living in immediate proximity (<100 m) to the highway. Future studies should focus on the KMT2A-rearranged leukemias (both AML and ALL) and on which chemicals within traffic fumes may lie behind this association.
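
The inference that the non-ALL result must be stronger than the ALL result can be made explicit with a back-of-envelope sketch. It assumes that the all-leukemia hazard is simply the sum of the subtype hazards, so that the overall hazard ratio is a weighted average of the subtype hazard ratios, with weights equal to each subtype's share of the baseline hazard. The ALL shares used below are hypothetical placeholders, not figures from [17], and the function name is illustrative; covariate adjustment and confidence intervals are ignored, so the printed values are not estimates of any real effect.

```python
def implied_other_hr(hr_all: float, hr_subtype: float, subtype_share: float) -> float:
    """Solve hr_all = s * hr_subtype + (1 - s) * hr_other for hr_other.

    subtype_share (s) is the subtype's share of the baseline (unexposed) hazard.
    """
    if not 0.0 < subtype_share < 1.0:
        raise ValueError("subtype_share must lie strictly between 0 and 1")
    return (hr_all - subtype_share * hr_subtype) / (1.0 - subtype_share)


HR_ALL_LEUKEMIA = 2.02  # "all leukemia", age < 5 years, as quoted in the text
HR_ALL_SUBTYPE = 1.70   # ALL only, as quoted in the text

# Hypothetical ALL shares of baseline leukemia hazard (assumptions, not data)
HYPOTHETICAL_SHARES = (0.6, 0.7, 0.8, 0.9)

implied = {
    share: implied_other_hr(HR_ALL_LEUKEMIA, HR_ALL_SUBTYPE, share)
    for share in HYPOTHETICAL_SHARES
}
for share, hr in implied.items():
    print(f"ALL share {share:.1f}: implied non-ALL hazard ratio {hr:.2f}")
```

Whatever share is assumed, the implied non-ALL hazard ratio exceeds 2.02, because a weighted average of 1.70 and a second value can only reach 2.02 if the second value is larger. Note that "non-ALL" here covers AML together with any other leukemia subtypes in the cohort.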

A longstanding but not fully resolved observation among childhood leukemias is the increased risk imparted by advanced maternal and paternal age at birth. Spontaneous chromosome abnormalities clearly increase with age at pregnancy, lending credence to the hypothesis that increased parental age may impact a disease defined by chromosome abnormalities, such as leukemia. Sergentanis et al. [20] provide a new meta-analysis on this topic, including 77 studies. Advanced parental age was a risk factor for ALL but even more so for AML, and, interestingly, young parental age was a significant risk factor for AML. One unresolved issue is the independent effects of maternal versus paternal age (which are highly correlated), which might be resolved with a pooled analysis; the authors point out that international consortia such as the I4C mentioned above are focusing on exactly this topic.

In sum, several new reports provide exciting clues on long-studied candidate risk factors in childhood leukemia etiology. The most pressing need for this field of research is to understand the role of infection and immune development in a cancer of the developing immune system, as immune activation may be a modifiable target leading to approaches to leukemia prevention. Clearly, many of the genetic events in the promotion (not the initiation) of the disease are associated with aberrant immune stimulation. Progress can also be made by incorporating specific personal exposure measurements and laboratory disease classifications into creative epidemiologic study designs, leading to the construction of more comprehensive natural histories of childhood leukemia.