Introduction

Acute lymphoblastic leukemia (ALL) accounts for 25–30% of all cancers in children [1]; yet our understanding of the etiology of the disease is rather limited [2]. Individual studies and large consortia, such as the Childhood Leukemia International Consortium (CLIC) [3], are exploring a constellation of factors related to the perinatal origins of the disease [4,5,6], including birth anthropometrics [7], early immune stimulation [8], prenatal vitamin supplementation [9], and pre-labor cesarean delivery [10, 11].

The sharply increasing proportion of parents with advanced age at first delivery has attracted scientific interest due to the reported consequences for offspring health [12,13,14]. Indeed, advanced maternal age has been linked to several adverse pregnancy outcomes [15], including an increase in the risk of chromosomal abnormalities in the offspring [16]. Albeit less studied, advanced paternal age has also been associated with single gene mutation birth defects, chromosomal abnormalities and neurodevelopmental disorders in offspring [17]. Genomic sequencing studies have shown higher numbers of de novo mutations in the offspring of older parents [18, 19] and decreased DNA methylation patterns [20], potentially increasing offspring vulnerability to carcinogenesis [20, 21].

In the context of the current CLIC study, a meta-analysis was undertaken [22], showing positive associations of advanced age of both parents at birth of the index child with ALL in the offspring, irrespective of study design. Subsequently, registry-based, record-linkage nested case–control (NCC) studies from the US and Denmark also reported an increased ALL risk with advanced maternal age, whereas the positive associations with older paternal age were marginally significant [23,24,25,26]. Incomplete control for confounding, variable treatment of the paternal and maternal age variables, collinearity between maternal and paternal age, and limited power preclude, however, firm conclusions [27].

To this end, we used primary data from 16 CLIC studies conducted in 12 countries around the world to explore the association of parental age with childhood ALL. Given indications of non-representative control selection in CC studies, resulting in potentially misleading effect estimates regarding parental age [27], we compared data from 11 CC studies with those derived from 5 population-based cancer registries linked with birth and health registries following a nested case–control (NCC) design.

Methods

Study designs and availability of data

Primary data were contributed by 15 studies participating in CLIC, following data transfer agreements of individual studies with the Nationwide Registry for Childhood Hematological Malignancies and Solid Tumors (NARECHEM-ST) (Supplementary Table 1). Specifically, 11 were of CC design, entailing subject contact, recruitment and telephone or in-person interviews to obtain exposure- and disease-related information (Brazil, Costa Rica, Egypt, Germany, Greece, Italy, New Zealand, UK, US-California, US-COG-E15 and US-Texas), and the additional four were of NCC design, with population-based linked cancer and birth/health registry data from which controls were drawn (Canada-Quebec, Denmark, Finland, Washington State).

Lastly, the Californian State NCC contributed only maximally adjusted summary estimates for the meta-analyses, and not primary data, due to regulatory constraints of the California Cancer Registry, making a total of 16 studies for analyses. Cases and controls were aged < 15 years at diagnosis/recruitment. Down syndrome (~ 1.3% of cases), a well-established risk factor for childhood ALL strongly associated with maternal age at birth, was an exclusion criterion for the selection of controls in CC studies, and cases with Down syndrome were excluded from the analyses [28]. Parents are usually the legal guardians whose age is reported, but they are not necessarily the biological parents; it is rather unlikely, however, that the negligible proportion of non-biological parents could affect the results on the association of parental age with childhood ALL [23]; indeed, available information in the nationwide Danish study shows that only 0.6% of children are adopted. Data collection and harmonization are detailed in a Supplementary Materials file.

Statistical analysis

To examine the relationship between paternal and maternal age and risk of childhood ALL, fractional polynomials were used to ascertain the best-fitting curves across the pooled dataset; additionally (Fig. 1), restricted cubic spline models were applied using meta-analysis-derived effect estimates [29]. Since linear relationships could not be improved upon (p > 0.10) for either maternal or paternal age when examined separately or concurrently (data not shown), we primarily included paternal and maternal age variables in 5-year increments. To address collinearity between the two main variables of interest, paternal and maternal age were included in alternative models one by one and simultaneously. In addition, concordant and discordant pairs of three by three parental age categories (< 25 [reference], 25–34, ≥ 35 years) were created; out of these nine cells, due to small numbers two of the discordant cells had to be collapsed in order to run meta-analyses of multiple logistic regression derived estimates of individual studies, as appropriate.
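
The age coding described above can be sketched as follows. This is a minimal illustration, assuming ages are recorded in years; the function names are hypothetical, and because the text does not specify which two discordant cells were collapsed, no collapsing is shown.

```python
from collections import Counter


def age_category(age):
    """Map an age in years to one of the three categories used in the text."""
    if age < 25:
        return "<25"
    if age < 35:
        return "25-34"
    return ">=35"


def parental_age_cell(maternal_age, paternal_age):
    """Return the (maternal, paternal) cell of the three-by-three table."""
    return (age_category(maternal_age), age_category(paternal_age))


def five_year_units(age):
    """Rescale age so that a one-unit change corresponds to five years."""
    return age / 5.0


def cell_counts(pairs):
    """Count (maternal_age, paternal_age) pairs per cell to spot sparse cells."""
    return Counter(parental_age_cell(m, p) for m, p in pairs)
```

Entering `five_year_units(age)` as the covariate in a logistic model makes the fitted odds ratio refer to a 5-year increment, and `cell_counts` gives a quick way to see which discordant cells are too sparse to analyze on their own.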

Fig. 1

Forest plots from the meta-analyses of case–control (CC, interview-based) and nested case–control (NCC, registry-based, record-linkage) studies on the association of (A) paternal and (B) maternal age (5-year increments) with childhood (0–14 years) acute lymphoblastic leukemia. Random-effects meta-analysis of maximally adjusted odds ratios from individual studies for any of the following variables that were available (< 20% missing values in the total dataset): index child’s age (categorical; < 1, 1–4 [reference], 5–9, 10–14 years), sex, ethnicity (Caucasian vs. non-Caucasian), birth weight (continuous; 500 g increment), maternal education (categorical; low, intermediate [reference], high), pre-term birth (yes vs. no), maternal smoking during pregnancy (yes vs. no), multiple pregnancy (yes vs. no) and birth order (continuous; 1, 2, ≥ 3). Studies are presented in ascending order according to the mean maternal and paternal age. Maternal and paternal age are simultaneously introduced in all models.

Two separate meta-analyses were undertaken by study design (CC, NCC) employing random-effects models; heterogeneity across studies was evaluated with the Cochran Q and I² statistics (statistical significance set at p value < 0.10, derived from the Cochran Q test). The individual risk estimates were calculated in maximally adjusted multiple logistic regression models (variables with > 20% missing values in individual studies were excluded from the study-specific multivariate models). Whether conditional or unconditional analyses were used depended on the individual study design, whereas maternal and paternal age variables were initially included concurrently in the models for these analyses. Furthermore, sensitivity analyses were undertaken by excluding one study per analysis to assess the influence of each study on the maternal and paternal age estimates.
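
The pooling and the leave-one-out sensitivity analysis can be illustrated with a short sketch. The text only states that random-effects models were used; the DerSimonian-Laird estimator below is an assumption, as are the function names, and the standard errors are back-calculated from 95% confidence intervals. The p value of the Cochran Q test is omitted because it needs a chi-square distribution function.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def random_effects_pool(odds_ratios, ci_bounds):
    """DerSimonian-Laird pooling (assumed estimator) of study-level ORs.

    Returns the pooled OR, its 95% CI, Cochran Q, I^2 (percent) and tau^2.
    """
    y = [math.log(o) for o in odds_ratios]
    v = [se_from_ci(lo, hi) ** 2 for lo, hi in ci_bounds]
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se = math.sqrt(1.0 / sw_re)
    return {
        "or": math.exp(mu),
        "ci": (math.exp(mu - Z_95 * se), math.exp(mu + Z_95 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


def leave_one_out(odds_ratios, ci_bounds):
    """Re-pool the estimates after dropping each study in turn."""
    results = []
    for i in range(len(odds_ratios)):
        ors = odds_ratios[:i] + odds_ratios[i + 1:]
        cis = ci_bounds[:i] + ci_bounds[i + 1:]
        results.append(random_effects_pool(ors, cis))
    return results
```

When the study estimates agree closely, tau² is truncated at zero and the result coincides with the inverse-variance fixed-effect estimate; comparing the entries returned by `leave_one_out` shows how much any single study drives the pooled odds ratio.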

Pooled multivariate logistic regression analyses using primary data of the 15 studies, along with meta-analyses by parental sex, were also employed. Based on the availability of covariates across individual studies, a partially (child’s age, sex, ethnicity, time period at diagnosis/recruitment, birth weight and maternal education) adjusted and a maximally (additionally controlling for maternal smoking during pregnancy, pre-term birth, birth order and multiple pregnancy) adjusted model were constructed, with further analyses by study site. Breastfeeding was not included in the main models, as it was 100% missing in Denmark and Finland and 75% missing in Washington State, thus making the analysis of the NCC studies not meaningful; in an additional sensitivity analysis, we further adjusted for breastfeeding, including only the studies in which this variable was available.

Subgroup meta-analyses by child’s age group (< 1, 1–5, 6–14 years), sex, time period of diagnosis/recruitment and child’s ethnicity, to assess specific impacts of these variables on the reported effect, were conducted only among the NCC studies, which are unlikely to be subject to selection bias. Likewise, to assess the effect of potentially unmeasured confounding, the E-value was estimated [30], based on maximally adjusted effect estimates for categories of maternal and paternal age on the risk for childhood ALL. E-values indicate the size of the effect estimate that potentially unmeasured or uncontrolled confounding would require to totally attenuate the observed associations. Statistical analyses were conducted with SAS version 9.4 and Stata version 14.1.

Results

Baseline characteristics

The 11 CC studies contributed data for 7919 cases and 12,942 controls, whereas the 5 NCC studies contributed data for 8801 cases and 29,690 controls. The enrollment periods at diagnosis of cases or recruitment of controls spanned almost 50 years (1968–2015) and varied widely within and across studies. The distribution of study variables by case–control status and study design is presented in Table 1. The majority of subjects were of Caucasian origin, notably ~ 80% in the CC studies and ~ 60% in the NCC studies, among which the Californian investigation weighed more heavily. The distribution of maternal and paternal age at birth of the controls was highly variable across studies, as shown in Supplementary Fig. 2.

Table 1 Distributions of cases with acute lymphoblastic leukemia (ALL) and controls by study variables and study design

Meta-analysis by study design (CC = 11 and NCC = 5)

Figure 1 shows results from random-effects meta-analyses on the association of parental age (5-year increments) with childhood ALL derived from separate models of CC and NCC studies. Regarding the paternal age association, similar results were observed regardless of study design (CC: OR 1.05, 95% CI 1.00–1.11; I²: 29%, p = 0.17; NCC: OR 1.04, 95% CI 1.01–1.07; I²: 0%, p = 0.86). The heterogeneous results for the maternal age association derived from CC studies (OR 0.99, 95% CI 0.91–1.07; I²: 64%, p = 0.002) differed from those expected and from those actually derived from NCC studies (OR 1.05, 95% CI 1.01–1.08; I²: 0%, p = 0.64).

The categorical meta-analyses (Supplementary Fig. 3) demonstrated similar results. These meta-analysis-derived associations also followed linear patterns, as indicated by the spline models (Fig. 2), with higher effect estimates when parental ages > 35 years were compared to those < 25 years. Similar results were also obtained when analyses were repeated introducing only the “maternal” or only the “paternal” age variable into the models (data not shown).

Fig. 2

Curves depicting the association of (A) paternal and (B) maternal age with childhood (0–14 years) acute lymphoblastic leukemia, as derived from meta-analysis restricted cubic spline models encompassing the five registry-based case–control studies (Canada-Quebec; Denmark; Finland; US, California State, CCLRP; US, Washington State). The solid line depicts the effect estimate (odds ratio), whereas the dashed lines correspond to 95% confidence intervals.

After excluding one study at a time, the incremental effect of both paternal and maternal age on the risk for childhood ALL remained essentially the same in all analyses among the NCC studies but did not reach statistical significance after excluding the large Californian NCC study (OR for maternal age: 1.04, 95% CI 0.98–1.10; OR for paternal age: 1.05, 95% CI 0.99–1.10; Supplementary Fig. 4).

Pooled analyses and meta-analyses of 15 studies contributing primary data

Pooled analyses were also conducted for the 15 studies with primary data (Supplementary Table 3), notably all apart from the Californian NCC, which contributed only effect estimates for the meta-analyses. Regarding paternal age, the linearly increasing risk of ALL was evident (5-year increment; maximally adjusted OR 1.08, 95% CI 1.04–1.11) and was highest (a 17% increase) for paternal age ≥ 35 years (OR 1.17, 95% CI 1.04–1.32). Similar patterns were found in the categorical maximally adjusted analyses as well as in the partially adjusted models, which included higher numbers of cases and controls.

Advancing maternal age (5-year increment) was associated with a statistically significant decreased risk for childhood ALL (maximally adjusted OR 0.92, 95% CI 0.89–0.96). Further adjustment for study site, alternative introduction of the maternal or paternal age variables in the models, and further adjustment for breastfeeding, restricted to the studies in which this variable was available, did not essentially change the results (data not shown).

The meta-analysis for all studies with primary data, i.e., all except the Californian NCC (Supplementary Table 3, right panel, and Supplementary Fig. 5), confirmed the increased risk for childhood ALL with advancing paternal age (OR per 5-year increment 1.05, 95% CI 1.02–1.09; no heterogeneity), but not with advancing maternal age (OR per 5-year increment 1.00, 95% CI 0.95–1.06; statistically significant heterogeneity).

Combined maternal and paternal age effects

Due to the discrepant results for maternal age derived from the CC studies, all further analyses were conducted only among the five NCC studies. In Table 2, we further assessed the individual and/or combined effects of maternal and paternal age at birth within different maternal-paternal age combinations, relative to children whose parents were both aged 25–34 years at birth of the index child. The highest statistically significant OR was observed for children with both parents aged ≥ 35 years (OR 1.16, 95% CI 1.04–1.28), as contrasted to those with both parents aged < 25 years (OR 0.84, 95% CI 0.77–0.91), with comparison to the baseline group of 25–34 years in both instances. ORs for other age combinations did not indicate notable changes in ALL risk. There is a suggestion, however, that older paternal age across all maternal age categories was associated with increased disease risk in the offspring, whereas the same pattern is not clear for advanced maternal age across the paternal age categories.

Table 2 Meta-analysis derived odds ratios (OR) and 95% confidence intervals (95% CI) from the five registry-based case–control studies (Canada-Quebec; Denmark; Finland; US, California State, CCLRP; US, Washington State) on the association of the combined effect of maternal and paternal age at birth of the index child with childhood (0–14 years) acute lymphoblastic leukemia

The maximally adjusted effect estimates for maternal and paternal age ≥ 35 years (ORs 1.16 and 1.18, respectively) in the meta-analyses of registry-based studies corresponded to E-values of 1.59 and 1.64, respectively; the respective E-values for the lower 95% confidence limits were 1.28 and 1.24.
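
As a rough check on these figures, the sketch below applies the standard E-value formula for a ratio estimate, E = RR + sqrt(RR × (RR − 1)), directly to the odds ratios. Treating the OR as an approximation of the risk ratio is an assumption made here (reasonable for a rare outcome such as childhood ALL, but not stated in the text), and the function names are illustrative.

```python
import math


def e_value(estimate):
    """E-value for a ratio estimate, using the risk-ratio formula.

    Estimates below 1 are inverted first so protective and harmful
    associations are handled symmetrically.
    """
    if estimate < 1.0:
        estimate = 1.0 / estimate
    return estimate + math.sqrt(estimate * (estimate - 1.0))


def e_value_ci(estimate, lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval includes the null value of 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    limit = lower if estimate > 1.0 else upper
    return e_value(limit)
```

With the reported estimates, `e_value(1.16)` and `e_value(1.18)` round to 1.59 and 1.64, in line with the values given above.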

Subgroup analyses: age at diagnosis, sex, ethnicity, and diagnosis time period

The associations of ALL with maternal and paternal ages were most marked for children diagnosed at ages 1–5 years (Table 3). Associations with maternal age were equally present for male and female children and were more marked among non-Caucasian children. Associations with paternal age were more marked among males and Caucasian children. Associations by time period of diagnosis/recruitment were all modestly increased; not all ORs were statistically significant, although any increased risk with maternal age seemed to have preceded that with paternal age in time.

Table 3 Meta-analysis-derived odds ratios (OR) and 95% confidence intervals (95% CI) on the association of parental age at birth of the index child with childhood (0–14 years) acute lymphoblastic leukemia in sub-analyses by index child’s age group, sex, ethnicity, and time period of diagnosis/recruitment, as determined by the 5 registry-based case–control studies (Canada-Quebec; Denmark; Finland; US, California State, CCLRP; US, Washington State)

Discussion

We found a linearly increasing and statistically significant risk for childhood ALL with advanced paternal age. An association of the same size with advanced maternal age was evident only in the NCC studies, as opposed to a decreasing risk estimate derived from both the CLIC CC studies and the pooled analyses. There are some indications that the effect is mainly conferred by advanced paternal age, possibly through different parental gender-related mechanisms, as implied by the differential age, gender and ethnic group associations.

Reasons for the contradictory results, confined only to the maternal age association, may include selection bias resulting in a non-representative distribution of controls in the CC studies, as noted in the previously published German study [27] and also evident in the distribution of maternal age across several of the larger included CC studies. Indeed, the maternal age distribution of the UK study also suffered a deficit among control mothers of younger age. The mean maternal age of controls in the Californian CC study was ~ 2 years older compared to that of the NCC study (29.3 vs. 27.4 years) in the same State; no such difference (28.2 vs. 27.8 years) was noted, though, among the CC cases, which comprised a fraction of the NCC study cases, collected over a lengthier study period.

Not all CLIC case–control studies are subject to the same bias, however. For example, the Greek NARECHEM-ST maternal age distribution among controls seemed to follow the nationwide estimates. Likewise, the Italian SETIL study followed the population pattern and seemed to yield results similar to those of the previously published cohort study in the same area [31]. Unlike NCC studies, CC studies can additionally be subject to recall bias; it is considered implausible, however, that there might be differential recall in the age of both parents at index child’s birth [32].

The distribution of maternal and paternal age varied widely across CLIC studies (Fig. 1). The heavy weight towards older ages in some studies possibly reflects socioeconomic and cultural variations in the underlying populations. This may have resulted in a deficit of variance in the parental age distribution in some CC studies, such as NARECHEM-ST or SETIL, which could possibly explain the null maternal age associations noticed in these two studies. Temporal variations of the age distributions within individual CLIC studies reflected dramatic increases in parental age at first delivery in recent decades [13, 14]; they were more prominent in studies with lengthy collection periods and were, therefore, taken into account in the analyses.

The recently described E-value was used to assess unmeasured confounding among the NCC studies [30]. To fully explain away the observed effect estimates for both maternal and paternal age, an unmeasured confounder would need to be associated with the risk of childhood ALL at an effect estimate of about 1.6, which is considered quite high given the magnitude of the associations with perinatal factors that have already been described in the literature.

Whether solely advanced paternal age, solely advanced maternal age or both contributed to the observed positive association with ALL is difficult to tease out, given the correlation between maternal and paternal age (Pearson coefficient: ~ 71%) and the fact that the results remained nearly the same in all analyses. Moreover, the numbers in the extreme parental age discordant cells were too limited to allow firm conclusions on the seemingly higher contribution of advanced paternal compared to maternal age to ALL risk. Lastly, information on genetic markers and maternal risk factors such as alcohol consumption [33] or maternal diabetes [34] was not contributed by the majority of studies, so the underlying pathophysiological mechanisms could not be explored further. Similarly, information on breastfeeding, a proposed protective factor against childhood ALL [35], was missing in 3 out of 5 NCC studies, thus precluding meaningful analyses; nevertheless, sensitivity analyses restricted to studies with this information available showed similar results.

The sub-analyses revealed a more marked effect of both paternal and maternal age in the age group 1–5 years. This might be expected given that infant leukemia (< 1 year at diagnosis) is characterized by distinct clinical and cytogenetic features and is assumed to have a distinct etiology compared to leukemia in older children [36, 37], whereas in older children the potential effect of perinatal factors on leukemogenesis might be attenuated. Furthermore, we found the effect of paternal age to be rather confined to males. Although gender differences and a higher susceptibility of males to childhood leukemia have previously been described [1], this finding requires further investigation.

Several outcomes in the offspring, including chromosomal abnormalities [16, 17], neurodevelopmental disorders [38, 39], psychiatric diseases or conditions [40, 41] and cancer [25], have been associated with older parental age. Indeed, accumulation of de novo genetic mutations in the germ cells of older fathers [18, 19] could increase childhood cancer risk in the offspring [42, 43]. Older maternal age has been related to DNA methylation changes in the offspring that are correlated with cancer, as shown in an epigenome-wide association study [20]. Moreover, the well-established association of older maternal age with chromosomal abnormalities and birth defects [44, 45], which are in turn associated with ALL [46,47,48], could possibly mediate the observed effect. Of note, in the current study, children with Down syndrome, who are more likely to develop the disease, were excluded. Lastly, we have also shown in previous publications of CLIC studies that cesarean delivery, and specifically elective and not emergency cesarean section, which is more likely among children born to older mothers and consequently older fathers, is associated with childhood ALL [49, 50].

The sharp increase of advanced parental age at childbearing worldwide during the last decades seems to have attracted scientific interest due to its public health implications [13, 14]. Following previous studies investigating whether the temporal increase in childhood ALL rates in the developed countries [1, 51,52,53] could be partially attributed to advanced paternal age patterns [25, 31], several CLIC studies participating in the current analyses have published individual data since the expression of interest and the meta-analysis on published studies [23, 24, 26, 27].

The strengths of the present study include access to large numbers of primary case and control data and most requested covariates, along with the availability of two efficient study designs, which allowed us to explore the robustness of the observed associations in several sub-analyses testing a hypothesis on the etiology of a rare disease [54]. Limitations of the study include the divergent data collection methods for cases and controls; the lengthy periods of data collection, which varied by individual study, for the main variables of interest, which showed increasing trends over time within each individual study; and the high levels of missing values in several essential covariates, which led to a considerable decrease of the effective sample size in the maximally adjusted analyses and in the efforts to disentangle the collinear paternal and maternal age, and possibly led to heterogeneous results in some instances. Lastly, the missing ALL immunophenotype and cytogenetic data in the majority of the studies, especially the NCC studies, precluded further analyses.

In conclusion, this is the largest study to date using primary data to further explore the association of parental age at birth with childhood ALL. Our results confirm those from the meta-analysis of published studies and more recent reports demonstrating that advanced parental age is associated with increased disease risk, and show that the associations are most marked in the age group 1–5 years. It is possible that advanced parental age confers the effect through different parental gender-related mechanisms, as indicated by the maternal and paternal age associations differing by age, gender and ethnic group of the index child. Indeed, de novo genetic mutations in the fathers’ germ cells and epigenetic alterations in the offspring born to older mothers could explain the observed associations. Subtype analyses by cytogenetic characteristics and immunophenotype could further refine our understanding of the mechanisms through which advanced parental age is implicated in leukemogenesis among children.