Introduction

Chronic myeloid leukemia (CML) is a myeloproliferative disease associated with the Philadelphia (Ph) chromosome abnormality, which results from a reciprocal translocation between chromosomes 9 and 22 [1]. The worldwide incidence ranges from less than 1.0 to more than 2.0 per 100,000 [2–4]. Despite this range in published incidence, no biologic, geographic or ethnic variation has been shown to predispose individuals to CML. The one documented exception is an increased incidence of leukemia, including CML, among survivors of the atomic bombings of Hiroshima and Nagasaki [5]. Recently, a frequency-based analysis suggested that the median age at diagnosis of CML was almost 2 decades lower in developing countries [6, 7].

As age is one of the greatest risk factors for a majority of cancers, we aimed to study how geography and income relate to age at diagnosis and incidence of CML. Our study utilized international cancer registry data published by the International Agency for Research on Cancer (IARC) in Cancer Incidence in Five Continents (CI5) to assess trends in age at diagnosis and incidence by geography and income [2].

Materials and methods

Incidence data from 196 registries of CI5, Version IX were downloaded from IARC [2]. Version IX included diagnoses made between 1998 and 2002. CML was defined according to the International Classification of Diseases for Oncology 10th edition (ICD-O-10) of the World Health Organization (C92.1, C93.1, C94.1). Multiple registries within a country were combined, resulting in 60 countries. Countries were further grouped into macro-regions and components according to the United Nations World Macro Regions and Components [8]. The World Bank estimate of gross national income (GNI) per capita was used to make comparisons across the 4 World Bank GNI analytical groupings: low income (GNI of $1,025 or less), lower-middle income (GNI between $1,026 and $4,035), upper-middle income (GNI between $4,036 and $12,475) or high income (GNI of $12,476 or more) [9]. GNI per capita for the year 2000 was calculated for each country using the World Bank Atlas method, in which GNI is converted to US dollars and divided by the midyear population [9].
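The income grouping described above can be sketched in a few lines of Python. This is only an illustration of the four thresholds quoted in the text; the function name and the example GNI values are hypothetical and are not taken from the study data.

```python
def gni_income_group(gni_per_capita_usd: float) -> str:
    """Map GNI per capita (US dollars) to the four World Bank groups quoted in the text."""
    if gni_per_capita_usd <= 1025:
        return "low"
    if gni_per_capita_usd <= 4035:
        return "lower-middle"
    if gni_per_capita_usd <= 12475:
        return "upper-middle"
    return "high"


# Hypothetical GNI values, chosen only to exercise each threshold.
examples = {value: gni_income_group(value) for value in (800, 1026, 4036, 12476)}
```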

Statistical methods

Descriptive statistics were provided for macro-geographical regions (continental), geographical sub-regions and GNI income groups. Statistics were stratified by gender due to the excess incidence of CML in males [10]. Median and interquartile range (IQR) were calculated for age at diagnosis. Age distributions were graphed using kernel density estimation to produce smoothed histograms [11]. A Gaussian kernel was used to estimate the underlying probability density function of CML conditioned on age at diagnosis [12]. Bandwidth selection utilized the plug-in formula of Sheather and Jones [13]. Age-adjusted incidence rates (age-standardized rates, ASR) were calculated using the World Standard Population and expressed per 100,000 person-years with 95 % confidence intervals (CI) estimated [14]. Incidence of CML was further stratified by age (less than 50 years vs greater than or equal to 50 years) because of the previously described early-onset CML [6]. Poisson regression was used to test for age-specific interactions and regional differences. Adjustment for multiple comparisons in the Poisson regression models used the Tukey–Kramer approach. Spearman correlation coefficients were used to estimate the correlation between age at diagnosis, ASR and GNI. Analyses were conducted using the SAS system version 9.3 (Cary, NC).
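The direct age standardization described above can be sketched as follows. This is a minimal illustration, not the SAS code used in the analysis: the function name, the two age bands, the counts and the standard-population weights are all hypothetical, and the normal-approximation confidence interval is an assumption, since the text does not state which interval method was used.

```python
import math


def age_standardized_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate with an approximate 95 % CI.

    cases, person_years and std_weights are parallel sequences, one entry
    per age band. The variance assumes Poisson-distributed case counts.
    """
    total_weight = sum(std_weights)
    rate = 0.0
    variance = 0.0
    for d, n, w in zip(cases, person_years, std_weights):
        share = w / total_weight          # weight of this age band in the standard
        rate += share * d / n             # weighted age-specific rate
        variance += share ** 2 * d / n ** 2
    se = math.sqrt(variance)
    return rate * per, (rate - 1.96 * se) * per, (rate + 1.96 * se) * per


# Hypothetical two-band example (not study data).
asr, lower, upper = age_standardized_rate(
    cases=[10, 40],
    person_years=[1_000_000, 500_000],
    std_weights=[70, 30],
)
```

With these made-up inputs the age-specific rates are 1.0 and 8.0 per 100,000, so the weighted rate is 0.7 × 1.0 + 0.3 × 8.0 = 3.1 per 100,000.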

Results

Between 1998 and 2002, 19,262 males and 14,428 females were diagnosed with CML from 196 distinct registries (Table 1). When grouped into regions, Africa and Asia had the youngest median age at diagnosis for males (47 years for both) and females (42 and 47 years, respectively), while Oceania had the highest median age at diagnosis (72 years for both males and females) (Table 1). Within Asia, the median age at diagnosis for males ranged from 47 to 52 years; within Europe, it ranged from 62 to 67 years. In the USA, Black males were diagnosed with CML more than a decade earlier than White males (54.5 vs 67.0 years). Australia had the highest median age at diagnosis (72 years) for both males and females.

Table 1 Descriptive statistics of CML diagnosis from IARC Version IX (1998–2002)

The markedly different age distribution by region can be seen graphically using kernel density estimation. A shift towards an earlier age at diagnosis in South/Central America, Asia and Africa is readily apparent (Fig. 1a, b). The distributions of cases from Europe, North America and Oceania were consistently shifted to the right, confirming that diagnoses among those registries are occurring at older ages. The distribution of diagnoses in South/Central America and Asia had a wide plateau between age 30 and 75 with minor peaks occurring between 30 and 40 years and 65 and 75 years. Patterns of CML did not vary within Europe (Supplementary Figures 1C, 2C) or Oceania (Supplementary Figures 1E, 2E). Diagnoses in Canadians and in Whites in the SEER registries had similar distributions, while Blacks from SEER exhibited a younger age at diagnosis. The distribution of CML varied greatly within Asia: in South Asia, a majority of CML cases in both males and females occurred by age 50, while Western Asia had a higher density in the older age-groups (Supplementary Figures 1B, 2B).
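A Gaussian kernel density estimate of this kind can be sketched in plain Python. This is an illustration only: the ages are hypothetical, the function name is made up, and a fixed bandwidth is used as a stand-in for the Sheather and Jones plug-in selector named in the methods, which is not implemented here.

```python
import math


def gaussian_kde(ages, bandwidth):
    """Return a function estimating the density of `ages` with a Gaussian kernel."""
    n = len(ages)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observed age, centred on that age.
        return norm * sum(math.exp(-0.5 * ((x - a) / bandwidth) ** 2) for a in ages)

    return density


# Hypothetical ages at diagnosis (not study data); the bandwidth of 5 years is arbitrary.
ages = [34, 41, 47, 52, 58, 63, 66, 71]
density = gaussian_kde(ages, bandwidth=5.0)

# Evaluate the smoothed histogram on a grid of ages, e.g. for plotting.
curve = [(age, density(age)) for age in range(0, 101, 5)]
```

A smaller bandwidth would produce a more jagged curve and a larger one a smoother curve, which is why a data-driven selector is normally preferred over a fixed value.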

Fig. 1 Distribution in age at diagnosis of CML for males and females from IARC Version IX (1998–2002) by region and macro-region. Kernel density estimation (smoothed histogram) of the age at diagnosis for males (a) and females (b)

Age-standardized incidence varied by sex and region as shown in Table 1. Incidence rates were lowest among Asian females (0.55 per 100,000, 95 % CI, 0.52–0.57). Incidence was highest among the Oceanic countries for both males and females (1.78 per 100,000; 95 % CI, 1.68–1.87 and 0.96 per 100,000; 95 % CI, 0.89–1.03, respectively). Incidence of CML varied within Asia with South Asian males exhibiting the highest incidence of CML (0.95 per 100,000, 95 % CI, 0.88–1.02) (Table 1). Interestingly, incidence of CML did not vary within Europe or within North America. Australia had the highest rates of CML for both males and females (Table 1). Incidence of CML was lower in Central America as compared to South America (Table 1).

We then investigated if the incidence of early-onset CML (defined as less than 50 years) differed by region. A significant age-specific interaction existed by region (p < 0.0001) where differences by region were more marked in the late-onset CML as shown in Fig. 2. As an example in males, the incidence of early-onset CML ranged from 0.33 per 100,000 (95 % CI, 0.23–0.42) in Africa to 0.59 per 100,000 (95 % CI, 0.52–0.67) in South/Central America while the incidence of late-onset CML ranged from 1.77 per 100,000 (95 % CI, 6.31–7.09) in Africa to 6.70 per 100,000 (95 % CI, 6.31–7.09) in Oceania. Age-specific interactions by macro-region were also observed as shown in Fig. 3a, b (p < 0.0001 for both males and females). For example, within Asia, South Asia had the highest incidence before age 50 while Western Asia had the highest incidence in the over 50 age-group. South American registries exhibited the highest incidence of early-onset CML while Australian registries exhibited the highest incidence of late-onset CML (Fig. 3a, b).

Median age at diagnosis increased as the GNI increased (Table 1). Among males, the median age at diagnosis in the lower-middle income group was 42 years (IQR 27–57 years) increasing to 67 years (IQR 47–77 years) in the high-income group. Females followed a similar trend with respect to age at diagnosis. Among males, ASR increased as GNI increased with the high-income countries having an ASR of 1.15 per 100,000 (95 % CI, 1.13–1.17). The variation in incidence and income categorization was not seen among females. Registries from lower-middle income countries had a higher incidence of early-onset CML while high-income countries had a higher incidence of late-onset CML (Fig. 4).

Fig. 2 Age-adjusted incidence stratified by age (less than 50 years vs 50 years or older at diagnosis) by region. Displayed are the age-specific incidence rates for males (a) and females (b). Bars represent the age-specific rates, with 95 % confidence intervals as error bars. p values from the Poisson regression model for males are as follows: age <50 vs ≥50, p < 0.0001; region, p < 0.0001 and age × region interaction, p < 0.0001. p values from the Poisson regression model for females are as follows: age <50 vs ≥50, p < 0.0001; region, p < 0.0001 and age × region interaction, p < 0.0001

Fig. 3 Age-adjusted incidence stratified by age (less than 50 years vs 50 years or older at diagnosis) by macro-region. Displayed are the age-specific incidence rates for males (a) and females (b), sorted by ascending age-specific incidence at ≥50 years. Bars represent the age-specific rates, with 95 % confidence intervals as error bars. p values from the Poisson regression model for males are as follows: age <50 vs ≥50, p < 0.0001; region, p < 0.0001 and age × region interaction, p < 0.0001. p values from the Poisson regression model for females are as follows: age <50 vs ≥50, p < 0.0001; region, p < 0.0001 and age × region interaction, p < 0.0001

Fig. 4 Age-adjusted incidence stratified by age (less than 50 years vs 50 years or older at diagnosis) by income classification. a Age-adjusted incidence for males diagnosed at <50 years of age. b Age-adjusted incidence for males diagnosed at ≥50 years of age. c Age-adjusted incidence for females diagnosed at <50 years of age. d Age-adjusted incidence for females diagnosed at ≥50 years of age

Because population age structures differed, we next examined the relationship between age at diagnosis and incidence. A higher age-standardized incidence rate among males was associated with a higher mean age at diagnosis (Spearman correlation coefficient = 0.5311, p < 0.0001), although there was no relationship among females (Spearman correlation coefficient = 0.1920, p = 0.1349; Fig. 5a, b). Furthermore, a higher GNI was associated with a higher mean age at diagnosis for males (Spearman correlation coefficient = 0.4488, p = 0.0008) and females (Spearman correlation coefficient = 0.5093, p = 0.0001) (Fig. 5c, d). However, the relationship between age-standardized incidence rate and GNI was not statistically significant for males (p = 0.21) or females (p = 0.89) (data not shown).
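For readers unfamiliar with the statistic, a Spearman correlation coefficient is the Pearson correlation of the ranks of the two variables. The sketch below illustrates this in plain Python with average ranks for ties; the function names and the country-level values are hypothetical, and the p values reported above are not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Hypothetical country-level values (not study data).
mean_age = [45.0, 52.0, 58.0, 63.0, 66.0]
gni = [900.0, 3000.0, 2500.0, 15000.0, 30000.0]
rho = spearman(mean_age, gni)
```

Because only ranks are used, the coefficient is unaffected by the heavily skewed scale of GNI, which is one reason a rank-based measure suits this comparison.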

Fig. 5 Correlation of mean age at diagnosis, gross national income (GNI) and age-standardized incidence rate (ASR) from IARC Version IX (1998–2002) by sex. Mean age at diagnosis vs ASR is shown for males (a) and females (b). Mean age at diagnosis vs GNI is shown for males (c) and females (d). Dots represent each country

Discussion

Population-based data assessing geographic and income variations in the incidence of CML are limited, but warranted given the relatively wide ranges in incidence and the differences in age at diagnosis previously observed [1–3]. Through this analysis, we observed that age at diagnosis and incidence of CML varied by region and within region. For males, age at diagnosis and incidence increased as the GNI categorization increased. Furthermore, high rates of early-onset CML did not correlate with high rates of late-onset CML and resulted in an age-specific interaction. These findings are consistent with the younger age at diagnosis among patients in developing countries who were treated with imatinib for CML as part of the Glivec International Patient Access Program (GIPAP) [6, 7]. Interaction between geography and age may be an indicator for an exposure and/or etiological factors that are currently unknown. The finding of regional differences in age at diagnosis and incidence suggests that an environmental factor may be influencing the pattern of CML. In the past, agriculture and occupational exposures have been considered as potential etiological factors, but a strong association has yet to be established [15–18]. A genetic component that is triggered by an environmental exposure is another hypothesis that should be further explored, especially since CML is caused by the reciprocal translocation of chromosomes 9 and 22 [19]. The possible influence of race/ethnicity on our observations is raised by the observation of a slightly younger age of onset in US Blacks vs. Whites, but more detailed studies need to be performed to disentangle the potential confounding of income differences in these populations. Detailed information on race/ethnicity is not available in this IARC data set other than for the US, but several observations argue against a major influence of genetics, including the much greater difference in incidence and age of onset between US Blacks and sub-Saharan Africans compared to the difference between US Blacks and Whites. Furthermore, significant differences are observed among predominantly Caucasian countries, with Oceania being significantly different from Europe and E. Europe being significantly different from W. Europe. One surprising and potentially important observation is the difference in international incidence rate variations between patients less than 50 years of age vs. those age 50 and older. The small variation in patients under 50 argues against a major role for genetic predisposition, whereas the much larger variation in those age 50 and above not only supports a strong environmental influence but also suggests an important risk factor related to affluence.

There is a paucity of data on the epidemiology of CML in developing countries. Our observation of a younger age at diagnosis is consistent with other observations from Asia, with median age at diagnosis ranging from 31 to 49 years [6, 17, 20–27]. A separate analysis concluded that the incidence of CML among select Asian countries was lower than in Western countries and that CML tended to affect a younger population [4]. Data on the epidemiology of CML in South and Central America are also limited but suggest that CML is diagnosed at an earlier age, consistent with data from CI5 [28, 29].

The excess risk of CML in Australia warrants further discussion. It was previously noted that the incidence of CML is higher in South Australia as compared to international estimates [30]. Incidence of CML (ICD-10 C92.1) estimated by the Australian Institute of Health and Welfare in 2000 was 1.2 per 100,000 for males and 0.7 per 100,000 for females [31]. However, this is lower than what we observed in CI5 (males: 1.8 per 100,000 and females: 0.97 per 100,000). Differences between the analyses can be directly related to the different coding of CML; IARC did not differentiate Ph+/BCR-ABL+ CML from Ph−/BCR-ABL− CML, chronic myelomonocytic leukemia (CMML) or sub-acute myeloid leukemia [3]. The finding of a higher incidence of CML within Australia is consistent with the observation made by Jemal et al. that Australia had the highest age-standardized incidence for all cancers [32]. Regional and gender-specific differences within Australia warrant further research.

Our analysis is based on the collation of population-based registry data collected between 1998 and 2002. IARC has built-in mechanisms to review and provide the highest quality data possible. However, IARC does caution about comparing trends among registries that have varying levels of resources dedicated to cancer registration [2]. That being said, CI5 is considered to provide the best estimate of global trends in cancer incidence and thus is well suited to study trends in age at diagnosis and incidence. An important limitation of the CI5 data is the lack of individual-level risk factor assessments. Furthermore, as noted above, only a limited number of registries have specifically categorized data by ethnicity, thereby limiting our ability to generalize to specific ethnic groups. Future analyses should consider including ethnicity; such analyses are currently in progress in the US.

Gender differences in age at diagnosis and incidence need to be better understood. The current paradigm is that the incidence of CML is higher among males than among females, but we observed that the patterns of CML differed between males and females [1]. For example, GNI categorization was consistently associated with both age at diagnosis and incidence for males, but only related to age at diagnosis for females. The interrelationships between GNI, age at diagnosis and incidence should be further explored. This has important implications in elucidating exposure mechanisms and may point to a possible environmental exposure that may be related to an occupational exposure.

In conclusion, variation in age at diagnosis and incidence occurs within regions and between regions as well as by GNI. Geographical variation may be a surrogate for a possible exposure mechanism such as pesticides which may be tied into GNI as a surrogate for socioeconomic status.