Abstract
Arsenic is a bladder carcinogen though less is known regarding the specific temporal relationship between exposure and bladder cancer diagnosis. In this study, we modeled time-varying mixtures of arsenic exposures at many historic temporal windows to evaluate their association with bladder cancer risk in the New England Bladder Cancer Study. We used arsenic exposure estimates up to 60 years prior to study entry and compared the goodness of fit of models using these mixtures to those using summary measures of arsenic exposures. We used the Bayesian index low rank kriging multiple membership model (LRK-MMM) to estimate the associations of these mixtures with bladder cancer and estimate cumulative spatial risk for bladder cancer using participants’ residential histories. We found consistent evidence that modeling arsenic exposures as a time-varying mixture provided better fit to the data than using a single arsenic exposure summary measure. We estimated several positive though not significant associations of the time-varying arsenic mixtures with bladder cancer having odds ratios (ORs) of 1.03–1.14 and identified many significant and positive associations for an interaction among those who consumed water from a private dug well (ORs 1.28–1.60). Arsenic exposures 40–50 years before study entry received elevated importance weights in these mixtures. Additionally, we found two small areas of elevated cumulative spatial risk for bladder cancer in southern New Hampshire and in south central Maine. These results emphasize the importance of considering time-varying mixtures of exposures for diseases with long latencies such as bladder cancer.
1 Introduction
Bladder cancer is a relatively common cancer and particularly so in industrialized nations [1]. It is expected to be the fourth-leading cancer in terms of incidence and eighth in terms of mortality among males in the United States in 2023 [2]. While the primary risk factor for bladder cancer is tobacco smoking [3], environmental and occupational exposures are other important risk factors, the latter of which accounts for 20–25% of all bladder cancer cases in men [4]. One notable environmental exposure for bladder cancer is arsenic, a naturally occurring metalloid that has been established as a bladder carcinogen by the International Agency for Research on Cancer [5]. Humans are exposed to arsenic primarily through drinking water in addition to exposure from contaminated soil and air.
Several studies have examined the arsenic-bladder cancer relationship across a variety of settings and study designs. For example, a large prospective cohort study in Chile identified significant relative risks for bladder cancer among residents of a region whose drinking water had high levels of arsenic [6]. Additionally, the New England Bladder Cancer Study (NEBCS) was a population-based case–control study conducted in Maine, New Hampshire and Vermont that was designed to identify the reasons for the elevated bladder cancer mortality rates in northern New England since at least the mid-twentieth century [7]. Arsenic was a natural candidate risk factor to investigate in this region, as its geology leads to low-to-moderate levels of arsenic in well water and a large proportion of the population use private wells as their primary source of drinking water. The NEBCS found a statistically significant trend in risk occurring with increasing drinking water intake from all sources and among participants with exclusive use of shallow dug wells before 1960 [8]. Further, lagging exposure 40 years led to stronger estimated associations with bladder cancer for cumulative arsenic exposure (average arsenic concentration multiplied by daily intake multiplied by duration of exposure). This long latency period was similar to that estimated in the Chile study [6], suggesting that evaluating arsenic exposures at different times may be informative for understanding its temporal relationship with bladder cancer.
Two classes of statistical techniques, mixture analysis and spatial analysis, can address this accumulation of exposures over space and time. Mixture analysis has emerged as a powerful technique to assess the effects of mixtures of exposures on health outcomes. This class of methods evaluates how the totality of a mixture, often comprising multiple chemicals, may be associated with an outcome, in contrast to analyzing the chemicals separately. This is in line with the exposome concept [9, 10], which holds that health outcomes result from a variety of exposure sources accumulating over the individual life course. Five methods for mixture analysis that have been developed in recent years are weighted quantile sum (WQS) regression [11, 12], Bayesian kernel machine regression (BKMR) models [13, 14], quantile g-computation [15], functional logistic regression [16], and the Bayesian group index model [17]. While these methods vary regarding implementation and estimation details, they share a goal of estimating the health effect of a mixture of environmental exposures as well as how each exposure varies in its contribution to the mixture effect. Additionally, mixture analysis can consider the timing of mixture exposures in relation to health outcomes. Recent applications of WQS and the Bayesian index model have assessed how exposure to mixtures of metals at different times is associated with rapid visual processing in children [18] and how exposure to neighborhood deprivation at historic time lags is associated with risk for non-Hodgkin lymphoma [19], respectively. These methods have extended distributed lag models [20, 21] to identify critical temporal exposure windows for mixtures rather than for single chemicals. Identifying important temporal windows from time-varying mixtures provides additional information regarding how exposures influence health over time.
Statistical analyses of diseases with long latencies such as bladder cancer have begun to use residential histories to assess how cumulative spatial exposure to potentially unmeasured risk factors may also associate with disease risk. This approach embodies the exposome framework as well by considering accumulating unmeasured exposures for which residential histories are proxies. For example, the convolution multiple membership model (MMM) aggregates study participants’ residential histories to larger administrative units and has been applied to model cumulative spatial risk for mesothelioma [22]. Also, the low-rank kriging (LRK) MMM uses the precise points in space from residential histories to estimate cumulative spatial risk and has been applied to case–control analyses of bladder cancer [23] and non-Hodgkin lymphoma (NHL) [24, 25]. The applications of this model to the NHL data also included weighted indices of environmental exposures to estimate mixture effects, the importance of each component in the mixture, and cumulative spatial risk simultaneously in a model termed the Bayesian index LRK-MMM. These models can identify geographic areas where residence is associated with elevated risk for disease and can motivate follow-up investigations into the causes of such risk, such as previously unmeasured environmental exposures.
Following the above developments in mixture and spatial modeling, we conducted an analysis in the NEBCS using historic arsenic exposure measurements and residential histories to estimate the associations between arsenic and bladder cancer risk over time. We used the Bayesian index LRK-MMM owing to its ability to flexibly estimate mixture effects and importance weights for components in the mixture while also estimating cumulative spatial risk for disease from residential histories. A primary goal of our analysis was to assess whether modeling arsenic exposures as a time-varying mixture provided better fit to the NEBCS data than modeling a summary measure similar to those used previously [8]. Our hypothesis was that modeling time-varying arsenic exposures as a weighted index would provide better model fit. We considered annual arsenic exposures over many different windows extending up to 60 years prior to study entry, which would provide insight regarding the importance of arsenic exposures at different times for bladder cancer risk. We also estimated the residual cumulative spatial risk surface to evaluate whether any areas in the study region conferred significantly elevated cumulative spatial risk for bladder cancer that was unexplained by arsenic exposures or other covariates.
2 Methods
2.1 Study Population
The New England Bladder Cancer Study (NEBCS) is a population-based case–control study in Maine, New Hampshire, and Vermont. Rates of bladder cancer incidence and mortality have been elevated in this region for decades, and the NEBCS sought to identify risk factors responsible for the elevated incidence. The details of this study have been described previously [8, 26]. Briefly, cases were all newly diagnosed cases of bladder cancer between 2001 and 2004 among residents of the study region, and controls were randomly selected from driver’s license registration (for controls under 65 years) or Centers for Medicare and Medicaid Services (for controls 65–79 years) records. Controls were frequency matched to cases by state, sex, and approximate age at diagnosis. There was a 65% participation rate in the study among both cases and controls. Residential histories were collected through in-person interviews with each subject that used a standardized questionnaire. In our analysis, we used the residential histories of long-term residents who had been living in Maine, New Hampshire, or Vermont for a minimum of 25 years prior to diagnosis (reference date for controls) (500 cases and 602 controls, which was 43% of the interviewed and eligible totals for both cases and controls), and considered residential locations for subjects from 1970 to 1986, to reflect a latency period that has been estimated for bladder cancer occupational carcinogens [27]. We made this restriction to focus on historic exposures within the study region.
We adjusted our models for smoking status (former, current/occasional, don’t know, versus reference never), high-risk occupation [28] (ever/never), sex, age group (<55, 55–64, 65–74, or 75+), race, French-Canadian ancestry, ethnicity (Hispanic or don’t know versus not Hispanic), educational attainment (high school degree, vocational or some college, college degree, postgraduate versus less than high school degree), cumulative total trihalomethanes (THM) intake from age 15+, average daily nitrate intake from public water supplies and private wells in 1970 or later, and drinking from a shallow dug well in the study area before 1960. Arsenic exposures were estimated at participant residential locations using drinking water sampling and statistical modeling, the details of which have been described previously [29]. In Table 1 we summarize the characteristics of the analysis population, as well as of a subset that had complete arsenic exposure measurements for the 60 years prior to study entry. This involved first excluding 85 cases and 95 controls who were less than 60 years old and then excluding 279 cases and 350 controls without complete estimated arsenic exposures over this time period, meaning that 33% of cases and 31% of controls who were 60 years or older had complete arsenic exposure histories. Overall, this sub-sample comprised 27% of case long-term residents and 26% of control long-term residents.
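The sub-sample sizes are not stated directly in the text above. Assuming the two exclusions are applied sequentially to the 500 case and 602 control long-term residents, they can be recovered with the following arithmetic, which also reproduces the reported percentages:

\[\text{Cases: } 500-85=415,\quad 415-279=136,\quad 136/415\approx 33\%,\quad 136/500\approx 27\%\]

\[\text{Controls: } 602-95=507,\quad 507-350=157,\quad 157/507\approx 31\%,\quad 157/602\approx 26\%\]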
2.2 Model Specification
We considered three classes of models to assess the associations of arsenic exposure and cumulative unmeasured exposures with bladder cancer risk using the Bayesian index low-rank kriging multiple membership model [24] (LRK-MMM), a type of hierarchical Bayesian regression model. We chose to use this model due to its ability to accommodate measured mixtures and cumulative spatial risk using residential histories. The three classes of models we used varied in their specification and timing of arsenic exposures and are given below. For all models we assumed a binary outcome variable \({Y}_{i}\) denoting case membership that was distributed as a Bernoulli random variable with probability of being a case \({p}_{i}\), and we modeled the log-odds of the probability of being a case.
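In Class 0,
\[\text{logit}\left({p}_{i}\right)={\upbeta }_{0}+{\upbeta }_{1}{{\text{Arsenic}}}_{i}+{x}_{i}^{\prime}\theta +\sum_{{s}_{ij}\in {A}_{\left(i\right)}}{w}_{ij}\sum_{m=1}^{{n}_{K}}C\left[\left|{s}_{ij}-{k}_{m}\right|/\uprho \right]{\uppsi }_{m}.\]
In Class 1,
\[\text{logit}\left({p}_{i}\right)={\upbeta }_{0}+{\upbeta }_{1}\sum_{t=1}^{T}{\upomega }_{t}{q}_{it}+{x}_{i}^{\prime}\theta +\sum_{{s}_{ij}\in {A}_{\left(i\right)}}{w}_{ij}\sum_{m=1}^{{n}_{K}}C\left[\left|{s}_{ij}-{k}_{m}\right|/\uprho \right]{\uppsi }_{m}.\]
In Class 2,
\[\text{logit}\left({p}_{i}\right)={\upbeta }_{0}+\left({\upbeta }_{1}+{\upbeta }_{2}*{{\text{Dug Well}}}_{i}\right)\sum_{t=1}^{T}{\upomega }_{t}{q}_{it}+{x}_{i}^{\prime}\theta +\sum_{{s}_{ij}\in {A}_{\left(i\right)}}{w}_{ij}\sum_{m=1}^{{n}_{K}}C\left[\left|{s}_{ij}-{k}_{m}\right|/\uprho \right]{\uppsi }_{m}.\]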
Several components are common to all models. Namely, \({\upbeta }_{0}\) is an intercept term, and we control for covariate vector \({x}_{i}\) with coefficient vector \(\theta\). Additionally, we estimate cumulative spatial risk with the right-most term in all models. For the ith participant, define the set of all locations at which they have lived in their residential history to be \({A}_{\left(i\right)}\). For the jth location \({s}_{ij}\), the spatial risk they experience at that location is estimated with a sum of spatial random effects \(\uppsi\) weighted by the spatial covariance \(C\left[\cdot \right]\) between the residential location and the knot locations where the spatial random effects are evaluated (\({k}_{m},m=1,\dots ,{n}_{K}\)). We use the Matern covariance function that simplifies to \(C\left[d\right]=\left(1+d/\uprho \right){e}^{-d/\uprho }\) for distance \(d\) and spatial range parameter \(\uprho\) when fixing the Matern family parameters \(m\) and \(\upnu\) to \(1\) and \(3/2\), respectively. We choose to use the Matern family of covariance functions in our models due to its popularity in geostatistical models [30, 31], flexibility, and smoothness with respect to distance. Finally, the term \({w}_{ij}\) represents the proportion of the study period that the ith participant resided at their jth location, so \({\sum }_{j=1}^{J}{w}_{ij}=1\).
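To make the cumulative spatial risk term concrete, the following Python sketch evaluates the covariance function \(C\left[d\right]\) and the residence-weighted sum of spatial random effects for one participant. It is an illustration only and not the code used in our analysis; the function names, coordinates, random effect values, and range parameter are all hypothetical.

```python
import numpy as np

def matern_cov(d, rho):
    """Matern covariance with smoothness 3/2: C[d] = (1 + d/rho) * exp(-d/rho)."""
    d = np.asarray(d, dtype=float)
    return (1.0 + d / rho) * np.exp(-d / rho)

def cumulative_spatial_risk(residences, weights, knots, psi, rho):
    """Residence-time-weighted sum of knot-based spatial effects for one participant.

    residences: (J, 2) coordinates of the participant's residential locations
    weights:    (J,) proportions of the study period at each residence (sum to 1)
    knots:      (n_K, 2) knot coordinates
    psi:        (n_K,) spatial random effects evaluated at the knots
    """
    residences = np.asarray(residences, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Pairwise Euclidean distances between residences and knots, shape (J, n_K)
    dists = np.linalg.norm(residences[:, None, :] - knots[None, :, :], axis=2)
    # Spatial risk at each residence: covariance-weighted sum of the random effects
    per_location = matern_cov(dists, rho) @ np.asarray(psi, dtype=float)
    return float(np.dot(weights, per_location))

# Toy inputs (hypothetical): two knots 10 km apart, two residences located at the knots
knots = np.array([[0.0, 0.0], [10.0, 0.0]])
psi = np.array([0.5, -0.2])
residences = np.array([[0.0, 0.0], [10.0, 0.0]])
weights = np.array([0.25, 0.75])
risk = cumulative_spatial_risk(residences, weights, knots, psi, rho=5.0)
```

In this toy case the participant spends most of the study period near the knot with the negative random effect, so the cumulative risk on the log-odds scale is pulled toward zero relative to the first residence alone.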
The specification of the arsenic term is different in each class of models. In Class 0, we estimate the association of a summary arsenic exposure (cumulative arsenic intake from residential and workplace water lagged 40 years (meaning a sum of these arsenic exposures ending 40 years before study entry), or average arsenic concentration from residential and workplace water) with bladder cancer risk using regression coefficient \({\upbeta }_{1}\). In Class 1, we replace the summary arsenic exposure with a time-varying mixture of arsenic exposures. Over a time period of T years, the quantized arsenic exposure for the ith individual in the tth year is \({q}_{it},t=1,\dots ,T.\) We quantized the arsenic exposures to accommodate high correlations between arsenic measurements over time and to account for uncertainty in the measurements, using quartiles (four groups) in all models. The importance weight for the tth year in the mixture was \({\upomega }_{t}\), and \({\sum }_{t=1}^{T}{\upomega }_{t}=1\) for interpretability. We used a variety of windows of temporal exposure for the arsenic mixture (Table 2). In Class 2, we used the same set of temporal windows as in Class 1 but added an additional interaction term (\({\upbeta }_{2}*{{\text{Dug Well}}}_{i}\)) to evaluate how the health effect of the arsenic mixtures could vary for study participants who had consumed water from shallow dug wells. We included this class of models due to a previous finding that water intake was significantly associated with bladder cancer risk among study participants who drank from shallow dug wells prior to 1960 [8]. Exponentiating the regression coefficients \({\upbeta }_{j},j=\mathrm{1,2}\) gives the odds ratio for bladder cancer for the arsenic exposures.
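The weighted index can be illustrated with a short Python sketch. It assumes quartile scores coded 0 to 3, which is a common convention in weighted quantile sum approaches but is an assumption here, and it uses fixed, made-up importance weights, whereas in our models the weights are estimated. The function names and data are hypothetical.

```python
import numpy as np

def quartile_scores(exposures):
    """Score each year's exposures (columns) into quartiles coded 0, 1, 2, 3."""
    exposures = np.asarray(exposures, dtype=float)
    scores = np.empty(exposures.shape, dtype=int)
    for t in range(exposures.shape[1]):
        # Year-specific quartile cut points
        cuts = np.quantile(exposures[:, t], [0.25, 0.50, 0.75])
        scores[:, t] = np.searchsorted(cuts, exposures[:, t], side="left")
    return scores

def arsenic_index(scores, omega):
    """Weighted index sum_t omega_t * q_it for every participant."""
    omega = np.asarray(omega, dtype=float)
    if not np.isclose(omega.sum(), 1.0):
        raise ValueError("importance weights must sum to 1")
    return scores @ omega

# Toy data (hypothetical): 8 participants, T = 2 exposure years
exposures = np.column_stack([np.arange(1.0, 9.0), np.arange(8.0, 0.0, -1.0)])
omega = np.array([0.7, 0.3])  # made-up weights; estimated in the actual models
scores = quartile_scores(exposures)
index = arsenic_index(scores, omega)
```

Because the weights sum to 1, the index stays on the same 0 to 3 scale as the quartile scores, so the exponentiated coefficient can be read as the odds ratio per one-quartile increase in the index.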
Some study participants had missing arsenic exposure estimates at certain time lags, and increasingly so many decades prior to study entry. In the case of missing arsenic exposures, we imputed missing values from a log-normal distribution using the non-missing logged estimated arsenic measurements in the given year. In a sensitivity analysis, we fit all models described above to a subset of the analysis sample who had complete arsenic exposure estimates for the 60 years prior to study entry. We conducted the sensitivity analysis to evaluate the impact of imputing missing arsenic measurements on model estimated odds ratios for the arsenic mixtures and on the importance of historic arsenic exposures.
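One way such an imputation could be carried out is sketched below in Python. The sketch assumes that the log-normal distribution for a given year is parameterized by the mean and standard deviation of the logged non-missing values in that year; this parameterization, the function name, and the data are illustrative assumptions rather than a description of our exact procedure.

```python
import numpy as np

def impute_lognormal(values, rng):
    """Fill NaNs in one year's exposures with draws from a log-normal distribution
    fitted to the logged non-missing values of that year."""
    values = np.asarray(values, dtype=float).copy()
    missing = np.isnan(values)
    logs = np.log(values[~missing])
    mu, sigma = logs.mean(), logs.std(ddof=1)
    # Draw one imputed value per missing entry
    values[missing] = rng.lognormal(mean=mu, sigma=sigma, size=int(missing.sum()))
    return values

# Toy data (hypothetical arsenic estimates in micrograms per liter for one year)
rng = np.random.default_rng(2024)
year_values = np.array([0.5, 1.2, np.nan, 3.0, np.nan, 0.8])
completed = impute_lognormal(year_values, rng)
```

Drawing on the log scale keeps all imputed values positive and preserves the right skew typical of environmental concentration data.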
2.3 Knot Selection
The Bayesian index LRK-MMM reduces the dimensionality of the spatial risk model component through a set of \({n}_{K}\) knots, which are locations where the spatial random effects are estimated. This requires specification of the number and location of the knots. Previous research has suggested that the Teitz and Bart heuristic [32], originally developed to address the location-allocation problem, chooses knot locations in a way that enables good sensitivity and power to detect regions of elevated spatial risk for disease [33]. Therefore, we use this heuristic to choose knot locations in our models. Briefly, Teitz and Bart begins with a random set of knot locations and iteratively changes knot location points to candidate ones if doing so decreases the objective function of the total distance between cases and their nearest knot location. Additionally, we use \({n}_{K}=60\) knots in all models for comparability with a previous LRK-MMM analysis of NEBCS data [23].
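The swap logic described above can be sketched in Python as follows. This is a simplified first-improvement variant of the Teitz and Bart vertex substitution heuristic, written for illustration; it is not the implementation used in our analysis, and the function names, candidate set, and coordinates are hypothetical.

```python
import numpy as np

def total_distance(cases, knots):
    """Objective: sum over cases of the distance to the nearest knot."""
    d = np.linalg.norm(cases[:, None, :] - knots[None, :, :], axis=2)
    return d.min(axis=1).sum()

def teitz_bart(cases, candidates, n_knots, rng):
    """Choose n_knots rows of `candidates` by repeated improving swaps."""
    idx = [int(i) for i in rng.choice(len(candidates), size=n_knots, replace=False)]
    best = total_distance(cases, candidates[idx])
    improved = True
    while improved:
        improved = False
        for c in range(len(candidates)):
            if c in idx:
                continue
            for pos in range(n_knots):
                trial = idx.copy()
                trial[pos] = c  # swap one current knot for the candidate
                obj = total_distance(cases, candidates[trial])
                if obj < best - 1e-12:  # accept only strict improvements
                    idx, best, improved = trial, obj, True
                    break
    return candidates[idx], best

# Toy data (hypothetical): two clusters of cases and four candidate knot sites
cases = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
candidates = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 5.0], [20.0, 20.0]])
knots, objective = teitz_bart(cases, candidates, n_knots=2, rng=np.random.default_rng(0))
```

The heuristic stops at a local optimum, so in practice results can depend on the random starting set; in this small example every starting set leads to the two candidate sites at the cluster centers.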
2.4 Model Fitting and Evaluation
We fitted models in a Bayesian framework using Markov chain Monte Carlo (MCMC) methods. We specified the following priors: for the adjustment covariates, \({\uptheta }_{b}\sim Normal\left(0,{\uptau }_{b}\right)\), where \({\uptau }_{b}=1/{\upsigma }_{b}^{2}\) and \({\upsigma }_{b}\sim Uniform\left(\mathrm{0,100}\right)\). The intercept and arsenic exposure coefficients received a similar prior \({\upbeta }_{j}\sim Normal\left(0,{\uptau }_{j}\right),\) with \({\uptau }_{j}=1/{\upsigma }_{j}^{2}\) and \({\upsigma }_{j}\sim Uniform\left(\mathrm{0,100}\right),j=\mathrm{0,1},2\). The arsenic index importance weight vector \({\varvec{\upomega}}\) received a Dirichlet prior with parameter vector \({\varvec{\upalpha}}=\left({\mathrm{\alpha }}_{1},\dots ,{\mathrm{\alpha }}_{T}\right)\) to assure that each weight in the index was between 0 and 1 and \({\sum }_{t=1}^{T}{\upomega }_{t}=1\). The spatial random effect vector \(\uppsi\) received a multivariate normal prior \(MVN\left(0,{\uptau }_{S}{{\varvec{\Omega}}}^{-1}\right)\), with covariance matrix \({\varvec{\Omega}}=\left[C\left[\left|{k}_{a}-{k}_{b}\right|/\uprho \right]\right]\), \({\uptau }_{S}=1/{\upsigma }_{S}^{2},\) and \({\upsigma }_{S}\sim Uniform\left(\mathrm{0,100}\right)\). Finally, the spatial range parameter received a uniform prior on the range \(\left(0,{d}_{max}\right)\), where \({d}_{max}\) represents the maximum distance between a knot location and a residential location.
For model estimation, we used Just Another Gibbs Sampler (JAGS) [34] in the software R, version 4.1.0 [35], with two chains that each had a burn-in period of 60,000 iterations and retained 40,000 iterations for sampling from the joint posterior distribution. We monitored convergence of model parameters with the Gelman-Rubin statistic [36], computed using the coda package [37] in R, and considered a parameter to have converged if its statistic was less than 1.1. We compared model goodness of fit using the deviance information criterion (DIC), which is a common method to compare model fit that penalizes model complexity [38]. Smaller DIC values indicate a better fit to the data, and differences of greater than 5 in DIC may indicate meaningfully better fitting models [39]. We summarized associations with the arsenic exposures using the posterior mean and 95% credible interval for the odds ratio.
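The convergence criterion can be illustrated with a basic version of the Gelman-Rubin potential scale reduction factor for a single parameter. The Python sketch below uses the textbook within-chain and between-chain variance formula; software packages may apply additional corrections, so values need not match those from coda exactly. The function name and simulated chains are hypothetical.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: array of shape (n_chains, n_iterations) of post-burn-in samples.
    """
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()       # mean within-chain variance W
    between = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance B
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return float(np.sqrt(pooled / within))

# Simulated chains (hypothetical): two well-mixed chains and two chains stuck apart
rng = np.random.default_rng(1)
mixed = rng.normal(size=(2, 5000))
stuck = mixed + np.array([[0.0], [5.0]])
r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

Chains sampling the same distribution give a statistic close to 1, whereas chains centered on different values give a statistic well above the 1.1 threshold.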
We assessed spatial risk over a 6 km by 6 km grid covering the study region, predicting spatial risk at each grid cell with the posterior estimates of the spatial random effects at the knot locations and the covariance function between the grid cell and the knot locations. We identified grid cells as being significantly elevated or lowered in risk using exceedance probabilities [40], which estimate how frequently the spatial odds at a location (\({\uptheta }_{i}\)) exceed the null value (\({\uptheta }_{i}=1\)). The exceedance probabilities use the posterior distribution of spatial odds at the ith location (\({\uptheta }_{i,m+1},\dots ,{\uptheta }_{i,m+G}\)), where m represents the burn-in and G represents the number of posterior samples after the burn-in, and are calculated as \(\hat{q}_{i} = \frac{1}{G}\sum\nolimits_{g = m + 1}^{m + G} I\left( {\uptheta }_{i,g} > 1 \right)\). We determined significance of spatial risk using 95% exceedance probabilities.
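The exceedance probability is simply the proportion of retained posterior samples in which the spatial odds exceed 1. The Python sketch below illustrates the calculation for a small set of made-up samples. The rule of flagging cells as elevated when the probability exceeds 0.95 and as lowered when it falls below 0.05 is our reading of the 95% threshold and is labeled here as an assumption; the function name and values are hypothetical.

```python
import numpy as np

def exceedance_probability(odds_samples, null_value=1.0):
    """Proportion of posterior samples above the null value at each grid cell.

    odds_samples: array of shape (G, n_cells) of post-burn-in spatial odds.
    """
    odds_samples = np.asarray(odds_samples, dtype=float)
    return (odds_samples > null_value).mean(axis=0)

# Toy posterior samples (hypothetical): G = 4 samples for 2 grid cells
odds_samples = np.array([
    [1.2, 0.8],
    [1.5, 0.9],
    [1.1, 1.3],
    [0.9, 0.7],
])
q_hat = exceedance_probability(odds_samples)
elevated = q_hat > 0.95  # assumed flagging rule for significantly elevated risk
lowered = q_hat < 0.05   # assumed flagging rule for significantly lowered risk
```

With only four samples neither cell is flagged; in the analysis each cell's probability is based on the full set of retained posterior samples.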
3 Results
According to the DIC values, modeling the effect of arsenic exposure as time-varying rather than as a single summary measure consistently provided better model fit to the data (Table 3). Whereas the DIC values for the Class 0 models with summary arsenic measures were 1439 and 1440, the DIC values for the Class 1 time-varying arsenic exposure index models were between 1431 and 1436. Fourteen of the 20 Class 1 models provided considerably better model fit than the Class 0 models. Including the additional interaction term for the dug well drinkers in the Class 2 models provided even better model fit. The DIC values for models in Class 2 ranged from 1424 to 1434, meaning that 19 of the 20 models fit considerably better than the Class 0 models.
Comparing the estimated odds ratios for the arsenic exposure index provides additional insight into the health effect of arsenic exposure at different times on bladder cancer risk. In the Class 1 models, the odds ratio for the arsenic exposure index ranged from 1.03 to 1.14 (Fig. 1). The four best fitting models in this class included arsenic exposures up to 30 years (one model), 45 years (one model), and 60 years (two models) prior to study entry. In the Class 2 models, the odds ratio for the arsenic index ranged from 1.28 to 1.60 among dug well drinkers and was significantly elevated for 16 of the 20 models (Fig. 2). These results provide consistent evidence for the association between historic time-varying arsenic exposures and bladder cancer risk among those who drank water from shallow dug wells. The six best fitting models in this class included four models with arsenic exposures up to 60 years before study entry and two models with arsenic exposures in the 15 years prior to study entry.
Evaluating the estimated importance weights in different indices supports the variation in the importance of arsenic exposures at different times for bladder cancer risk (Fig. 3). In the Class 1 models, there were several elevated importance weights at time lags between 60 and 45 years before study entry for several models, and in three of the four models the largest estimated importance weight corresponded to arsenic exposure at least 45 years prior to study entry. For the Class 2 models, there were many elevated importance weights between 60 and 45 years prior to study entry as well as a few elevated importance weights in the 10 years before study entry.
The sensitivity analysis found that results changed little when including only study participants having 60 years of arsenic exposures. Relative to the best fitting Class 0 models, the Class 1 models provided improvements in DIC of 10–22, and the Class 2 models provided improvements of 14–33 (Supplemental Material Table S1). In the Class 1 models, the estimated odds ratio for the arsenic exposure index was close to the null and not significant for any model (Supplemental Material Fig. S1). In the Class 2 models, the estimated odds ratio for the arsenic exposure index among dug well drinkers was elevated for all models and significant for 16 of 20 models (Supplemental Material Figure S2). Therefore, the sensitivity analysis suggests that imputing missing arsenic exposures did not bias conclusions in the main analysis.
After modeling the time-varying arsenic exposure mixtures and adjusting for individual-level covariates, there was little additional variation in the cumulative spatial risk surface. In the main analysis, nowhere in the study region constituted a clustering of significantly elevated cumulative spatial risk. In the Class 1 models from the sensitivity analysis, three models identified a small circular region of elevated risk in southern New Hampshire near Rochester that had approximate radius 15 km, and one of these models identified a region of similar size in south central Maine near Augusta (Fig. 4). In the Class 2 models from the sensitivity analysis, five models identified the region in southern New Hampshire, and four of these models identified the region in south central Maine (Fig. 5).
4 Discussion/Conclusion
In this study, we estimated the associations between mixtures of historic arsenic exposures and bladder cancer risk in the NEBCS. Our analytical approach used the Bayesian index LRK-MMM, which simultaneously estimates cumulative spatial risk using residential histories and mixture effects. We compared the goodness of fit of these models to those that used a 40-year lagged cumulative measure of arsenic exposure in addition to estimating cumulative spatial risk (standard LRK-MMMs). We found that modeling arsenic exposures as a time-varying mixture consistently provided better goodness of fit than using the summary measure. In these time-varying mixtures, we estimated many positive though not significant associations with bladder cancer risk. However, an interaction term for the arsenic exposure mixture among dug well drinkers was positively and significantly associated with bladder cancer risk in almost every model fit, which provides insight into how arsenic exposures affect cancer risk differentially over time and by source of drinking water. In these models, arsenic exposures between 45 and 60 years before study entry received large importance weights in the mixture, and a few models estimated larger importance weights for more recent exposures. These results communicate the complexity and time-varying nature of arsenic exposures for bladder cancer risk. Additionally, modeling time-varying arsenic exposures explained one area of elevated spatial risk for bladder cancer identified from a previous analysis that used LRK-MMMs and a summary measure of arsenic exposure [23]. We found that our results changed little when restricting the analysis sample to include only residents with complete arsenic exposure histories in contrast to the imputation we performed in the main analysis. We also identified small areas of elevated spatial risk for bladder cancer in the sensitivity analysis that could warrant additional hypothesis generation and analyses.
Findings in our study contribute to the literature on arsenic and bladder cancer. Though arsenic has been recognized as a carcinogen for bladder cancer [5], less is known regarding the exact timing of the relationship between exposure and diagnosis. A previous analysis of arsenic in the NEBCS found that lagging arsenic exposure 40 years provided stronger and significant associations with bladder cancer risk compared with shorter exposure lags [8]. This latency was approximately equal to that found in the prospective study in Chile that identified significantly elevated risks for bladder cancer up to 40 years after exposure reduction [6, 41]. One notable difference between the sample in the Chile study and the NEBCS is that many residents in the former study were exposed to very high levels of arsenic in the earliest decades of the exposure range (1950s to 1970s). In contrast, most participants in the NEBCS were exposed to low-to-moderate levels of arsenic that were lower than for many participants in the Chile study. The large estimated importance weights in many of our models for arsenic exposures occurring 45–50 years before study entry (Fig. 3) suggest a somewhat longer latency period for these more moderate exposures.
There are several strengths to our study. First is its use of complete residential histories and historic arsenic exposure estimates [29] that enable assessment of how historic unmeasured and estimated exposures respectively influence risk for bladder cancer, a disease with a long latency period. Modeling exposure over the life course requires a large effort [42, 43]. Our results illustrate the importance of estimating these historic exposures. Additionally, our analytical framework employed the Bayesian index LRK-MMM, which developed from the Bayesian index model and has demonstrated the ability to accurately and consistently estimate mixture effects as well as high power to detect regions of elevated unmeasured spatial risk for disease [17, 24]. Finally, we considered a wide range of exposure windows for the time-varying arsenic mixtures to assess how exposures at different times vary in importance for explaining bladder cancer risk. The methods we used in this analysis can motivate future studies that model the associations of time-varying exposures with disease risk.
The limitations of our study should also be considered. First, although the historic arsenic exposure measurements were a product of extensive water sampling, characterization of the geology of the aquifers, and statistical modeling, there is a possibility that the arsenic estimates may not have represented the true exposures of some participants, particularly for certain historical residential locations. For a subset of drilled wells, this method of arsenic exposure assignment offered only moderate agreement, sensitivity, and specificity for a binary classification of less or greater than 2 μg/L of arsenic [29]. Thus, it is possible that the variation in the arsenic exposure estimates could have led to non-differential exposure misclassification and a subsequent underestimation of true risk [44]. Second, the prevalence of use of arsenical pesticides and dug wells decreased from the mid-twentieth century to the present [8, 45]. Therefore, these factors may be less relevant for future studies and populations. Third, it is possible that some other water contaminants demonstrated different time-varying associations and strengths of association with bladder cancer risk. A previous analysis of data in this study found that high exposures to drinking water nitrates (average concentration above the 95th percentile) were associated with elevated risk for bladder cancer [46]. It is possible that exposures at different times to this contaminant could be important for bladder cancer risk. Fourth, nonparticipation occurred in this study, with a 65% response rate for both case and control subjects. Therefore, it is possible that selection bias could have impacted our findings. However, at least regarding rurality and well use, this does not appear to have been an issue in this study.
For both cases and controls there was a similar proportion of participants and nonparticipants living outside a designated census place, which is a common indicator for having a private well [8], so nonparticipation does not appear to have influenced the composition of the sample with regard to this important factor for bladder cancer in this region. Additionally, while this is not necessarily a limitation, we note that we used one statistical method, the Bayesian index LRK-MMM, given the nature of our data and goal to estimate mixture effects and spatial risk simultaneously. It is possible that other statistical methods could provide new insights into arsenic exposures and bladder cancer. Finally, we are unable to establish causality due to the retrospective nature of the study, and despite adjusting for many demographic, occupational, and environmental covariates, we cannot rule out the possibility of residual confounding.
In conclusion, we found that modeling historic arsenic exposures as a time-varying mixture provided better fit to the data than modeling a single summary lagged exposure in a population-based case–control study of bladder cancer in New England. We estimated significant and positive associations for arsenic mixtures and bladder cancer among dug well drinkers across several historic windows of exposure. We found evidence for the importance of arsenic exposures 40–50 years before study entry in these time window mixtures. Our results support the importance of historic environmental exposures particularly in the context of diseases with long latency periods. Although the prevalence of dug wells and arsenical pesticides has decreased over time, our results provide additional evidence for the latency period between arsenic and bladder cancer and motivate the use of time-varying mixtures in future health research. Future studies should continue to investigate how geospatial analysis and exposure assignment can benefit health research, particularly over historic time periods.
Data Availability
The data that support the findings of this study are available on request from the author Dr. Debra Silverman. The data are not publicly available due to privacy or ethical restrictions.
References
Cumberbatch MGK, Jubber I, Black PC et al (2018) Epidemiology of bladder cancer: a systematic review and contemporary update of risk factors in 2018. Eur Urol 74(6):784–795
Siegel RL, Miller KD, Wagle NS, Jemal A (2023) Cancer statistics, 2023. CA Cancer J Clin 73(1):17–48
Freedman ND, Silverman DT, Hollenbeck AR, Schatzkin A, Abnet CC (2011) Association between smoking and risk of bladder cancer among men and women. JAMA 306(7):737–745
Thun M, Linet MS, Cerhan JR, Haiman CA, Schottenfeld D (eds) (2017) Cancer epidemiology and prevention. Oxford University Press
IARC (2023) Agents classified by the IARC monographs, vol 1–133. International Agency for Research on Cancer, May 5. https://monographs.iarc.who.int/agents-classified-by-the-iarc/. Accessed 5 July 2023
Marshall G, Ferreccio C, Yuan Y et al (2007) Fifty-year study of lung and bladder cancer mortality in Chile related to arsenic in drinking water. J Natl Cancer Inst 99(12):920–928
National Cancer Institute. Cancer Mortality Maps. http://ratecalc.cancer.gov
Baris D, Waddell R, Beane Freeman LE et al (2016) Elevated bladder cancer in Northern New England: the role of drinking water and arsenic. J Natl Cancer Inst. https://doi.org/10.1093/jnci/djw099
Wild CP (2005) Complementing the genome with an “exposome”: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol Prev Biomarkers 14(8):1847–1850
Wild CP (2012) The exposome: from concept to utility. Int J Epidemiol 41(1):24–32
Carrico C, Gennings C, Wheeler DC, Factor-Litvak P (2015) Characterization of weighted quantile sum regression for highly correlated data in a risk analysis setting. J Agric Biol Environ Stat 20(1):100–120. https://doi.org/10.1007/s13253-014-0180-3
Czarnota J, Gennings C, Wheeler DC (2015) Assessment of weighted quantile sum regression for modeling chemical mixtures and cancer risk. Cancer Inf 14:CIN.S17295
Liu SH (2016) Statistical methods for estimating the effects of multi-pollutant exposures in children’s health research. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences
Liu SH, Bobb JF, Lee KH et al (2018) Lagged kernel machine regression for identifying time windows of susceptibility to exposures of complex mixtures. Biostatistics 19(3):325–341
Keil AP, Buckley JP, O’Brien KM, Ferguson KK, Zhao S, White AJ (2020) A quantile-based g-computation approach to addressing the effects of exposure mixtures. Environ Health Perspect 128(4):47004
Wei P, Tang H, Li D (2014) Functional logistic regression approach to detecting gene by longitudinal environmental exposure interaction in a case-control study. Genet Epidemiol 38(7):638–651
Wheeler DC, Rustom S, Carli M, Whitehead TP, Ward MH, Metayer C (2021) Bayesian group index regression for modeling chemical mixtures and cancer risk. Int J Environ Res Public Health 18(7):3486
Levin-Schwartz Y, Gennings C, Schnaas L et al (2019) Time-varying associations between prenatal metal mixtures and rapid visual processing in children. Environ Health 18(1):1–12
Boyle J, Ward MH, Cerhan JR, Rothman N, Wheeler DC (2023) Modeling historic neighborhood deprivation and non-Hodgkin lymphoma risk. Environ Res (under review). Published online.
Wang Q, Benmarhnia T, Zhang H et al (2018) Identifying windows of susceptibility for maternal exposure to ambient air pollution and preterm birth. Environ Int 121:317–324
Darrow LA, Klein M, Strickland MJ, Mulholland JA, Tolbert PE (2011) Ambient air pollution and birth weight in full-term infants in Atlanta, 1994–2004. Environ Health Perspect 119(5):731–737. https://doi.org/10.1289/ehp.1002785
Petrof O, Neyens T, Nuyts V, Nackaerts K, Nemery B, Faes C (2020) On the impact of residential history in the spatial analysis of diseases with a long latency period: a study of mesothelioma in Belgium. Stat Med 39(26):3840–3866
Boyle J, Ward MH, Koutros S et al (2022) Estimating cumulative spatial risk over time with low-rank kriging multiple membership models. Stat Med 41(23):4593–4606
Boyle J, Ward MH, Cerhan JR, Rothman N, Wheeler DC (2022) Estimating mixture effects and cumulative spatial risk over time simultaneously using a Bayesian index low-rank kriging multiple membership model. Stat Med 41(29):5679–5697
Boyle J, Ward MH, Cerhan JR, Rothman N, Wheeler DC (2023) Modeling historic environmental pollutant exposures and non-Hodgkin lymphoma risk. Environ Res 224:115506
Baris D, Karagas MR, Verrill C et al (2009) A case–control study of smoking and bladder cancer risk: emergent patterns over time. J Natl Cancer Inst 101(22):1553–1561
Miyakawa M, Tachibana M, Miyakawa A et al (2001) Re-evaluation of the latent period of bladder cancer in dyestuff-plant workers in Japan. Int J Urol 8(8):423–430
Colt JS, Karagas MR, Schwenn M et al (2011) Occupation and bladder cancer in a population-based case–control study in Northern New England. Occup Environ Med 68(4):239–249
Nuckols JR, Freeman LEB, Lubin JH et al (2011) Estimating water supply arsenic levels in the New England Bladder Cancer Study. Environ Health Perspect 119(9):1279–1285
Shaddick G, Zidek JV (2014) A case study in preferential sampling: long term monitoring of air pollution in the UK. Spat Stat 9:51–65
Diggle PJ, Tawn JA, Moyeed RA (1998) Model-based geostatistics. J R Stat Soc: Ser C (Appl Stat) 47(3):299–350
Teitz MB, Bart P (1968) Heuristic methods for estimating the generalized vertex median of a weighted graph. Oper Res 16(5):955–961
Boyle J, Wheeler DC (2022) Knot selection for low-rank kriging models of spatial risk in case-control studies. Spatial Spatio-Temporal Epidemiol 41:100483
Plummer M (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Proceedings of the 3rd international workshop on distributed statistical computing, vol 124. Vienna, Austria, pp 1–10
R Core Team (2021) R: a language and environment for statistical computing. Published online.
Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7(4):457–472
Plummer M, Best N, Cowles K, Vines K (2006) CODA: convergence diagnosis and output analysis for MCMC. R News 6(1):7–11
Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc: Ser B (Stat Methodol) 64(4):583–639
MRC (2022) DIC: deviance information criteria. University of Cambridge Biostatistics Unit. https://www.mrc-bsu.cam.ac.uk/software/bugs/the-bugs-project-dic/
Richardson S, Thomson A, Best N, Elliott P (2004) Interpreting posterior relative risk estimates in disease-mapping studies. Environ Health Perspect 112(9):1016–1025
Smith AH, Marshall G, Roh T, Ferreccio C, Liaw J, Steinmaus C (2018) Lung, bladder, and kidney cancer mortality 40 years after arsenic exposure reduction. J Natl Cancer Inst 110(3):241–249. https://doi.org/10.1093/jnci/djx201
de Vuijst E, van Ham M, Kleinhans R (2016) A life course approach to understanding neighbourhood effects. IZA Discussion Paper No. 10276
Halfon N, Hochstein M (2002) Life course health development: an integrated framework for developing health, policy, and research. Milbank Quar 80(3):433–479
Cantor KP, Lubin JH (2007) Arsenic, internal cancers, and issues in inference from studies of low-level exposures in human populations. Toxicol Appl Pharmacol 222(3):252–257
D’Angelo D, Norton SA, Loiselle MC (1996) Historical uses and fate of arsenic in Maine. Water Research Institute, Sawyer Environmental Research Center, University
Barry KH, Jones RR, Cantor KP et al (2020) Ingested nitrate and nitrite and bladder cancer in Northern New England. Epidemiology 31(1):136–144. https://doi.org/10.1097/EDE.0000000000001112
Funding
Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award No. U01CA259376. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Boyle, J., Ward, M.H., Koutros, S. et al. Modeling Historic Arsenic Exposures and Spatial Risk for Bladder Cancer. Stat Biosci 16, 377–394 (2024). https://doi.org/10.1007/s12561-023-09404-7
DOI: https://doi.org/10.1007/s12561-023-09404-7