Introduction

In the mid-1990s, Giovannucci and McKeown-Eyssen independently hypothesized an etiological role for insulin in colorectal carcinogenesis [1, 2]. Mechanistically, they proposed that the feature unifying diabetes and colorectal cancer was hyperinsulinemia. Insulin, which is chronically elevated in the pre-diabetic condition and in diabetes before pancreatic exhaustion, stimulates growth of both normal colonic and carcinoma cells [3–5] and, perhaps more importantly, modulates IGF-I and its binding proteins in such a way as to create a pro-mitotic environment for colonic epithelial cells [3–5].

The hypothesized mechanistic link involving hyperinsulinemia suggests that diabetes itself should be a risk factor for colorectal cancer. Initially, there was limited direct epidemiological evidence to support the contention that diabetes increased risk of colorectal cancer [6], although recent studies have begun to provide a solid empirical foundation for this hypothesis [7–23]. But if the key feature of diabetes that identifies it as a risk factor for colorectal cancer is in fact hyperinsulinemia, then subjects who have not yet been diagnosed with diabetes per se, but who are nonetheless insulin resistant and therefore hyperinsulinemic, should also be at increased risk of colorectal cancer. Furthermore, subjects who have been diabetic for many years are more likely to have reached the point of pancreatic exhaustion and therefore might not be experiencing current hyperinsulinemia [24]. In this situation, including long-term diabetics in the exposed category might result in misclassification with respect to the hypothesized biological mechanism linking diabetes and colorectal cancer.

We analyzed data from a large prospective cohort of US women to determine first whether diabetes was a risk factor for colorectal cancer in this population and second whether accounting for the time dynamics of diabetes exposure would change the risk estimates for colorectal cancer and inform our understanding of the mechanisms linking diabetes and colorectal cancer. It is our hypothesis that hyperinsulinemia is the key etiologic factor linking diabetes to colorectal cancer and as such diabetes should show a time-dependent relationship to risk of colorectal cancer.

Materials and methods

Study population

The Breast Cancer Detection Demonstration Project (BCDDP) was a breast-cancer-screening program that ran from 1973 to 1980 and enrolled 283,222 women at 29 screening centers in 27 cities across the United States. In 1979, the National Cancer Institute established a prospective cohort from among the subjects in the larger BCDDP study, as described previously [25]. Briefly, a total of 64,182 women were selected for entry into the follow-up cohort. Of that number, 61,431 women (96%) completed the baseline questionnaire (administered between 1979 and 1981) and were therefore eligible for further participation in the study. The Institutional Review Board of the National Cancer Institute approved the study, and all subjects provided written informed consent at the time of enrollment.

Participants subsequently completed a mailed questionnaire during three separate follow-up periods: 1987–1989, 1992–1995, and 1995–1998. Non-responders to the questionnaires received vigorous follow-up including repeated mailings and phone calls.

For the purposes of the current analysis, entry into the analytic cohort took place at the time when subjects were asked to report diabetes status (1987–1989) and person-time accrued through the final follow-up period (1995–1998).

We excluded from the present analysis women who did not complete the 1987–1989 follow-up questionnaire (n = 9,738), women with a diagnosis of colorectal cancer at the 1987–1989 questionnaire or earlier (n = 479), and women whose reported entry date occurred after their exit date (n = 5). We further excluded women who skipped 30 or more items on the 1987–1989 food frequency questionnaire or who reported total energy intake >3,800 or <400 kcal/day (n = 5,647). For this analysis, we also excluded all women we suspected to have type 1 diabetes. To do this, we considered any diagnosis of diabetes before age 25 (n = 46) to be a case of type 1 diabetes and assumed all others to be type 2. After these exclusions, 45,516 women remained in the analytic cohort.
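The exclusion counts above can be checked with a short tally. The sketch below assumes the exclusions were applied sequentially to the 61,431 women who completed the baseline questionnaire and that the reported counts do not overlap; the dictionary keys are illustrative labels, not variable names from the study.

```python
# Counts are taken from the text; the labels are hypothetical.
completed_baseline = 61_431

exclusions = {
    "no_1987_1989_questionnaire": 9_738,
    "prior_colorectal_cancer": 479,
    "entry_date_after_exit_date": 5,
    "incomplete_or_implausible_diet_data": 5_647,
    "suspected_type_1_diabetes": 46,
}

# Assumes each exclusion removes a distinct set of women.
analytic_cohort = completed_baseline - sum(exclusions.values())
print(analytic_cohort)
```

Under these assumptions the tally reproduces the reported analytic cohort size.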

Cohort follow-up

We defined “end of study date” as the date the subject completed the 1995–1998 questionnaire, or if the subject did not complete a 1995–1998 questionnaire, as the date of last contact in the 1995–1998 follow-up period. For participants not known to be deceased and with whom we had no contact in the 1995–1998 follow-up period, we imputed an “end of study date” by estimating the date on which subjects would have completed the 1995–1998 questionnaire (using mean time intervals from the rest of the cohort) had they actually completed one. We defined exit date from the study as the earliest among “end of study date,” date of colorectal cancer diagnosis, or date of death from cause other than colorectal cancer.
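The exit-date rule described above amounts to taking the earliest of the dates that apply to a subject. A minimal sketch, assuming each date is either known or absent (`None`); the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional


def exit_date(
    end_of_study: date,
    crc_diagnosis: Optional[date] = None,
    other_cause_death: Optional[date] = None,
) -> date:
    """Return the earliest of the candidate exit dates that are present."""
    candidates = [
        d for d in (end_of_study, crc_diagnosis, other_cause_death) if d is not None
    ]
    return min(candidates)


# A subject diagnosed with colorectal cancer before her end-of-study date
# exits at the diagnosis date.
print(exit_date(date(1996, 5, 1), crc_diagnosis=date(1994, 3, 15)))
```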

In the final analytic cohort, 90.3% (41,096 women) had complete follow-up through 1995–1998, meaning their exit date corresponded to either the date of their first colorectal cancer diagnosis, the date they filled out the 1995–1998 questionnaire, or their date of death from a cause other than colorectal cancer.

Case ascertainment

We identified colorectal cancer cases from self-reports on the 1992–1995 and 1995–1998 questionnaires, from statewide cancer registries, and from the National Death Index (NDI; through 1997). We obtained pathology reports for 246 (79%) of the 312 women who provided self-reports of a diagnosis of colorectal cancer. The pathology reports confirmed 233 (94%) of the cases as adenocarcinoma of the colon or rectum (ICD-9 site codes 153.0–153.4 and 153.6–153.9 for colon cancer and 154.0–154.1 for rectal cancer). Because of this high correspondence between the self-reports and medical records, we included as cases the remaining 66 self-reports of colorectal cancer without pathology reports. Women with pathology reports contradicting self-reported colorectal cancers were not included as cases unless they also appeared in a state cancer registry. As the BCDDP cohort was designed to capture a variety of cancer endpoints (i.e., it was not limited to colorectal cancers), subjects who self-reported many different medical conditions also provided medical records. In 17 such cases, despite the subject not self-reporting a colorectal cancer, the medical records were found to contain that diagnosis. These subjects were all included as cases. A search of the National Death Index identified an additional 108 individuals with death certificates indicating a diagnosis of colorectal cancer. Finally, we used last-known place of residence for each subject to match against state cancer registries for those states whose registries consented to participate in the study (accounting for 73.5% of the analytic cohort). This procedure resulted in the identification of a further 65 colorectal cancer cases; the registries also identified many of the subjects previously identified by self-report, medical records, or the NDI search, and these 65 were in addition to the previously identified cases. Subjects residing in states with participating registries did not differ in any material way with respect to distribution of risk factors from subjects residing in states whose registries did not consent to participate. Thus, the total number of cases in the analytic cohort over the follow-up period was 489.
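The case counts by source can be tallied as follows. This sketch assumes the five sources listed in the text are mutually exclusive as described (each later source contributing only cases not already identified); the labels are illustrative.

```python
# Counts are taken from the text; the labels are hypothetical.
cases_by_source = {
    "self_report_confirmed_by_pathology": 233,
    "self_report_without_pathology_report": 66,
    "medical_records_without_self_report": 17,
    "national_death_index": 108,
    "state_cancer_registries": 65,
}

total_cases = sum(cases_by_source.values())
print(total_cases)
```

Under this assumption the sources sum to the reported total number of cases.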

Exposure assessment

We determined diabetes status using the follow-up questionnaires in 1987–1989, 1993–1995, and 1995–1997. On each of these questionnaires, between 93 and 96% of the cohort responded yes or no to the diabetes question. If answers to these questions were marked as “refused” or “not sure,” or if the questions were not answered (n = 1,993, 1,936, and 2,479 for the three follow-up questionnaires), we imputed a response of “no” for the diabetes variable. In a sensitivity analysis that excluded these subjects, the results were unchanged. If diabetes was indicated on the 1987–1989 questionnaire, but no date of onset was provided (n = 356), these cases were assumed to be type 2 diabetes.

Statistical analysis

We used Cox proportional hazards regression (PROC PHREG in SAS version 8) with age as the underlying time metric to generate rate ratios and 95% confidence intervals (CI) for type 2 diabetes as a risk factor for incident colorectal cancer. All p-values were two-sided.

We estimated rate ratios in both age- and multivariable-adjusted models. In the multivariable models, we adjusted for age, weekday physical activity (units of Metabolic Equivalent Time or METs as described by Ainsworth [26]), total energy intake (kcal), alcohol (g/day), calcium (mg/day from both dietary and supplement sources), menopausal hormone therapy (ever/never), smoking (ever/never), regular multi-vitamin use (yes-fairly regular/not regular, no or not answered), education (up through high school/more than high school), race (Non-Hispanic White/Other), and use of non-steroidal anti-inflammatory drugs (NSAIDs) (yes/no/don’t know). NSAIDs included aspirin, ibuprofen (Advil, Motrin, Nuprin), Naprosyn, and other pain-relieving drugs, but excluded Tylenol. We defined women to be users of NSAIDs if they had used these drugs at least once a week for at least 1 year. We used the density method to adjust the dietary calcium for energy [27].
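As an illustration of energy adjustment by the density method, the sketch below expresses dietary calcium per 1,000 kcal of total energy intake. The per-1,000-kcal scaling and the function name are assumptions made for the example; the text does not state the exact units used in the analysis.

```python
def calcium_density(calcium_mg_per_day: float, energy_kcal_per_day: float) -> float:
    """Nutrient density: mg of dietary calcium per 1,000 kcal of energy intake."""
    if energy_kcal_per_day <= 0:
        raise ValueError("energy intake must be positive")
    return calcium_mg_per_day / energy_kcal_per_day * 1000.0


# Hypothetical intake: 800 mg calcium on a 1,600 kcal/day diet.
print(calcium_density(800.0, 1600.0))
```

Dividing by energy intake makes women with different total intakes comparable on the composition of their diet rather than on its quantity.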

For subjects with missing data on physical activity (13.4%), we imputed the median value from the rest of the cohort. For subjects with missing data on menopausal hormone therapy (1.1%), we assumed “no use”. For subjects with implausible data on dietary intakes (dietary or supplemental calcium >3,000 mg/day, alcohol >90 g/day, and fruit or vegetable servings >16 per day), we imputed the median value from the rest of the cohort. Less than 0.1% of the analytic cohort had values outside of the specified limits on these variables.
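The median imputation described above can be sketched as follows, assuming missing values are represented as `None` and that the median is computed only from values that are present and within the plausibility limit. The function name and the example intakes are hypothetical.

```python
from statistics import median
from typing import List, Optional


def impute_median(
    values: List[Optional[float]], upper_limit: Optional[float] = None
) -> List[float]:
    """Replace missing or implausibly high values with the median of the rest."""

    def usable(v: Optional[float]) -> bool:
        return v is not None and (upper_limit is None or v <= upper_limit)

    fill = median(v for v in values if usable(v))
    return [v if usable(v) else fill for v in values]


# Hypothetical calcium intakes (mg/day) with one implausible and one missing value.
print(impute_median([500.0, 3500.0, 900.0, None, 700.0], upper_limit=3000.0))
```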

We used two different modeling approaches to account for the possible time dynamics of the diabetes and colorectal cancer association. First, we entered diabetes status in the model as a time-dependent variable, allowing exposure status to change during the course of follow-up upon a new diagnosis of diabetes. Second, we used a categorical approach to consider time since diagnosis. For this analysis, we used a set of time-dependent categorical variables, so that subjects contributed person-time to each category in a time-dependent manner and moved from category to category as time since diagnosis advanced with follow-up time. For example, a woman diagnosed with diabetes during the first follow-up period would contribute time to the 0 years since diagnosis (i.e., no diagnosis) category, to the 0–4 years since diagnosis category, and possibly to the 4–8 years and even the 8–12 years since diagnosis categories. Similarly, a woman diagnosed 8 years prior to baseline would contribute person-time to the 8–12 years category, and possibly the 12-plus years category.
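The allocation of person-time to time-since-diagnosis categories can be sketched as below. This is an illustration only: it assumes an exact diagnosis time is known on the same scale as entry and exit (in years), which simplifies the questionnaire-based assessment described above, and the function name and category labels are hypothetical.

```python
from typing import Dict, Optional


def split_person_time(
    entry: float, exit_: float, diagnosis: Optional[float] = None
) -> Dict[str, float]:
    """Person-years contributed to each time-since-diagnosis category."""
    bands = [
        ("0-4", 0.0, 4.0),
        ("4-8", 4.0, 8.0),
        ("8-12", 8.0, 12.0),
        ("12+", 12.0, float("inf")),
    ]
    totals = {"none": 0.0, "0-4": 0.0, "4-8": 0.0, "8-12": 0.0, "12+": 0.0}

    # Never diagnosed during follow-up: all time is unexposed.
    if diagnosis is None or diagnosis >= exit_:
        totals["none"] = exit_ - entry
        return totals

    # Follow-up before diagnosis counts as unexposed.
    totals["none"] = max(0.0, diagnosis - entry)

    # Intersect the follow-up window with each band of time since diagnosis.
    for label, lower, upper in bands:
        start = max(entry, diagnosis + lower)
        stop = min(exit_, diagnosis + upper)
        if stop > start:
            totals[label] = stop - start
    return totals


# Diagnosed 2 years into a 9-year follow-up.
print(split_person_time(0.0, 9.0, diagnosis=2.0))
# Diagnosed 8 years before baseline, followed for 9 years.
print(split_person_time(0.0, 9.0, diagnosis=-8.0))
```

In the first example the woman contributes time to the no-diagnosis, 0–4, and 4–8 categories; in the second, only to the 8–12 and 12-plus categories, matching the two cases described in the text.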

Results

Women in the BCDDP follow-up cohort with complete dietary data were on average 61.9 years of age at baseline and contributed an average of 8.4 years of follow-up. Table 1 presents baseline characteristics of the analytic cohort according to type 2 diabetes status at the 1987–1989 questionnaire. Consistent with prior research, subjects reporting diabetes at baseline were more likely to be non-Caucasians, to have a higher BMI, and to have lower levels of educational attainment. There was little difference in dietary intakes or in NSAID use, smoking, menopausal hormone therapy use, or screening for colorectal cancer between diabetics and non-diabetics. Average age at diagnosis for those with prevalent diabetes at baseline was 59.0 years.

Table 1 Distribution of baseline characteristics among women in the BCDDP follow-up cohort by presence of diabetes

Results of the proportional hazards regression analyses using both age-adjusted and multivariable-adjusted models for prevalent diabetes as the primary exposure appear in Table 2. Our principal intent with this part of the analysis was to confirm that diabetes was associated with colorectal cancer and to allow comparisons with other studies using prevalent diabetes as an exposure. As expected, prevalent diabetes at baseline was associated with a significantly increased risk of incident colorectal cancer during the follow-up period (RR = 1.49, 95% CI 1.08–2.06 in multivariable-adjusted model). If we included diabetes cases diagnosed at either the 1993–1995 follow-up or at the 1995–1997 follow-up as part of the exposed group, the relative risks were attenuated, although the 95% confidence intervals still excluded 1.0 (RR = 1.36, 95% CI 1.01–1.82 for diabetes diagnosed at or before 1993–1995 follow-up; RR = 1.35, 95% CI 1.02–1.78 for diabetes diagnosed at or before 1995–1997 follow-up).

Table 2 Age and multivariable-adjusted hazard ratios for incident colorectal cancer among subjects diagnosed with diabetes at or prior to the baseline questionnaire, at or prior to the first follow-up questionnaire (1993–1995), or at or prior to the second follow-up questionnaire (1995–1997)

We were concerned that patients with colorectal cancer who were too sick to report an incident case of diabetes on one of the two follow-up questionnaires, or who had died before they could complete the questionnaire, might have been incorrectly classified as unexposed in the analyses using prevalent cases from the 1993–1995 and 1995–1997 questionnaires. In this scenario, there would be underestimation of risk in the diabetes group in analyses using prevalent diabetes as assessed on the follow-up questionnaires. To address this question, we did an alternative analysis in which we included only individuals who returned the 1993–1995 follow-up questionnaire and who were also still at risk of colorectal cancer at that time. We excluded subjects with prevalent diabetes at baseline and defined exposure as having an incident diagnosis of diabetes between baseline and return of the 1993–1995 questionnaire, with follow-up beginning at that time. Of the 141 incident cases of colorectal cancer in this analysis, only 4 occurred in the group with incident diabetes at the 1993–1995 questionnaire; thus, although the relative risk estimate was 1.13, the confidence interval was wide (0.42–3.08).

In an analysis of colorectal cancer by subsite, we found that prevalent diabetes was associated with cancer in the distal colon and rectum (RR = 1.65, 95% CI 0.98–2.78 for prevalent diabetes at baseline and RR = 1.69, 95% CI 1.11–2.58 for prevalent diabetes at any time during follow-up), but not with cancer in the proximal colon up to and including the splenic flexure (RR = 1.46, 95% CI 0.85–2.49 for prevalent diabetes at baseline and RR = 1.12, 95% CI 0.69–1.84 for prevalent diabetes at any time during follow-up).

We also modeled diabetes in a manner that would allow us to take into account the time dynamics of this exposure (Table 3). When we entered diabetes status in the proportional hazards regression model as a time-dependent variable, we found an even stronger association between diabetes and colorectal cancer than when we entered only prevalent diabetes at baseline (RR = 1.60, 95% CI 1.18–2.18, in the multivariable-adjusted model). Furthermore, when we defined exposure as duration of diabetes exposure (modeled as a time-dependent variable), we found that in the first 4 years after diagnosis risk was essentially the same as in those never having had a diagnosis of diabetes. For those who had been diagnosed between 4 and 8 years earlier, however, we observed an RR of 2.36 (95% CI 0.96–5.79). Those 8–12 years from diagnosis had a smaller change in their risk (RR = 1.71, 95% CI 0.63–4.61), and among those 12 or more years from diagnosis the risk estimate was back to essentially 1.0. In the time-dependent analyses, the results by subsite were analogous to those for prevalent diabetes. We observed slightly stronger and statistically significant associations for the distal colon and rectum compared to slightly attenuated hazard ratios that were no longer statistically significant for cancer in the proximal colon.
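The width of these intervals is easiest to judge on the log scale. The sketch below assumes Wald-type 95% intervals that are symmetric around the log rate ratio (an assumption about how the intervals were computed, not something stated in the text); under that assumption the geometric mean of the limits should approximately recover the point estimate, and the distance between the log limits gives the implied standard error. The function names are hypothetical.

```python
import math


def geometric_midpoint(lower: float, upper: float) -> float:
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


def log_scale_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of the log rate ratio implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Reported intervals: time-dependent diabetes overall, and 4-8 years since diagnosis.
for label, lower, upper in [("overall", 1.18, 2.18), ("4-8 years", 0.96, 5.79)]:
    print(label, round(geometric_midpoint(lower, upper), 2), round(log_scale_se(lower, upper), 2))
```

The implied standard error for the 4–8 year category is several times larger than for the overall time-dependent estimate, which reflects the small number of cases in each duration category.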

Table 3 Age- and multivariable-adjusted hazard ratios for incident colorectal cancer among subjects diagnosed with diabetes modeling diabetes as a time-dependent variable and considering different durations of time at study exit with a diabetes diagnosis compared to subjects without a diagnosis of diabetes at study exit

Discussion

A growing body of evidence links diabetes to colorectal neoplasia and cancer, with the vast majority of studies finding relative risk estimates ranging between 1.3 and 1.5 for those with compared to those without diabetes [7–23]. Our results showing a 49% increased risk among women with prevalent diabetes at baseline are consistent with these previous findings.

Not just frank diabetes, but factors related to insulin resistance have also been associated with increased risk of both colorectal cancer and adenomatous polyps [7, 27–30]. As the most commonly proposed underlying mechanism explaining the diabetes–colorectal cancer association is hyperinsulinemia, and since individuals are typically hyperinsulinemic before a formal diagnosis of diabetes (i.e., during the pre-diabetic stage), we were concerned that classifying as exposed (i.e., hyperinsulinemic) only those with prevalent diabetes at baseline might result in important misclassification of exposure for those who would be diagnosed during follow-up (and were thus likely hyperinsulinemic at baseline). But rather than increasing the risk estimate, including the likely hyperinsulinemic, pre-diabetic subjects in the exposed category resulted in an attenuation of the hazard ratio. One possible explanation for this result is that the pre-diabetic women had not had elevated insulin either long enough or intensely enough to increase risk as did the diabetic women. In this case, adding in the “pre-diabetic” people, using this admittedly imprecise proxy variable, produced greater misclassification than leaving them out. A second possible explanation is that some of the women with incident colorectal cancer could have been too sick or may even have died and thus been unable to report on one of the follow-up questionnaires that they had been diagnosed with diabetes subsequent to baseline. In this scenario, the case group would have under-reported exposure status, with the result being an estimate of the hazard ratio that was biased toward the null. We attempted to address this question using only incident cases of diabetes identified at the 1993–1995 questionnaire, but with a limited number of cases available for this analysis, interpretation of the risk estimate is difficult. Using these methods, we could not rule out either of these explanations.

An alternative approach was to consider diabetes a time-dependent variable. In this way, exposure status could change through the follow-up period, risk estimates would take this change into account, and exposure status would thus be less subject to misclassification. Therefore, we would expect a stronger risk estimate when modeling diabetes as a time-dependent variable, and this is in fact what we observed (RR = 1.62 using the time-dependent approach compared to RR = 1.49 when defining exposure as prevalent diabetes at baseline). Perhaps even more interestingly, when we classified subjects by the amount of time exposed to diabetes, we found that compared to women who had never been exposed at their exit date, women whose exit date fell either shortly after or long after their diagnosis were at no increased risk. Only women who had been diagnosed between 4 and 8 years and, to a lesser extent, 8–12 years prior to their study exit date were at increased risk. The number of cases in these categories of exposure was small, and the confidence intervals were therefore fairly wide, but the general pattern of the risk estimates (together with the highly significant finding for diabetes as a time-dependent variable) was consistent with the notion that risk increased and fell in the same manner that endogenous insulin levels increase and then recede in the natural course of diabetes.

Interestingly, these results were consistent with the results from the analyses using prevalent diabetes at baseline and then at each of the two follow-up questionnaires that showed weaker associations if we included in the exposed category those who were recently diagnosed. The results of the time-dependent analyses would tend to support the conclusion that the attenuation we observed when using prevalent cases after baseline may have been real (i.e., not the result of bias due to differential loss to follow-up) and was instead the result of including people in the “exposed” category who were not exposed long enough to increase risk.

In the few prior studies that have examined the issue, the association between diabetes and colorectal cancer was similarly time dependent, such that the greatest risk was typically found among subjects who had been diagnosed with diabetes for at least 5–10 years. Risk for those with a shorter time since diagnosis of diabetes, while still elevated, was lower than that for individuals who had been diabetic for an extended time [10, 14, 22, 23]. Consistent with these observations, Saydah and colleagues, though not looking specifically at time since diagnosis, found that people using diabetes medications, perhaps an indicator of long-term or more severe diabetes, were also at highest risk among subjects in the Western Maryland cohort [16]. Similarly, Yang et al. found that chronic insulin therapy to treat diabetes was associated with a hazard ratio of 2.1 (95% CI 1.2–3.4) compared to diabetics without insulin therapy. Each incremental year of insulin therapy had an OR of 1.21 (95% CI 1.03–1.42) [31]. Together, these results suggest that long-term exposure to elevated insulin is an important risk factor.

At initial diagnosis of diabetes, however, the standard of care is not normally insulin therapy but lifestyle modification and/or pharmacological agents such as metformin, meaning that primarily long-term diabetics with poor glucose control would be exposed to insulin therapy. Thus, it is possible that factors related to poor glucose control, and not the insulin that the diabetic patients received, are responsible for the increased risk seen in those studies. In other words, it could mean that something other than, or in addition to, hyperinsulinemia explains the significantly increased risk for colorectal cancer we observed for diabetes.

Some authors have called into question the role of hyperinsulinemia entirely and have proposed the inflammatory response to obesity as the more plausible explanation for the obesity and colorectal cancer association [32]. In this case, they describe hyperinsulinemia and diabetes as epiphenomena, common outcomes that appear alongside the inflammatory response to obesity, and it is the latter that they propose is the true mediator of the association between obesity and colorectal cancer. If this theory is correct, then perhaps it should not be surprising that hyperinsulinemia (indicated in our analysis by pre-diabetes or a diagnosis of diabetes during the follow-up period) was not implicated in our study, as it would only be the inflammatory process that mattered. Countering this argument, however, are the findings from our analysis and from both Hu et al. and Limburg et al. showing that a peak of risk for colorectal cancer was roughly 8–10 years from diagnosis of diabetes, with diminishing risk in subsequent years [10, 14]. After extended periods of insulin resistance in patients with diabetes, beta cell exhaustion results in gradually declining insulin production. That risk of colorectal cancer peaks at 8–10 years and then declines, just as insulin concentrations peak and then decline with diabetes, would be consistent with the idea that insulin per se is an important etiological agent.

In a study of diabetes as an exposure, the question of detection bias is an important consideration given the greater involvement with the health care system that individuals with diabetes have. Controlling for colorectal cancer screening, however, made no noticeable difference in our results, and tests of interaction provided no evidence of effect modification by screening in stratified analyses (data not shown), suggesting there was no detection bias. Another potential limitation of our analysis is the lack of a direct measure of insulin resistance among pre-diabetic individuals. We had no access to records that would have indicated whether any of the subjects who became diabetic during the follow-up period were in fact hyperinsulinemic at baseline. The assumption that such individuals were hyperinsulinemic is not unreasonable, though, since diabetes is a disease process that extends over years and is preceded by extended periods of hyperinsulinemia.

In summary, we found a 49% increased risk of incident colorectal cancer during an 8.4-year follow-up period among women who reported a diagnosis of type 2 diabetes compared to women who were diabetes free at baseline. Given that hyperinsulinemia is the commonly argued mechanism by which diabetes increases risk of colorectal cancer, we also considered the dynamic nature of the risk associated with diabetes. In models that used a time-dependent diabetes variable, the risk estimates were even stronger and indicated a 60% increased risk of colorectal cancer among women with a diagnosis of diabetes. Most interestingly, the elevated risk of colorectal cancer only appeared 4 years subsequent to diagnosis and then diminished rapidly after 8 years from diagnosis. This result, cancer risk rising to a peak and then falling just as insulin exposure rises in the first several years of diabetes and then falls after pancreatic exhaustion, is consistent with the theory that hyperinsulinemia is an important factor that can explain, at least in part, the association of diabetes with colorectal cancer.