Introduction

Diabetes mellitus affects 23 million people in the United States [1] and is present in approximately 9–23% of new colorectal cancer cases [2,3,4,5]. Diabetes shares risk factors with, and may increase the risk of, colorectal cancer [6,7,8,9,10,11,12,13]. People with diabetes have a higher risk of death due to colorectal cancer than persons without diabetes [13,14,15,16,17,18,19]. Hyperglycemia, hyperinsulinemia, and certain treatments for diabetes (e.g., sulfonylureas and insulin) may play roles in colorectal cancer development [6]. In contrast, metformin, the first-line oral agent for diabetes treatment, is hypothesized to have anticancer effects primarily by lowering levels of circulating insulin and by inhibiting a major protein synthesis/cell growth pathway [20, 21], though other mechanisms are also hypothesized [22].

The association between diabetes and colorectal cancer recurrence is unclear. A 2013 meta-analysis [23] of three studies [24,25,26] that included 429 diabetic colorectal cancer patients reported a pooled hazard ratio (HR) of 1.24 (95% confidence interval [CI] 0.99–1.55) for the association between pre-existing diabetes and colorectal cancer recurrence. A subsequent study reported a non-significant association between diabetes and recurrence among 2,183 colon cancer patients, of whom 288 had diabetes (HR 1.32; 95% CI 0.98–1.76) [27], while two other similarly sized studies reported point estimates near 1.0 [5, 28]. A 2017 systematic review did not identify additional studies [29]. To our knowledge, there are no published studies on diabetes and colon cancer recurrence in population-based cohorts in the United States. Thus, we undertook an analysis to evaluate the association between pre-existing diabetes and colon cancer outcomes (recurrences and subsequent cancers overall) in a population of early-stage colon cancer survivors.

Materials and methods

Setting and overview

We conducted a cohort study (Recurrence of Colon cancer in Relation to Drug use [RECORD]) at two health care systems within the Health Care Systems Research Network’s (HCSRN) Cancer Research Network [30]: Kaiser Permanente Washington (KPWA) and Kaiser Permanente Colorado (KPCO). All study procedures were approved by the KPWA institutional review board, to which KPCO ceded.

Both KPWA and KPCO store administrative and clinical information locally in a common data model, the HCSRN Virtual Data Warehouse (VDW) [31]. The VDW is a decentralized data model with mutually agreed upon variable definitions across HCSRN sites. The VDW includes data on enrollment in the health care system, diagnoses and procedures [International Classification of Diseases (ICD) and Current Procedural Terminology (CPT) codes], outpatient prescription medication fills, laboratory test results, vital signs, and deaths. As part of the VDW specifications, both sites also maintain a table of incident cancer diagnoses populated with data from the system’s own cancer registry (KPCO) or the local Surveillance, Epidemiology, and End Results program (SEER) registry (KPWA).

We identified potentially eligible incident colon cancer cases through the VDW; medical records abstractors then verified eligibility and abstracted detailed information on cancer treatment, patient risk factors (e.g., smoking and over-the-counter medication use), and study outcomes. Abstractors underwent extensive training that involved double reviews of selected records to achieve accuracy and consistency. Every 6 months, abstractors participated in inter- and intra-rater reliability activities for quality assurance. Medical records abstraction occurred from June 2014 through August 2016.

Cohort identification and eligibility

We used VDW data to identify patients aged ≥ 18 years when diagnosed with stage I–IIIA [American Joint Committee on Cancer (AJCC) 6th edition] malignant adenocarcinomas of the colon or rectosigmoid junction during 1995–2014 (Fig. 1). Using VDW data, we excluded patients who were not continuously enrolled in the health plan for at least 12 months before and 3 months after the cancer diagnosis; received a total colectomy prior to diagnosis; were previously or concurrently diagnosed with additional colorectal tumors or metastatic non-colorectal tumors; were diagnosed with additional primary tumors or index cancer recurrence within 90 days of the index diagnosis; or were marked by tumor registrars as “never disease free” after index diagnosis.

Fig. 1

Study inclusion and exclusion criteria. Figure shows the steps taken to create a study cohort of stage I–IIIA incident colon cancer adenocarcinoma cases at Kaiser Permanente Washington and Colorado, 1995–2014. AJCC American Joint Committee on Cancer, EOT end of treatment, KPCO Kaiser Permanente Colorado, RECORD Recurrence of Colon Cancer in Relation to Drug use, VDW Virtual Data Warehouse

Subjects who were potentially eligible after VDW-based exclusions then underwent medical records review. We further excluded patients with incomplete medical records; whose chart-reviewed diagnosis date fell outside the 1995–2014 study window or differed from the tumor registry date by > 1 month; who were previously diagnosed with colorectal cancer, metastatic non-colorectal cancer, or familial adenomatous polyposis; whose records indicated that the index tumor actually occurred in the rectum or was metastatic; who did not have surgery to treat the index colon cancer; or who had positive surgical margins or tumor deposits. We looked for evidence of tumor progression during the 90 days after end of treatment for the index cancer and excluded those who had evidence of progression of the index cancer or indeterminate imaging results, died or disenrolled, or were otherwise determined to be not cancer-free within that timeframe. Because diabetes diagnosis data were not available on KPCO subjects before 1998, KPCO patients diagnosed with colon cancer before 1 January 1999 were excluded.

Outcome ascertainment

The primary outcome of this study was recurrence of the index colon cancer more than 90 days after treatment completion. Treatment completion was defined as the last date of chemotherapy, radiation, or surgery for the index cancer. Recurrence was defined by a clinical diagnosis in the medical record. At KPCO, information on recurrence was preloaded into the medical records abstraction instrument from tumor registry data, then supplemented and/or confirmed by study abstractors. At KPWA, recurrences are not recorded in the SEER registry; therefore, all recurrence information came from medical records abstraction. Both sites collected date of recurrence, pathological consistency with the index cancer (yes/no), and recurrence location [local = anastomosis or site of primary tumor or incision; elsewhere in the colon (including appendix or rectum); regional = in nearby nodes or on the outside of adjacent organs; distant = distant nodes, peritoneum, or inside other organs; or unknown]. For this study, recurrences were ascertained beginning 90 days after the end of treatment for the incident cancer (i.e., the start of follow-up). Follow-up, and thus outcome ascertainment, ended at the earliest of death, disenrollment from the health plan, or the date of medical records abstraction.
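The start of follow-up described above can be sketched in a few lines of Python. This is an illustration only, not the study's SAS code: the function names and the example patient are hypothetical, and counting a recurrence only when it falls strictly after cohort entry is an assumption about how the 90-day boundary was handled.

```python
from datetime import date, timedelta

LANDMARK_DAYS = 90  # from the text: follow-up starts 90 days after end of treatment


def cohort_entry(treatment_dates):
    """Return the start of follow-up: last treatment date plus 90 days."""
    if not treatment_dates:
        raise ValueError("at least one treatment date is required")
    return max(treatment_dates) + timedelta(days=LANDMARK_DAYS)


def is_countable_recurrence(recurrence_date, treatment_dates):
    """True if the recurrence falls after the start of follow-up.

    Assumption: the boundary day itself does not count (strict inequality).
    """
    return recurrence_date > cohort_entry(treatment_dates)


# Hypothetical patient: surgery in January, chemotherapy ending 30 June 2010.
treatments = [date(2010, 1, 15), date(2010, 6, 30)]
entry = cohort_entry(treatments)
print(entry)
print(is_countable_recurrence(date(2010, 8, 1), treatments))
print(is_countable_recurrence(date(2012, 3, 1), treatments))
```

A recurrence dated before the landmark corresponds to the patients excluded as not cancer-free within 90 days of the end of treatment, so it is not counted as an outcome.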

Medical record abstractors also collected data on recurrences of other cancers and supplemented tumor registry data on second primary colorectal cancer diagnoses. Our secondary outcome was any cancer event, defined by a recurrence of any cancer or new primary cancer at any site.

In sensitivity analyses, we varied our definition of recurrence. Post hoc, we re-reviewed medical records of persons who died of colon cancer without a documented recurrence or second primary colon cancer to ensure that no study outcomes were missed. Because many of these cases had a cancer of unknown type shortly before death, we recorded these events to include in a broader definition of recurrence in sensitivity analyses. We also conducted a sensitivity analysis that included second primary colorectal cancer in the definition of recurrence to facilitate comparisons with other studies [24, 25, 28].

Exposure ascertainment

In the main analysis, we defined diabetes status based on diagnosis codes (ICD-9: 249.x, 250.x, 357.2, 362.0x, 366.41, 648.0x; ICD-10: E08x, E09x, E10x, E11x, E13x, O24x) recorded in the 12 months prior to colon cancer diagnosis through 90 days after the end of treatment (i.e., cohort entry). We classified a patient as having diabetes if they had (1) at least one inpatient/emergency department diabetes diagnosis; or (2) two or more outpatient ambulatory visit diabetes diagnoses within a 6-month period. The first date on which a patient satisfied either criterion was defined as the date of diabetes diagnosis. In secondary analyses, diabetes was a time-varying exposure that was ascertained through the end of follow-up. We did not distinguish between type 1 and type 2 diabetes.
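A minimal sketch of this case definition, assuming the caller has already restricted encounters to those carrying one of the listed diagnosis codes within the ascertainment window. The function name, the setting labels, the 183-day reading of "6-month period", and the choice to count outpatient diagnoses on distinct dates only are all assumptions made for illustration.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=183)  # assumption: "6-month period" taken as 183 days


def diabetes_diagnosis_date(encounters):
    """Return the first date the case definition is met, or None.

    encounters: iterable of (date, setting) pairs for visits carrying a
    diabetes diagnosis code; setting is "inpatient", "ed", or "outpatient"
    (hypothetical labels).
    """
    encounters = list(encounters)
    candidates = []

    # Criterion 1: any inpatient or emergency department diagnosis.
    acute = [d for d, s in encounters if s in ("inpatient", "ed")]
    if acute:
        candidates.append(min(acute))

    # Criterion 2: two outpatient diagnoses within the window. The criterion
    # is met on the date of the second visit of the earliest qualifying pair.
    # Assumption: only distinct visit dates are counted.
    outpatient = sorted({d for d, s in encounters if s == "outpatient"})
    for first, second in zip(outpatient, outpatient[1:]):
        if second - first <= WINDOW:
            candidates.append(second)
            break

    return min(candidates) if candidates else None


# Hypothetical patient: the first two outpatient codes are too far apart.
visits = [
    (date(2009, 1, 10), "outpatient"),
    (date(2009, 9, 1), "outpatient"),
    (date(2009, 11, 15), "outpatient"),
]
print(diabetes_diagnosis_date(visits))
```

Checking consecutive outpatient dates is sufficient: if any two dates fall within the window, the later one and its immediate predecessor do as well.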

In a secondary analysis, we analyzed persons with advanced diabetes, defined by use of insulin at cohort entry or most recent hemoglobin A1c (HbA1c) value (from 1 year before colon cancer diagnosis to cohort entry) being ≥ 9%. Insulin prescriptions, which were assumed to run out after 6 months, were ascertained from the VDW pharmacy dispensings table. HbA1c values were ascertained from the VDW laboratory results table and/or legacy local data sources available beginning in 1995 at KPWA and 2000 at KPCO. Duration of diabetes was not available.
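The advanced-diabetes definition can be illustrated as follows. This is a sketch under stated assumptions rather than the study's implementation: the function and argument names are hypothetical, "6 months" is read as 183 days, "1 year" as 365 days, and HbA1c values are compared with the threshold of 9 in whatever units the laboratory reported.

```python
from datetime import date, timedelta

INSULIN_SUPPLY = timedelta(days=183)  # assumption: "6 months" taken as 183 days
HBA1C_THRESHOLD = 9.0                 # threshold stated in the text


def has_advanced_diabetes(cohort_entry, cancer_dx, insulin_fills, hba1c_results):
    """Classify advanced diabetes at cohort entry.

    insulin_fills: list of insulin dispensing dates.
    hba1c_results: list of (date, value) pairs.
    """
    # Insulin use at cohort entry: some fill whose assumed supply covers entry.
    on_insulin = any(
        fill <= cohort_entry < fill + INSULIN_SUPPLY for fill in insulin_fills
    )

    # Most recent HbA1c from 1 year before cancer diagnosis to cohort entry.
    window_start = cancer_dx - timedelta(days=365)
    in_window = [
        (d, v) for d, v in hba1c_results if window_start <= d <= cohort_entry
    ]
    # max() on (date, value) pairs picks the latest date.
    latest_high = bool(in_window) and max(in_window)[1] >= HBA1C_THRESHOLD

    return on_insulin or latest_high


# Hypothetical patient: an old high value is superseded by a recent lower one,
# and the last insulin fill has run out before cohort entry.
entry, dx = date(2010, 9, 28), date(2010, 1, 5)
labs = [(date(2009, 6, 1), 9.5), (date(2010, 8, 1), 7.2)]
print(has_advanced_diabetes(entry, dx, [date(2009, 12, 1)], labs))
```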

Covariates

The VDW served as the data source for demographic characteristics (i.e., sex, race, and ethnicity) and for year and stage of the index colon cancer diagnosis. Diagnosis codes were used to calculate Charlson comorbidity scores in the year before colon cancer diagnosis [32]. Persons were classified as having hypertension or hypercholesterolemia as of their first diagnosis code between 12 months before colon cancer diagnosis and the end of follow-up (Fig. 2). Height, weight at diagnosis, cancer treatments, aspirin use, and use of other non-steroidal anti-inflammatory medications were extracted from the VDW and supplemented by medical record review. Medical record reviewers also abstracted patients’ smoking status at and after diagnosis.

Fig. 2

Timing of variable assessment. Figure shows when exposure, outcome, and covariate data were collected relative to colon cancer diagnosis and cohort entry

For descriptive purposes, we also extracted information on diabetes medication dispensings from the VDW pharmacy table. The study pharmacist (DMB) identified the generic names of diabetes medications in each of the following drug classes: metformin, sulfonylureas, insulins, thiazolidinediones, meglitinides, DPP-4 inhibitors, GLP-1 agonists, alpha-glucosidase inhibitors, pramlintide, and combination medications. We extracted all outpatient dispensings for these medications beginning 12 months before colon cancer diagnosis through the end of follow-up. We collected data on statin dispensings in the same way.

Statistical analyses

We compared descriptive statistics for potential confounders by diabetes status at cohort entry and used a Poisson model to compute unadjusted rates of outcomes and 95% CIs stratified by diabetes status. We used Cox proportional hazards models to estimate hazard ratios for outcomes in relation to diabetes at cohort entry. The time scale for these analyses was time since end of treatment plus 90 days. In our primary analysis, estimates were adjusted for sex (male, female), age at index colon cancer diagnosis (natural cubic splines with knots at tertiles), health care system (KPWA or KPCO), diagnosis year (natural cubic splines with knots at tertiles), AJCC 6th edition tumor stage at diagnosis (I, IIA, IIB, IIIA), race (Black, non-Black, other/unknown), body mass index (BMI) at diagnosis (< 25.0, 25.0 to < 30.0, ≥ 30.0 kg/m2), and smoking history at and after diagnosis (time-varying: ever, never). In sensitivity analyses, we also adjusted in a time-varying fashion for the following covariates at or after cohort entry: statin use (any, none), aspirin use (any, none), hypertension diagnosis, and hypercholesterolemia diagnosis. These variables could have been recorded before or after a patient’s diabetes diagnosis in the medical record. We did not include these variables in our main model because of the possibility that they might be in the causal pathway between diabetes and recurrence. For comparison, minimally adjusted models included only age, diagnosis year, study site, and sex.

In analyses of colon cancer recurrence (the primary outcome), subjects were censored at the earliest of disenrollment from the health care system, death, second primary cancer at any site (including colon), recurrence of a non-colon cancer, or date of medical records abstraction. In the analysis of the secondary outcome, which was the composite endpoint of any cancer event, subjects were censored at disenrollment from health care system, death, or date of medical records abstraction.
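The censoring rule for the primary outcome amounts to taking the earliest competing date. The sketch below assumes hypothetical names and a hypothetical patient; treating a recurrence recorded on the same day as a censoring event as an outcome, and converting days to years with 365.25, are both assumptions.

```python
from datetime import date


def primary_outcome_follow_up(entry, recurrence, censoring_dates):
    """Return (years_at_risk, event) for the colon cancer recurrence analysis.

    censoring_dates: dict mapping a label (death, disenrollment, second
    primary cancer, non-colon recurrence, abstraction) to a date or None.
    """
    dates = [d for d in censoring_dates.values() if d is not None]
    if not dates:
        raise ValueError("at least the abstraction date must be known")
    censor = min(dates)

    # Assumption: a recurrence on the censoring date still counts as an event.
    event = recurrence is not None and recurrence <= censor
    end = recurrence if event else censor
    return (end - entry).days / 365.25, event


# Hypothetical patient with a second primary cancer before the recurrence.
censoring = {
    "death": None,
    "disenrollment": None,
    "second_primary": date(2012, 5, 1),
    "other_recurrence": None,
    "abstraction": date(2015, 6, 1),
}
years, event = primary_outcome_follow_up(
    date(2010, 9, 28), date(2013, 3, 1), censoring
)
print(round(years, 2), event)
```

For the secondary (composite) outcome, the same function would be called with only death, disenrollment, and abstraction dates as censoring events.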

We tested the proportional hazards assumption in the fully adjusted models of all four outcomes (primary outcome, secondary outcome, and two sensitivity analyses in which we varied the definition of recurrence as described above) by including an interaction term between diabetes at cohort entry and the log of analysis time. The proportional hazards assumption was satisfied for the primary and secondary analyses but did not hold for the two sensitivity analyses. Thus, we visually examined stratified cumulative hazard plots over time to determine if hazards were approximately proportional within discrete intervals of follow-up time. We then fit separate Cox proportional hazards models to each of the two identified time periods and found that, consistent with the original analysis, neither HR was statistically significant. Thus, we report a single HR over the entire study period.
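The interaction test described above can be written out as follows. The notation is ours and not taken from the study: D is the indicator of diabetes at cohort entry, X the vector of adjustment covariates, and t the analysis time.

```latex
h(t \mid D, \mathbf{X}) = h_0(t)\,
  \exp\!\left\{ \beta D + \gamma\, D \log t + \boldsymbol{\theta}^{\top}\mathbf{X} \right\},
\qquad
\mathrm{HR}(t) = \exp\!\left( \beta + \gamma \log t \right)
```

Under proportional hazards, γ = 0 and the hazard ratio reduces to the constant exp(β); a statistically significant γ indicates that the hazard ratio for diabetes changes over follow-up time.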

In exploratory analyses, diabetes status was analyzed as time-varying in a Cox proportional hazards model. Patients who had no evidence of diabetes at cohort entry (i.e., unexposed) could become exposed if they met the diabetes definition outlined above during follow-up. However, once a person was classified as having diabetes, they could never become unexposed. In our analysis of patients with advanced diabetes, patients without diabetes served as the reference group.
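One common way to prepare a time-varying exposure for a Cox model is to split each patient's follow-up into counting-process intervals at the date the exposure begins. The sketch below illustrates that layout under assumed names; it does not fit a model and is not the study's code.

```python
from datetime import date


def split_by_diabetes_onset(entry, end, event, diabetes_date):
    """Return rows of (start_day, stop_day, exposed, event) since cohort entry.

    Once exposed, a patient stays exposed for the rest of follow-up.
    """
    total = (end - entry).days
    if diabetes_date is None or diabetes_date >= end:
        return [(0, total, 0, int(event))]  # unexposed throughout follow-up
    if diabetes_date <= entry:
        return [(0, total, 1, int(event))]  # diabetes already present at entry
    onset = (diabetes_date - entry).days
    # Unexposed time ends without an event; any event goes on the exposed row.
    return [(0, onset, 0, 0), (onset, total, 1, int(event))]


# Hypothetical patient who meets the diabetes definition one year after entry
# and has a recurrence one year after that.
rows = split_by_diabetes_onset(
    date(2010, 1, 1), date(2012, 1, 1), True, date(2011, 1, 1)
)
print(rows)
```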

A statistically significant difference in all comparisons was defined as p value < 0.05. All analyses were conducted in SAS version 9.4 (SAS Institute, Cary, North Carolina).

Results

Study population

We identified 3,326 patients aged ≥ 18 years when diagnosed with stage I–IIIA malignant adenocarcinomas of the colon or rectosigmoid junction during 1995–2014. We excluded 631 people based on VDW data and 656 people during chart review (Fig. 1). A total of 2,039 people remained eligible after medical records abstraction (Fig. 1). An additional 116 patients diagnosed before 1 January 1999 at KPCO were excluded, leaving 1,923 eligible for this analysis.

At cohort entry, 393 (20.4%) of patients had diabetes, and an additional 100 patients (for a total of 25.6% of the cohort) developed diabetes during study follow-up. Patients with diabetes at cohort entry were more likely to be male, Hispanic, and Black or of other/unknown race compared to patients without diabetes (Table 1). Compared to people without diabetes, patients with diabetes were more likely to have a higher BMI and a greater comorbidity burden at diagnosis. People without diabetes were more likely than those with diabetes to be diagnosed with colon cancer by screening (rather than symptoms) but had less imaging on average (not shown). Most cancer characteristics (including age at diagnosis, stage at diagnosis, tumor location, tumor grade, and having positive lymph nodes) did not differ markedly based on diabetes status. However, patients with diabetes tended to have larger tumors than patients without diabetes. Approximately 90% of patients were diagnosed with stage I or IIA colon cancer, with the majority having grade I or II cancer, tumors < 5 cm, and no positive regional lymph nodes. A similar proportion of patients with and without diabetes received chemotherapy and radiation, and time from diagnosis to cohort entry (mean 137 days, standard deviation 72 days) was similar in the two groups. Statin and aspirin use in the year before colon cancer diagnosis were both more common in patients with diabetes compared to patients without diabetes. Overall cohort characteristics by study site are shown in Online Appendix Table 1. Sixty-eight percent of diabetes patients were taking at least one type of diabetes medication between end of treatment and cohort entry, inclusive (Table 2). Most patients had HbA1c < 8%. There were 135 patients who had HbA1c ≥ 9% in the year before colon cancer diagnosis to cohort entry or were using insulin in the 90 days prior to cohort entry.

Table 1 Colon cancer patient characteristics by diabetes status at cohort entry
Table 2 Diabetes treatment and control among patients with diabetes at cohort entry (N = 393)

Risk of recurrence and subsequent cancer events

Over a median 4.7 years of follow-up (interquartile range 2.1–8.4 years), there were 139 colon cancer recurrences (12.8 recurrences per 1,000 person-years, 95% CI 10.8–15.2). Diabetes at cohort entry was not associated with colon cancer recurrence (HR 0.87; 95% CI 0.56–1.33) or subsequent cancer events overall (HR 1.09; 95% CI 0.85–1.40) in multivariable models (Table 3). Of the 427 first subsequent cancer events, 139 were colon cancer recurrences, 36 were second primary colorectal cancers, 28 were non-colorectal cancer recurrences, 210 were non-colorectal primary cancers, and 14 were cancers of unknown type.
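For readers who want to see how a rate of this kind and its interval are computed, the sketch below uses a Wald interval on the log scale, which is what an intercept-only Poisson model yields. The person-years figure is not reported in the text; the value used here is back-calculated from the rounded rate and is therefore an assumption, so the interval is close to, but not identical to, the published one.

```python
import math


def rate_per_1000_py(events, person_years, z=1.96):
    """Return (rate, lower, upper) per 1,000 person-years.

    Uses a Wald interval on the log rate, with SE(log rate) = 1 / sqrt(events).
    """
    rate = events / person_years * 1000
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# 139 recurrences are reported in the text. The 10,860 person-years are an
# assumption, back-calculated from the rounded rate of 12.8 per 1,000.
rate, lower, upper = rate_per_1000_py(139, 10_860)
print(f"{rate:.1f} per 1,000 person-years ({lower:.1f} to {upper:.1f})")
```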

Table 3 Association between diabetes at cohort entry and risk of colon cancer outcomes

Our findings for recurrence were generally consistent across sensitivity analyses. When diabetes was modeled through the end of study follow-up in a time-varying manner, there was no associated increased recurrence risk (HR 0.84; 95% CI 0.56–1.23, Online Appendix Table 2). Advanced diabetes at cohort entry was not significantly associated with recurrence; however, the hazard ratio (HR 1.17; 95% CI 0.62–2.21) was higher than for diabetes overall.

When second primary colorectal cancers were included in the definition of recurrence, we observed a diabetes-associated increased risk in the minimally adjusted model (HR 1.41, 95% CI 1.01–1.96). The increased risk was not significant in the fully adjusted or additionally adjusted models, but the point estimates were similar.

Discussion

In research published to date, diabetes is consistently associated with colorectal cancer incidence, and, in some studies, with colorectal cancer fatality [23, 33]. Thus, diabetes could plausibly increase the risk of colon cancer recurrence. A 2013 meta-analysis reported a near-significant pooled relative risk, and there was some suggestion (though not significant) of an increased risk of recurrence with diabetes among colon cancer patients in a Korean hospital (HR 1.32; 95% CI 0.98–1.76) [27]. However, the body of literature on this topic is relatively small, and no studies have been conducted in population-based (i.e., non-trial) settings in the United States, where diabetes severity and treatment may differ from other countries.

In our medical records-based cohort study, we did not observe an association between diabetes and colon cancer recurrence. Our results may provide some reassurance to patients with both diabetes and colon cancer. However, we cannot completely exclude an increased risk of colon cancer recurrence with diabetes. First, our confidence intervals were wide because recurrence is a relatively rare event. Second, our point estimate for the association between diabetes and the composite outcome of colon cancer recurrence, second primary colorectal cancer, or cancer of unknown type (1.29), though not significant, was similar to the estimates in the Mills et al. meta-analysis (1.24) and the Jeon et al. study (1.32). The consistency of this association across studies makes it difficult to rule out the possibility of an association.

Our study had several important strengths. We examined recurrence as an outcome in a population-based cohort in the United States, which can be challenging because recurrence is not routinely collected by population-based cancer registries [34]. We had access to information on recurrence in this study through medical records and were able to exclude people with signs of disease progression before and shortly after the end of treatment. Recall bias was not a concern because we used extensive medical records and administrative data to ascertain exposure, confounders, and outcomes. Participation bias was avoided because everyone meeting eligibility criteria based on medical records abstraction was included.

Several limitations are worth noting, however. We were unable to investigate associations between diabetes medications and colon cancer recurrence, as we had initially planned, due to limited power. We had extensive data from medical record review and controlled for the strongest predictors of recurrence (e.g., stage), but were not able to control for lifestyle factors. The degree of confounding by such factors is likely to be small after controlling for BMI, hypercholesterolemia, and hypertension, as we did. However, we may not have been able to completely control for aspirin use. We collected aspirin use from prescription fills and medical records abstraction, but we noted variation within and between medical records reviewers on this variable during our quality assurance assessment.

Our study does not suggest an increased risk of colon cancer recurrence in patients with diabetes. However, power was limited. Given the results of our sensitivity analyses and the literature on this topic as a whole, additional research on diabetes and colorectal cancer outcomes is needed, especially in large populations in which the association with diabetes medication use can be studied and the risk of second primary colon cancers can be further investigated.