Introduction

Colorectal cancer is a major public health concern with over 1.2 million cases and approximately 609,051 deaths globally in 2008 [1]. Worldwide, incidence and mortality rates from colorectal cancer have been on the rise [2] and in China, incidence has consistently increased over the past two or three decades [3–5]. In Shanghai from 1973 to 2005, the age-adjusted incidence rates increased from 6.09 to 14.70 per 100,000 males for colon cancer and from 7.68 to 11.45 per 100,000 males for rectal cancer [3]. It has been suggested that this rise in incidence can be attributed to the rapid economic development China has experienced since the late 1970s and the resultant increased exposure to the Western diet and lifestyle [3–5]. Research on possible associations between potentially modifiable factors, such as diet, and colorectal cancer is essential if we are to determine an appropriate strategy for primary prevention of colorectal cancer.

The suspected links between fruit and vegetable consumption and colorectal cancer risk have long been investigated, but the evidence has been inconsistent [6]. The results have been so inconsistent that the 2007 World Cancer Research Fund and the American Institute for Cancer Research determined that the evidence for an association of almost all fruits and vegetables with colorectal cancer risk is only “limited suggestive” [7]. The association may vary by subsite within the colon and rectum due to etiological differences, which might explain some of the differences in findings across studies [8–12]. A recent meta-analysis found that fruits and vegetables had a significant inverse association with colon cancer, but not rectal cancer [13]. Additionally, the variability between studies in the associations of fruits and vegetables with colorectal cancer may be related to effect modification by other risk factors for colorectal cancer, such as smoking, body mass index (BMI), or physical activity, as observed in some previous studies [9, 14].

In this report, we evaluated the association between intakes of fruits and vegetables and the risk of colorectal cancer in the Shanghai Men’s Health Study (SMHS), a large population-based cohort study, analyzing the consumption information both continuously and categorically. In addition, we sought to assess potential interactions of fruit and vegetable intake with smoking status, BMI, and exercise participation.

Methods

Study population

We used data collected for the SMHS with methods that have been described in detail previously [15]. Briefly, the SMHS is a prospective, population-based cohort study in Shanghai, China. Men aged 40–74 years with no previous history of cancer were recruited between March 2002 and June 2006. Of the 82,043 eligible men, 61,482 were included in the cohort for a participation rate of 74.1 %. All participants were interviewed by a trained health professional. The baseline interview obtained information on demographic and lifestyle characteristics, dietary and physical activity habits, and medical history. Anthropometric measurements were taken following a standard protocol. All participants provided written informed consent, and the study received approval from the Institutional Review Boards of Vanderbilt University and the Shanghai Cancer Institute.

We excluded participants who reported consuming an extreme daily total energy intake (<500 or >4,200 kcal; n = 63) and participants with unconfirmed cancer that occurred during follow-up (n = 145), which left 61,274 participants for analysis.
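As an illustration only, the exclusion step above can be sketched in Python. The analyses themselves were run in SAS and R; the record layout and field names below are hypothetical, and treating the cutoffs as inclusive bounds is an assumption.

```python
def apply_exclusions(participants, min_kcal=500, max_kcal=4200):
    """Drop records with extreme energy intake or an unconfirmed cancer."""
    kept = []
    for person in participants:
        # Extreme intake: below 500 or above 4,200 kcal per day.
        extreme_energy = not (min_kcal <= person["energy_kcal"] <= max_kcal)
        if extreme_energy or person["unconfirmed_cancer"]:
            continue
        kept.append(person)
    return kept


# Hypothetical toy records; field names are illustrative only.
toy_cohort = [
    {"id": 1, "energy_kcal": 1900, "unconfirmed_cancer": False},
    {"id": 2, "energy_kcal": 450, "unconfirmed_cancer": False},   # < 500 kcal
    {"id": 3, "energy_kcal": 4300, "unconfirmed_cancer": False},  # > 4,200 kcal
    {"id": 4, "energy_kcal": 2200, "unconfirmed_cancer": True},   # unconfirmed
    {"id": 5, "energy_kcal": 4200, "unconfirmed_cancer": False},  # boundary kept
]
kept_ids = [p["id"] for p in apply_exclusions(toy_cohort)]

# Sample-size arithmetic using the counts reported in the text.
analytic_n = 61_482 - 63 - 145
print(kept_ids, analytic_n)
```

The final subtraction reproduces the analytic sample of 61,274 from the two exclusion counts given above.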

Colorectal cancer ascertainment

Shanghai Men’s Health Study participants were followed up approximately every two to three years for cancer incidence, occurrence of other chronic diseases, and vital status by in-home visits. Annual record linkage with the population-based Shanghai Cancer Registry and the Shanghai Municipal Vital Statistics Unit was also conducted to identify incident cancer cases and decedents, respectively. Incident cancer cases were verified through home visits, and medical charts were obtained to document detailed diagnostic information. Colorectal cancer was defined as a primary tumor with an ICD-9 code of 153 (malignant neoplasm of colon) or 154 (malignant neoplasm of rectum, rectosigmoid junction, and anus). Follow-up data up to 31 December 2010 were included in this analysis.

Fruit and vegetable consumption

Usual dietary intakes of 8 fruit and 38 vegetable items were assessed using a validated food frequency questionnaire (FFQ) at baseline. The SMHS FFQ captured about 89 % of all average food intake in this population [16]. The FFQ assessed how often (daily, weekly, monthly, yearly, or never) the participant consumed a specific food or food group. If the participant had consumed that specific food or food group, he was then asked the amount of consumption for that time period. Then, the average amounts of each food group were calculated by summing the intake for each food item. Nutrient intake was calculated using the Chinese Food Composition Tables [17].

The FFQ was tested for validity and reliability in this population and the results have been described in detail elsewhere [16]. The correlation coefficients between the intakes of fruits and vegetables estimated from the FFQ and those from an average of 12 monthly 24-h dietary recalls were 0.72 and 0.42, respectively. The FFQ data were used to categorize participants into quantiles of intake based on the distribution of consumption at baseline among participants who did not develop colorectal cancer, and were also retained as the original continuous variables to assess potential linear and nonlinear associations. We analyzed the data by total fruit, total vegetable, and total fruit and vegetable intake combined, as well as by five vegetable subgroups (cruciferous, allium, green leafy, legumes, and other), one fruit subgroup (citrus), and one individual fruit category (watermelon) due to its high intake in this population. For the main analyses, all groups were categorized into quintiles, except for allium vegetables, citrus fruits, and watermelon, which were categorized into tertiles due to the low variability of intake. For the analyses of interaction, all groups were categorized into tertiles to keep sufficient sample sizes for each analysis.
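The quantile categorization described above, with cut points taken from the intake distribution of noncases and then applied to everyone, might look like the following minimal sketch. The percentile definition (`method="inclusive"`) and the handling of values that fall exactly on a cut point are assumptions; the original SAS analysis may have used different conventions, and the intake values are invented toy data.

```python
from bisect import bisect_right
from statistics import quantiles


def cut_points(noncase_intakes, n_groups):
    """Cut points (n_groups - 1 values) from the noncase distribution."""
    return quantiles(noncase_intakes, n=n_groups, method="inclusive")


def assign_group(intake, cuts):
    """Return the 1-based quantile group; ties at a cut go to the upper group."""
    return bisect_right(cuts, intake) + 1


# Toy noncase intakes in g/day (hypothetical): 0, 1, ..., 100.
noncases = list(range(101))
quintile_cuts = cut_points(noncases, 5)   # used for most food groups
tertile_cuts = cut_points(noncases, 3)    # used for low-variability groups

# Cases are classified with the same cut points as noncases.
example_groups = [assign_group(x, quintile_cuts) for x in (10, 20, 55, 95)]
print(quintile_cuts, example_groups)
```

Deriving the cut points from noncases only keeps the category boundaries from depending on the outcome distribution among the relatively few cases.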

Other covariates of interest

Additional variables available for analysis included a number of demographic, dietary, behavioral, and medical factors that were assessed from the baseline questionnaire, the follow-up questionnaire, and/or direct assessment. We selected covariates for adjustment based on the previous literature for their associations with colorectal cancer [9, 13]. Demographic variables of interest were age, education level, occupation, and annual per capita family income. Participants with data missing on education (n = 856; 1.4 %), income (n = 127; 0.2 %), or occupation (n = 69; 0.1 %) were assigned to the most common categories as follows: high school education, income of 6,000–11,999 yuan per year, and occupation as a manual laborer. Each participant’s BMI was calculated from his interviewer-measured height and weight at the baseline visit. Participants with missing data on BMI (n = 35; 0.1 %) were set to the median value of BMI (23.67 kg/m²). For interaction analyses, BMI was categorized as overweight/obese (≥25.0 kg/m²) versus underweight/normal weight (<25.0 kg/m²). Behavioral characteristics under consideration were cigarette smoking, alcohol consumption, and amount of exercise per week [metabolic equivalent (MET) h/week], all of which were obtained from the baseline questionnaire. The sole participant missing data on cigarette smoking and alcohol consumption was categorized in the most common groups as a current smoker and a never drinker. For the interaction analyses, exercise was categorized as no exercise participation (0 MET h/week) and some exercise participation (>0 MET h/week). We determined history of diabetes mellitus and family history of colorectal cancer from the baseline questionnaire. Participants with missing data on family history of colorectal cancer (n = 36; 0.1 %) were assumed to have no such family history. Dietary characteristics of interest were red meat, total meat, and total energy intakes, which were all derived from the FFQ.
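The single-value imputation and the BMI dichotomy described above can be sketched as follows. This is an illustrative sketch only: the function names and toy values are hypothetical, and missing values are represented by `None` as an assumption about the data layout.

```python
from collections import Counter
from statistics import median


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2 from measured weight and height."""
    return weight_kg / height_m ** 2


def impute_most_common(values):
    """Replace missing categorical values with the most common category."""
    observed = [v for v in values if v is not None]
    most_common = Counter(observed).most_common(1)[0][0]
    return [most_common if v is None else v for v in values]


def impute_median(values):
    """Replace missing continuous values with the median of observed values."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def bmi_group(bmi):
    """Dichotomy used for the interaction analyses (cutoff 25.0 kg/m^2)."""
    return "overweight/obese" if bmi >= 25.0 else "underweight/normal weight"


# Toy data (hypothetical values).
education = impute_most_common(["high school", None, "college", "high school"])
bmi_values = impute_median([22.0, None, 24.0, 26.0])
groups = [bmi_group(b) for b in bmi_values]
print(education, bmi_values, groups)
```

Because at most 1.4 % of values were missing for any covariate, this simple approach is unlikely to differ much from more elaborate imputation, although that is a general expectation rather than something the text tests.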

Statistical analysis

We calculated age-adjusted descriptive statistics by colorectal cancer case status. We applied Cox proportional hazards regression analysis to derive the hazard ratios (HRs) and 95 % confidence intervals (95 % CIs) to estimate the relative risk of colorectal cancer by quantiles of total fruit, total vegetable, total combined fruit and vegetable, cruciferous vegetable, allium vegetable, green leafy vegetable, legume, other vegetable, citrus fruit, and watermelon intakes, with adjustment for age, total energy intake, and other potential confounders. In the Cox regression analysis, the entry time was defined as the age at which the participant was enrolled in the SMHS and the exit time was the age at which the participant developed incident colorectal cancer or was censored (i.e., at death, loss to follow-up, or on 31 December 2010, whichever occurred first). To evaluate linear trends, we entered the median level of intake for each fruit and/or vegetable category by quantile into the model as a continuous variable. We evaluated the proportional hazards assumption by including an interaction term between the fruit and/or vegetable categories with the logarithm of time. No significant interactions were observed, indicating that the proportional hazards assumption was not violated.
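To make the age-as-time-scale setup concrete, the sketch below evaluates a Cox partial log-likelihood in which each participant contributes to a risk set only between his entry age and exit age (delayed entry). It is a simplified single-covariate illustration with Breslow-style handling of ties, not the SAS procedure used for the analyses, and the records are invented.

```python
import math


def cox_partial_loglik(beta, records):
    """Partial log-likelihood with age as the time scale.

    records: iterable of (entry_age, exit_age, event, x), where event is 1
    for an incident case and 0 for a censored participant.
    """
    loglik = 0.0
    for _, case_exit, case_event, case_x in records:
        if not case_event:
            continue
        # Risk set: participants already enrolled and still under
        # observation at the age when this case was diagnosed.
        denom = sum(
            math.exp(beta * x)
            for entry, exit_age, _, x in records
            if entry < case_exit <= exit_age
        )
        loglik += beta * case_x - math.log(denom)
    return loglik


# Toy records: (entry age, exit age, event, exposure indicator).
toy = [
    (50, 55, 1, 1.0),
    (52, 60, 0, 0.0),
    (56, 62, 1, 0.0),  # enrolled at 56, so not at risk at age 55
    (40, 70, 0, 1.0),
]
ll_null = cox_partial_loglik(0.0, toy)
print(ll_null)
```

With `beta = 0` every member of a risk set has equal weight, so the value is minus the sum of the log risk-set sizes (here 3 and 2), which is a convenient check that delayed entry is handled as intended.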

To determine whether the association between the quantiles of intake and colorectal cancer risk was affected by undiagnosed or prevalent colorectal cancer, we repeated the initial analyses excluding the first year of follow-up. We also carried out analyses excluding participants who reported having a large increase or a large reduction in the intake of fruits and vegetables over the past 5 years. Since fruit and vegetable intake may differentially affect the risk of colon or rectal cancer by specific risk groups, we assessed interactions between the fruit and vegetable groups and smoking status (ever vs. never), BMI (overweight/obese vs. underweight/normal weight), and exercise participation (none vs. at least some) by including an interaction term in the Cox model for the occurrence of colon and rectal cancers. The interaction was tested using the likelihood ratio test. We also created stratified estimates for fruit and vegetable intake by smoking status, BMI, and exercise participation categories. In order to assess the potential linear association between fruits and/or vegetables and the risk of colorectal cancer, we analyzed fruit and vegetable intake using the original continuous data by 20 g/day increment in the Cox regression analysis. The 20 g/day increment was selected as a realistic change in intake for the various fruit and vegetable categories. We conducted penalized spline regression analysis to test nonlinearity of the associations. The Akaike information criterion method was used to select the appropriate degrees of freedom for the test of nonlinearity [18]. SAS 9.3 (Cary, NC) was used for all analyses except for the penalized spline models, which were created using R 2.15.1 (Vienna, Austria). Statistical significance was set as a two-sided p value <0.05.
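The likelihood ratio test for interaction compares the maximized log-likelihoods of nested models with and without the interaction term. A minimal sketch follows; the log-likelihood values are invented, and the degrees of freedom (1 for a single product term, 2 for a tertile-by-binary interaction coded with indicator variables) are assumptions, since the text does not state how the interaction term was coded.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or 2 (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are implemented in this sketch")


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (test statistic, p value) for two nested models."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical maximized log-likelihoods without/with the interaction term.
stat, p_interaction = likelihood_ratio_test(-100.0, -97.0, df=2)
print(round(stat, 2), round(p_interaction, 4))
```

For more than two degrees of freedom a library routine for the chi-square distribution would be needed; the closed forms are used here only to keep the example dependency-free.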

Results

After 390,688 person-years of follow-up and a median follow-up time of 6.3 years, 398 cases of colorectal cancer were observed. Of these cases, 236 cases were cancer of the colon and 162 cases were cancer of the rectum. Descriptive statistics by colorectal cancer case status are presented in Table 1. Age was highly associated with colorectal cancer case status (p < 0.01), with cases appreciably older than noncases. After adjustment for age, colorectal cancer cases were similar to noncases for mean consumption of fruits and vegetables and individual fruit and vegetable categories, education, income, occupation, cigarette smoking, alcohol consumption, exercise participation, total energy, red meat and total meat intake, history of diabetes, and family history of colorectal cancer (p > 0.05). However, colorectal cancer cases had a higher average BMI (24.24 vs. 23.72; p < 0.01) than noncases.

Table 1 Baseline characteristics by colorectal cancer case status of the SMHS participants (n = 61,274)

For the risk of colorectal, colon, and rectal cancers by categories of fruits and vegetables, many estimates were less than one, but few reached statistical significance. Similarly, most of the tests for trend were not statistically significant. An inverse association was observed between total combined fruit and vegetable intake and colorectal cancer with a potential dose–response effect (fifth vs. first quintile HR 0.71; 95 % CI 0.50, 1.01; p trend = 0.09), whereas there appeared to be no association between total vegetable intake and colorectal cancer (fifth vs. first quintile HR 1.00; 95 % CI 0.72, 1.41; p trend = 0.83). The associations between quintiles of fruit (fifth vs. first quintile HR 0.67; 95 % CI 0.48, 0.95; p trend = 0.03) and watermelon intake (third vs. first tertile HR 0.77; 95 % CI 0.59, 0.99; p trend = 0.04) with colorectal cancer risk reached statistical significance. The association between total combined fruit and vegetable intake and colon cancer (fifth vs. first quintile HR 0.69; 95 % CI 0.43, 1.09; p trend = 0.16) and total fruit intake and both colon (fifth vs. first quintile HR 0.76; 95 % CI 0.49, 1.20; p trend = 0.14) and rectal cancers (fifth vs. first quintile HR 0.56; 95 % CI 0.33, 0.97; p trend = 0.11) suggested a possible inverse dose–response association, but were not significant. In general, all categories of fruit (citrus fruits and watermelon) were inversely associated with colorectal, colon, and rectal cancers, whereas the legumes group was the only vegetable category which showed an inverse association with colorectal, colon, and rectal cancers (Table 2). Multivariable-adjusted models that excluded the first year of follow-up, in general, yielded similar results (results not shown), so the remaining analyses utilized data from all years of follow-up. After exclusion of participants who reported a substantial increase or decrease in the consumption of fruits and vegetables over the 5 years before the baseline interview, the pattern of the associations remained similar (results not shown).
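The link between a 95 % CI and statistical significance for a single contrast can be illustrated by recovering the standard error of the log HR from the reported interval. This sketch assumes a symmetric Wald interval on the log scale and works from rounded published values, so the resulting p values are approximate; they refer to the highest-versus-lowest contrast and are not the p values for trend reported above.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_ci(hr, lower, upper):
    """Approximate SE of log(HR), z statistic, and two-sided p from a 95 % CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Fifth vs. first quintile estimates for colorectal cancer, as reported.
_, _, p_fruit = wald_from_ci(0.67, 0.48, 0.95)      # total fruit
_, _, p_combined = wald_from_ci(0.71, 0.50, 1.01)   # fruit and vegetables
print(p_fruit < 0.05, p_combined < 0.05)
```

As expected, the interval that excludes 1 (total fruit) corresponds to p < 0.05 for that contrast, while the interval that just includes 1 (combined intake) does not.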

Table 2 Hazard ratios for associations between the intakes of various fruits and vegetables and colorectal cancer incidence in the SMHS (n = 61,274)

When fruit and vegetable consumption was analyzed continuously (for a 20 g/day change), a marginally significant inverse linear association was observed between fruit intake and colon cancer (HR 0.98; p = 0.06), fruit intake and rectal cancer (HR 0.97; p = 0.06), and watermelon intake and rectal cancer (HR 0.96; p = 0.06). A significant positive association was observed between allium vegetable intake and rectal cancer (HR 1.14; p = 0.04) (results not shown). Penalized spline models gave no indication for a nonlinear association for any of the fruit and vegetable categories (results not shown).
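Because the continuous Cox model is log-linear in intake, an HR expressed per 20 g/day can be re-expressed for another increment by exponentiation. The sketch below shows the arithmetic; the helper names are hypothetical, and the rescaled figure is an illustration based on a rounded published HR, not a result reported by the study.

```python
import math


def hr_per_increment(beta_per_gram, increment_g=20.0):
    """HR for a given intake increment from a per-gram log-hazard coefficient."""
    return math.exp(beta_per_gram * increment_g)


def rescale_hr(hr, from_increment_g, to_increment_g):
    """Re-express an HR for a different increment under log-linearity."""
    return hr ** (to_increment_g / from_increment_g)


# An HR of 0.98 per 20 g/day corresponds to 0.98**5 per 100 g/day.
hr_100g = rescale_hr(0.98, 20.0, 100.0)
print(round(hr_100g, 3))
```

This rescaling is only meaningful if the association is approximately linear on the log-hazard scale, which is the assumption that the penalized spline analysis was intended to check.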

Statistical interactions were observed for the risk of colon cancer between allium vegetables and BMI (inverse association only for overweight/obese individuals; p interaction = 0.03), citrus fruits and exercise participation (inverse association mainly among individuals with no exercise participation; p interaction = 0.02), and green leafy vegetables and exercise participation (inverse association only among individuals with at least some exercise participation; p interaction < 0.01). For the risk of rectal cancer, statistical interactions were observed between watermelon and BMI (inverse association only for overweight/obese individuals; p interaction = 0.03), allium vegetables and exercise participation (inverse association mainly among individuals with at least some exercise participation; p interaction = 0.05), and citrus fruits and exercise participation (inverse association only among individuals with at least some exercise participation; p interaction = 0.05) (results not shown). When total fruit and vegetable intakes were stratified by BMI, physical activity, and smoking status, fruit intake showed an inverse association with the risk of rectal cancer, but only among overweight or obese participants (third vs. first tertile HR 0.28; 95 % CI 0.13, 0.60). Total combined fruit and vegetable intake also appeared to have an inverse association with rectal cancer only among individuals with at least some exercise participation (third vs. first tertile HR 0.54; 95 % CI 0.29, 1.02), while total fruit intake had an inverse association with colon cancer risk only among ever smokers (third vs. first tertile HR 0.59; 95 % CI 0.37, 0.95) (Table 3).

Table 3 HRs stratified by BMI, exercise participation, and smoking for associations between fruits and vegetables with colon and rectal cancer incidence in the SMHS (n = 61,274)

Discussion

In this prospective cohort study of men in Shanghai, China, we found an inverse association between fruit intake and the risk of colorectal, colon, and rectal cancers. There was little evidence for an association between total vegetable intake and colorectal cancer, although an inverse association was observed for the intake of legumes. When data from the first year of follow-up or participants who reported a large change in fruit or vegetable intake were excluded, the estimates of the association patterns were largely unchanged. Some statistical interactions were observed between the fruit and vegetable categories with BMI, smoking, and exercise participation, but these findings should be interpreted with caution as they may have resulted from multiple comparisons.

A recent meta-analysis, which included 22 publications, all of which were cohort studies, calculated summary relative risk estimates (RR) of 0.92 (95 % CI 0.86, 0.99) for colorectal cancer, 0.91 (95 % CI 0.84, 0.99) for colon cancer, and 0.97 (95 % CI 0.86, 1.09) for rectal cancer for the association between the highest and the lowest categories of intake of total combined fruit and vegetable intake. These estimates were similar for fruit and vegetable intakes considered separately. When the data were stratified by the geographic location of the studies, the summary RRs were 1.17 (95 % CI 0.94, 1.45) for total combined fruit and vegetable intake, 1.00 (95 % CI 0.79, 1.28) for total fruit intake, and 1.02 (95 % CI 0.89, 1.18) for total vegetable intake in Asian studies [13]. The null finding of our study for vegetable intake, thus, is in general agreement with findings from these Asian studies [19–22]. The meta-analysis also found an indication of a nonlinear inverse association between fruit and vegetable intake with colorectal cancer where the risk reduction was strongest for increases from very low levels of fruit and vegetable intake [13]. Our population, like many other Asian populations, consumes fairly high levels of vegetables, with a mean of approximately 344 g/day (inter-quartile range 212.6–429.4 g/day), which may explain why we did not find a significant inverse effect in our study, since our study had very few participants who consumed low levels of vegetables. In comparison, a randomly selected subcohort of men in a Dutch cohort study reported consuming 187.1 g vegetables/day [10]. Additionally, the length of follow-up for this study was also not as long as some of the studies included in the meta-analysis [13].

For the associations between subgroups of fruit and vegetables and colorectal, colon, and rectal cancers, the results from previous studies have been inconsistent. A number of studies did not find any independent associations between cruciferous vegetable intake and colorectal cancer risk [8, 11, 23–28]; however, a recent meta-analysis found a significant inverse association with a pooled relative risk of 0.82 (95 % CI 0.75, 0.90) for the highest versus the lowest category of intake [29]. Similarly, a meta-analysis found that increased consumption of garlic, an allium vegetable, significantly decreased the risk of colorectal cancer with a pooled relative risk for the highest versus the lowest category of intake of 0.66 (95 % CI 0.48, 0.91) [30]. However, the majority of studies included were of case–control design, and therefore, the pooled estimate may have been affected by recall bias. Moreover, a recent case–control study did not observe a significant association between garlic intake and colorectal cancer risk [8]. Similarly, no association between onions or leeks, which are allium vegetables, and the risk of colon or rectal cancers was observed in a prospective cohort study [31]. No consistent association has been observed between legumes or green leafy vegetables and the risk of colorectal cancer [8, 10, 12, 27, 28], although a few studies have observed an inverse association for one or both of these vegetable categories [8, 11, 28]. Citrus fruit has also not been strongly associated with the rate of colorectal cancer [8, 10, 11, 27, 28]. Few studies individually assessed the association between watermelon intake and colorectal cancer, although the association with lycopene, which is found mainly in tomatoes but is also found in watermelon, has been inconsistent [32–34].

Our study is not without limitations. First, all of the fruit and vegetable intakes were assessed using an FFQ, which may not accurately estimate the actual amount of dietary intake. However, in a validation study, the FFQ tended to be relatively accurate for fruit intake with some overestimation for the intake of vegetables [16], and FFQs are generally useful for ranking intake, which was our main analytic technique in this analysis. We excluded participants who had extreme energy intake in order to remove participants who may not have been accurately reporting nutritional intake. Second, this study was underpowered to detect modest associations. However, the analyses treating fruit and vegetable intake as a continuous variable, which tend to have more power, yielded results similar to those of the categorical analysis. Finally, although we adjusted for a number of confounders, we cannot rule out residual confounding by unmeasured or unadjusted factors.

This study has a number of important strengths. First, the SMHS is a rigorously designed cohort study with high participation and retention rates. Second, all covariates used in our analyses were assessed prior to the development of any cancer, thereby decreasing the potential for misclassification bias. Third, we determined that prevalent cancer was unlikely to have affected the results because after excluding the first year of follow-up, our results were unchanged. Finally, the many secondary analyses that we conducted yielded similar results, which suggests that our findings are robust.

In conclusion, we found that fruit consumption was inversely associated with the risk of colorectal cancer while vegetable intake was largely unrelated to colorectal cancer risk. Given that few individuals consumed low levels of vegetables in our and other Asian studies, pooling data from studies within Asian populations may be necessary to clarify the effect of low vegetable intake on colorectal cancer risk. Additionally, effect modification by other risk factors, such as BMI, exercise participation, and smoking should be considered for comparison with our findings, particularly in Asian populations.