Introduction

Colorectal cancer is a global public health problem, with an estimated 1.4 million cases diagnosed worldwide in 2012. It was the third most commonly diagnosed cancer in men and the second most commonly diagnosed cancer in women globally in 2012 [1]. There is evidence that diet and lifestyle changes may play an important role in the primary prevention of colorectal cancer [2,3,4]. Several individual foods and nutrients have been widely studied in relation to colorectal cancer risk. For example, higher intake of whole grains, foods containing dietary fiber, calcium supplements, and dairy products has been shown to be protective, while higher intake of red meat, processed meat, and alcoholic drinks is associated with higher risk [4]. However, given the complex interaction of multiple foods and nutrients in diet, estimating the effect of an individual dietary factor is challenging; foods are generally eaten in combination, and changes in the intake of one food or nutrient are likely associated with changes in the intake of other foods and nutrients. Accounting for this complex interaction is difficult in studies of single dietary factors; therefore, the examination of dietary patterns in relation to disease outcomes is an important complementary approach [5].

More recent studies in nutritional epidemiology have adopted the dietary patterns approach, which describes the overall diet, including foods, food groups, and nutrients, as well as their combination, variety, frequency, and quantity of habitual consumption [5]. However, the colorectal cancer section of the World Cancer Research Fund and the American Institute for Cancer Research Continuous Update Project Report published in August 2017 concluded that although there is convincing or probable evidence for the association of several individual dietary factors and colorectal cancer risk, the evidence for the association between dietary patterns and colorectal cancer risk is limited and inconclusive [4]. Most previous reviews and meta-analyses of the association between dietary patterns and colorectal cancer risk included relatively few studies (ranging from 6 to 33), focused on specific definitions of dietary patterns (index-based or empirically derived patterns; more details in the Methods section), and did not consider the diversity of the international populations [6,7,8,9,10]. A critical synthesis of the component foods in the identified dietary patterns was also lacking in previous reviews and meta-analyses. Additionally, several new original studies have been published since the previous reviews and meta-analyses.

The objectives of the present review were threefold: first, we synthesized data from studies published over the 17-year period from 2000 to 2016, including cohort and case-control studies using index-based and empirically derived dietary patterns. Second, we further synthesized the food components of the index-based dietary patterns and the empirically derived patterns to identify foods that may be common across different dietary patterns; and third, we examined trends by sex, study design, region of the world where the dietary pattern was derived, and by anatomic subsite of cancer (proximal colon, distal colon, and rectum).

Methods

Article Search Strategy

We conducted a literature search in the PubMed database for articles published between January 2000 and February 2017, an extended period compared to most previous reviews and meta-analyses. Few studies of the association between dietary patterns and colorectal cancer risk were published prior to 2000. We used the following search terms, individually and in combination: dietary patterns, dietary quality, food patterns, dietary score, dietary index, healthy eating index, alternative healthy eating index, Mediterranean dietary score, alternative Mediterranean dietary score, dietary approaches to stop hypertension, dietary inflammatory index, factor analysis, principal components analysis, cluster analysis, healthy dietary pattern, prudent dietary pattern, Western dietary pattern, colorectal cancer, colorectal neoplasm, colon cancer, colon neoplasm, rectal cancer, and rectal neoplasm. Additionally, we searched the reference lists of the articles obtained to identify other pertinent articles. We included articles with colorectal, colon, and/or rectal cancers as study outcomes. Though colorectal adenomas are known precursors of colorectal cancer [11], the outcome of interest in the current review was colorectal cancer; we therefore did not include studies with adenomas as an outcome but refer readers to two recent publications that focused on adenoma as the outcome [12, 13].

We identified and reviewed a total of 49 original publications. The article selection process is outlined in the PRISMA (preferred reporting items for systematic reviews and meta-analyses) flow chart in Fig. 1. The information extracted from each study is presented in Supplemental Tables 1 and 2. Most studies derived dietary patterns using dietary data from food frequency questionnaires and few studies used diet history questionnaires. Studies are divided into two main categories per the method of deriving dietary patterns: index-based or a priori (Supplemental Table 1) and empirically derived or a posteriori (Supplemental Table 2). Studies within each table are further divided by study design, into prospective cohort studies and case-control studies.

Fig. 1 Flow chart of the article selection process

Dietary Patterns Derivation Methods

There are three major approaches for deriving dietary patterns: (i) index-based or a priori, (ii) empirically derived or a posteriori, and (iii) empirical hypothesis-oriented, which combines features of a priori and a posteriori approaches. Index-based or a priori dietary patterns are derived based on existing scientific evidence linking diet and disease risk. A priori dietary patterns generally take the form of dietary indices constructed based on dietary recommendations or expert synthesis of current scientific evidence on diet and disease risk. The studies reviewed used one or more of the following dietary indices: healthy eating index (HEI-2005 and HEI-2010; no study applied the HEI-2015 in the period covered by this review) [14], alternative healthy eating index (AHEI-2010) [15], several versions of the Mediterranean dietary pattern score [16, 17], dietary approaches to stop hypertension (DASH) [18], dietary inflammatory index (DII) [19], adherence score to the World Cancer Research Fund/American Institute for Cancer Research 2007 cancer prevention recommendations (WCRF/AICR) [20], and recommended food score [21]. The derivation of each dietary index is described separately in the Results section.

A posteriori dietary patterns are empirically derived, using statistical exploratory methods for data reduction such as exploratory factor analysis (principal components analysis—PCA) and cluster analysis. The objective of empirically derived dietary patterns is to reveal unobserved dietary profiles that are associated favorably or unfavorably with disease risk in a given study population. The most commonly used method, factor analysis, examines the correlation matrix of food variables in search of underlying traits that explain most of the variation in the data, thus reducing a large number of food variables to a smaller set that captures the major dietary factors in the population [22]. The identified factors are usually orthogonally rotated rendering them statistically uncorrelated with each other. Scores are then obtained to rank individuals based on their level of intake of a specific factor.
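The data-reduction idea can be illustrated with a minimal sketch. It assumes hypothetical intake values for four food variables, extracts only the first principal component of their correlation matrix by power iteration, and omits the orthogonal rotation step mentioned above; the variable names and data are illustrative and are not taken from any reviewed study.

```python
import math

# Hypothetical weekly servings for six individuals (illustrative values only).
foods = {
    "vegetables":     [5, 4, 5, 1, 2, 1],
    "fruit":          [4, 5, 5, 2, 1, 1],
    "red_meat":       [1, 2, 1, 5, 4, 5],
    "processed_meat": [1, 1, 2, 4, 5, 5],
}

def standardize(values):
    """Convert one food variable to z-scores (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / sd for x in values]

z = [standardize(v) for v in foods.values()]
n = len(z[0])

# Correlation matrix of the food variables.
corr = [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def dominant_component(matrix, iterations=200):
    """Power iteration: eigenvector with the largest eigenvalue of `matrix`."""
    v = [1.0] + [0.0] * (len(matrix) - 1)  # start on the first food variable
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        vi * sum(m * x for m, x in zip(row, v)) for vi, row in zip(v, matrix)
    )
    return v, eigenvalue

weights, eigenvalue = dominant_component(corr)

# Pattern score per individual: weighted sum of standardized intakes.
scores = [sum(w * zc[i] for w, zc in zip(weights, z)) for i in range(n)]
```

In this toy data set the first component weights vegetables and fruit positively and the two meat variables negatively, so individuals are ranked along a single contrast between the two groups of foods.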

In contrast, cluster analysis aggregates individuals (not foods) in a multidimensional space based on the intake of numerous foods, leading to discrete, nonoverlapping clusters which capture the greatest number of subjects, but into which some individuals may not be included. There is variability between groups (clusters) of individuals but not among individuals in the same cluster who may have somewhat different diets [22]. Dietary patterns derived using factor analysis are more popular than patterns from cluster analysis; e.g., of the 25 studies that used a posteriori dietary patterns, only two derived patterns using cluster analysis.
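As a contrast with the factor approach, the following sketch groups individuals rather than foods. It assumes k-means as the clustering algorithm (one common choice; the reviewed studies may have used other procedures), hypothetical intake data, and hand-picked starting centroids. Unlike the description above, k-means assigns every individual to a cluster.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two intake vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centroids, max_iterations=100):
    """Alternate nearest-centroid assignment and centroid updates."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the algorithm has converged
        assignments = new_assignments
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical daily servings per individual: [vegetables, red_meat].
intakes = [[5, 1], [4, 1], [5, 2], [1, 5], [2, 4], [1, 4]]

# Assumption: the first and fourth individuals seed the two clusters.
clusters, centers = k_means(intakes, [intakes[0], intakes[3]])
```

Each individual ends up in exactly one cluster, which is what makes the resulting patterns discrete and nonoverlapping.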

The empirical hypothesis-oriented methods are an emerging approach for creating dietary patterns. The approach applies statistical exploratory data reduction methods in a given study population (similar to a posteriori patterns) based on a specific scientific hypothesis linking dietary behavior and disease risk [23,24,25]. A score is developed as the weighted sum of intakes of the individual foods in the pattern predictive of biomarkers of the hypothesized biological pathway. The validity of the score relative to the underlying hypothesis is evaluated in independent study populations, and the dietary pattern score is then derived and used in different study populations (in the same manner as a priori patterns) to examine its association with disease risk [26••, 27••]. For example, Tabung et al. used reduced rank regression [28] to develop an empirical dietary inflammatory pattern (EDIP) score [26••] and stepwise linear regression analyses to develop empirical indices to assess the insulinemic potential of diet and lifestyle [27••]. These indices may then be used to examine associations with diseases whose development is hypothesized to be mediated through the inflammatory or insulin response pathways, respectively. These empirical hypothesis-oriented indices may be applied in a standardized manner across different populations.

Results

Supplemental Table 1: Findings from Index-Based or A Priori Dietary Patterns

During the 17-year period covered by the current review, 24 (17 cohort and 7 case-control) studies examined the association between dietary patterns derived using a priori diet quality indices and development of colorectal cancer. Findings from these studies are summarized in Supplemental Table 1. The indices included the following: HEI in three studies [29,30,31]; AHEI in two studies [29, 30]; DASH in three studies [29, 32, 33•]; several versions of the Mediterranean dietary pattern in eight studies [29, 30, 32, 34,35,36,37,38]; healthy Nordic food index in one study [39]; recommended food scores in two studies [30, 40]; WCRF/AICR score in four studies [2, 41•, 42, 43]; and dietary inflammatory index in seven studies [44, 45•, 46,47,48,49,50]. We included in this category a study that derived four vegetarian dietary patterns based on a priori criteria [51]. A common feature of most of the a priori patterns is that they emphasize higher intake of fruit, vegetables, nuts and legumes, whole grains, low-fat dairy products, and fish and other seafood, while rewarding lower intake of red and processed meat, sugar-sweetened beverages, alcoholic beverages, and table salt (Table 1).

Table 1 Summary of index components that are common across most of the dietary indices

Healthy Eating Index

The HEI is a measure of diet quality as described by the key dietary recommendations of the Dietary Guidelines for Americans, which are updated every 5 years. Higher scores (range 0–100) indicate higher diet quality [14]. Higher energy-adjusted intakes of fruit, vegetables, legumes, olive oil, whole grains, low-fat dairy products, and lean meat receive higher index scores, whereas higher energy-adjusted intakes of sodium, saturated fat, solid fat, alcoholic beverages, and added sugar result in lower scores.

Vargas et al. analyzed data from the Women’s Health Initiative (WHI), a large US-based cohort of 161,808 postmenopausal women 50–79 years old at enrollment in 40 centers across the USA. They found a 27% lower risk of colorectal cancer, comparing women in the highest to those in the lowest HEI-2010 quintile. Findings for overall colon cancer (no distinction provided between proximal or distal colon) were similar to those for colorectal cancer but there was no significant association for rectal cancer [29]. Reedy et al. used data from the NIH-AARP Diet and Health Study, a large cohort (n = 567,169) of men and women aged 50 to 71 years residing in eight states within the USA. They found a 20% lower risk in women and a 28% lower risk in men, comparing individuals in the highest to those in the lowest HEI-2005 quintile. The inverse association was stronger for distal colon cancer in men and women and for rectal cancer in men, but there was no association with proximal colon cancer risk in men and women, or with rectal cancer risk in women [30]. In a population-based case-control study with 431 colorectal cancer cases and 726 controls resident in an area comprised of 19 counties in Pennsylvania, Miller et al. reported a 44% lower risk in men and a 56% lower risk in women, comparing participants in extreme quartiles of the HEI-2005. The authors did not examine risk separately for each anatomic location of colorectal cancer [31].

Development of the AHEI-2010, a modified version of the HEI-2010, was based on a comprehensive review of the relevant literature to identify foods and nutrients that had been associated consistently with risk of chronic diseases in clinical and epidemiological investigations, including information from the original AHEI. The AHEI-2010 includes 11 components and the total score ranges from 0 (nonadherence) to 110 (perfect adherence) [15]. Higher intakes of fruit, vegetables, whole grains, nuts and legumes, polyunsaturated fat, and long chain omega-3 fat, as well as moderate alcohol intake, receive higher index scores, whereas higher intakes of sodium, sugar-sweetened beverages and fruit juice, red/processed meat, and trans fat, as well as no alcohol or high alcohol intake, result in lower scores. The HEI-2010 and AHEI-2010 differ in the way components are scored. While some components are specific to each index, both indices encourage high intake of fruits, vegetables, legumes, and whole grains, and low intake of sodium, saturated fat, trans fat, and red/processed meat.

The two studies that calculated AHEI-2010 scores were prospective cohort studies and reported mixed results. Vargas et al. found inverse but nonsignificant associations between AHEI-2010 scores and CRC risk among the women in the WHI (HR 0.86; 95% CI 0.70, 1.07). They also found no association with colon or rectal cancer risk [29]. In contrast, Reedy et al. used the original AHEI in the NIH-AARP study and found a significant 29% lower risk in men (HR 0.71; 95% CI 0.61, 0.82) and a nonsignificant 17% lower risk in women (HR 0.83; 95% CI 0.66, 1.05). The inverse association was stronger for distal colon cancer and rectal cancer risk in men. AHEI scores were not associated with proximal colon cancer in men and women, or with distal colon cancer and rectal cancer in women [30].
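The percentages quoted throughout this review follow directly from the reported ratios: a hazard ratio (HR) below 1 corresponds to a (1 − HR) × 100% lower risk, and a 95% confidence interval that excludes 1 corresponds to statistical significance at the 5% level. The short sketch below applies this arithmetic to the estimates quoted above; the function names are illustrative.

```python
def percent_change_in_risk(ratio):
    """Percent change implied by a hazard or odds ratio (negative = lower risk)."""
    return (ratio - 1.0) * 100.0

def excludes_null(ci_lower, ci_upper):
    """True when the confidence interval does not contain the null value of 1."""
    return not (ci_lower <= 1.0 <= ci_upper)

# Estimates quoted in the text: (label, HR, lower 95% limit, upper 95% limit).
estimates = [
    ("Reedy et al., men", 0.71, 0.61, 0.82),
    ("Reedy et al., women", 0.83, 0.66, 1.05),
    ("Vargas et al., women", 0.86, 0.70, 1.07),
]

summary = [
    (label, round(percent_change_in_risk(hr)), excludes_null(lo, hi))
    for label, hr, lo, hi in estimates
]

for label, change, significant in summary:
    print(f"{label}: {change}% change, significant={significant}")
```

The same conversion applies to the odds ratios reported by the case-control studies.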

Dietary Approaches to Stop Hypertension (DASH)

The DASH score comprises eight components. In the scoring approach developed by Fung et al., points are awarded according to quintile ranking of participants: higher intake of fruit, vegetables, nuts and legumes, low-fat dairy products, and whole grains is rewarded, as is lower intake of sodium, sweetened beverages, and red and processed meats [18]. According to this scoring approach, the total possible score range is 8–40 [18]. Several other DASH scoring approaches have been developed and have yielded similar results for colorectal cancer risk [33•]. Results for lower risk of colorectal cancer were consistent among the three studies that calculated DASH scores [29, 32, 33•].
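A quintile-based score of this kind can be sketched as follows. The sketch assumes 1 to 5 points per component (which is consistent with the stated 8–40 range), reverse scoring for the three components whose lower intake is rewarded, and a simple rank-based quintile assignment that ignores ties; the component names and intake values are hypothetical.

```python
# Components rewarded for higher intake and for lower intake, respectively.
REWARD_HIGH = ["fruit", "vegetables", "nuts_legumes", "low_fat_dairy", "whole_grains"]
REWARD_LOW = ["sodium", "sweetened_beverages", "red_processed_meat"]

def quintile_ranks(values):
    """Return a quintile (1..5) for each value by rank; ties follow input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = position * 5 // n + 1
    return ranks

def dash_scores(intakes):
    """intakes maps each component name to a list of intakes, one per participant."""
    n = len(next(iter(intakes.values())))
    totals = [0] * n
    for component in REWARD_HIGH:
        for i, quintile in enumerate(quintile_ranks(intakes[component])):
            totals[i] += quintile          # highest quintile earns 5 points
    for component in REWARD_LOW:
        for i, quintile in enumerate(quintile_ranks(intakes[component])):
            totals[i] += 6 - quintile      # lowest quintile earns 5 points
    return totals

# Hypothetical intakes for five participants (illustrative values only).
example = {name: [1, 2, 3, 4, 5] for name in REWARD_HIGH}
example.update({name: [5, 4, 3, 2, 1] for name in REWARD_LOW})

scores = dash_scores(example)
```

With these toy data the first participant sits in the least favorable quintile of every component and scores 8, while the last participant scores the maximum of 40.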

Vargas et al. found a 20% lower risk of colorectal cancer in the WHI. Higher DASH scores were associated with lower colon cancer risk but not with rectal cancer risk [29]. Fung et al. reported a 19% lower risk for men in the Health Professionals Follow-up Study (HPFS) and a 20% lower risk for women in the Nurses' Health Study (NHS). The NHS and HPFS are ongoing cohorts in which dietary and other lifestyle data are collected every 2 to 4 years. The NHS (n = 121,701) enrolled female registered nurses aged 30–55 years in 1976, whereas the HPFS (n = 51,529) enrolled male health professionals aged 40–75 years in 1986. The inverse associations for colon and rectal cancers in men and for rectal cancer in women were not statistically significant, but when data for men and women were pooled, these associations all became statistically significant [32]. Miller et al. used data from the NIH-AARP Diet and Health Study to compare the associations of four DASH scores calculated using four different scoring approaches developed by Mellen, Dixon, Fung, and Gunther [33•]. They found that higher scores on all four DASH indices were consistently associated with lower colorectal cancer risk, except that there was no association among women for the Dixon scoring approach [33•].

Mediterranean Dietary Pattern Scores

There are several versions of the Mediterranean dietary pattern score, including the original pattern [52], the alternative Mediterranean dietary pattern score [17], the Italian pattern [36], and the Greek pattern [38]. The typical Mediterranean dietary pattern assesses nine components for a total of 9 points: vegetables, legumes, fruit and nuts, dairy products, cereals, meat and meat products, fish, alcohol, and the ratio of MUFA/SFA. While the food components are largely similar among the different versions, some investigators derive the pattern scores by awarding scores ranging from 0 to 5 or the inverse, for each component [36,37,38], whereas others award 0 or 1 point based on median intake of the food component in a given population [18, 30]. For most of these components, higher intake is rewarded, whereas for dairy products, meat, and meat products, lower intake is rewarded. Findings for the Mediterranean dietary patterns were not consistent across the several studies that examined these patterns in relation to colorectal cancer risk [18, 29, 30, 34,35,36,37,38].

Vargas et al. looked at women in the WHI and found no association (HR 0.91; 95% CI 0.74, 1.11) comparing extreme quintiles of the alternative Mediterranean dietary pattern score [29]. Similarly, Fung et al. found no association in men in the HPFS (HR 0.88; 95% CI 0.71, 1.09) and in women in the NHS (HR 0.89; 95% CI 0.77, 1.01) comparing extreme quintiles of the alternative Mediterranean dietary pattern score [18]. In both Vargas et al. and Fung et al. studies, findings for colon and rectal cancers were similar to those for overall colorectal cancer.

In contrast, Reedy et al. found a 28% lower risk of colorectal cancer in men with highest adherence to Mediterranean dietary pattern score in the NIH-AARP study but no significant association in women [30]. Also, Bamia et al. used data from the European Prospective Investigation into Cancer and Nutrition (EPIC) in ten European countries and calculated overall and center-specific Mediterranean dietary pattern (MED) scores. They found a decreased risk of colorectal cancer, of 8% and 11% when comparing the highest (scores 6–9) with the lowest (scores 0–3) adherence to center-specific and overall scores respectively. For the overall score, the HR was 0.89 (95% CI 0.80, 0.99). These associations were somewhat more evident among women than men, and were mainly manifested for colon cancer risk [35]. Agnoli et al. also calculated MED scores in a smaller sample of participants in the Italian section of EPIC. Higher MED scores were inversely associated with colorectal cancer risk (HR 0.50; 95% CI 0.35–0.71 for the highest category compared to the lowest, P-trend = 0.04), and results did not differ by sex. Highest MED score was also significantly associated with reduced risk of distal colon cancer (HR 0.44, 95% CI 0.26–0.75) and rectal cancer (HR 0.41, 95% CI 0.20–0.81) but not of proximal colon cancer [34].

Three hospital-based case-control studies, two Italian and one Greek, calculated Mediterranean dietary pattern scores and examined associations with colorectal cancer risk [36,37,38]. All three studies reported lower risk of colorectal cancer, colon cancer, and rectal cancer with higher adherence to the Mediterranean dietary pattern [36,37,38]. For example, Rosato et al. found a 48% lower risk of colorectal cancer in a study with 3745 colorectal cancer cases and 6804 matched controls [36]. In another study with 338 colorectal cancer cases and 676 matched controls, Grosso et al. reported a 54% lower risk of colorectal cancer for participants with high scores compared to those with low scores [37]. The case-control study by Kontou et al. used 250 colorectal cancer cases and 250 matched controls and found a 13% lower odds per unit (score range 0–55) increase in Mediterranean dietary pattern score (OR 0.87; 95% CI 0.82, 0.92) [38]. These three studies did not examine risk by anatomic location of the cancer within the colon or by sex.

The Recommended Food Score (RFS)

The RFS comprises 23 food items: apple/pear, cantaloupe, orange, grapefruit, orange/grapefruit juice, other fruit juices, dried beans, tomatoes, mustard/turnip/collard greens, broccoli, spinach, carrots or mixed vegetables with carrots, green salad, sweet potatoes, yams or other potatoes, baked or stewed chicken/turkey, baked/broiled fish, dark breads such as whole wheat, rye, or pumpernickel, cornbread, tortillas/grits, high-fiber cereals, such as bran, granola or shredded wheat, cooked cereals, 2% milk/beverages with 2% milk, and 1% milk/skimmed milk. One point is awarded for each of these food items consumed at least once a week, and the points are summed to create the RFS, with a maximum score of 23 [21]. We identified two studies that used the RFS to examine its association with colorectal cancer risk. In a subset of 37,135 women enrolled in the Breast Cancer Detection Demonstration Project (BCDDP) follow-up cohort, Mai et al. found no association between the RFS and colorectal cancer risk (HR 0.94; 95% CI 0.69, 1.27) comparing extreme RFS quartiles [40]. Similarly, Reedy et al. used data from the NIH-AARP study and found no association in women (HR 1.01; 95% CI 0.80, 1.28). However, in men, they reported a 25% lower risk of colorectal cancer, comparing extreme RFS quintiles [30].
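The counting rule behind the RFS is simple enough to sketch directly. The example below uses an abbreviated, hypothetical subset of the item list and a made-up frequency record; only the rule itself (one point per listed item consumed at least once a week) comes from the description above.

```python
# Abbreviated, illustrative subset of the 23 RFS items (names are hypothetical keys).
RFS_ITEMS = ["broccoli", "spinach", "carrots", "baked_broiled_fish", "high_fiber_cereal"]

def recommended_food_score(weekly_frequency, items=RFS_ITEMS):
    """Count the listed items that are consumed at least once per week."""
    return sum(1 for item in items if weekly_frequency.get(item, 0) >= 1)

# Hypothetical participant: times per week each food is eaten.
participant = {
    "broccoli": 2,
    "spinach": 0.5,        # less than weekly, so no point
    "carrots": 1,
    "baked_broiled_fish": 0,
    "white_bread": 7,      # not an RFS item, so it is ignored
}

score = recommended_food_score(participant)
```

Because each item contributes at most one point, the score reflects the variety of recommended foods eaten rather than the quantity.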

World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) Cancer Prevention Recommendations

In 2007 the World Cancer Research Fund and the American Institute for Cancer Research (WCRF/AICR) issued ten recommendations for cancer prevention based on the most comprehensive collection of available evidence. These recommendations are in relation to diet, physical activity, body weight, foods and drinks that promote weight gain, plant foods, animal foods, alcoholic drinks, food preservation, use of dietary supplements, and breastfeeding [3]. Romaguera et al. constructed an adherence score to these recommendations and tested its association with colorectal cancer risk using data from the EPIC cohort. EPIC is a large follow-up study involving 521,000 men and women from ten European countries [53]. After a median 11 years of follow-up, almost 37,000 colorectal cancer cases were diagnosed among the 386,355 included men and women. Adherence to the recommendations was associated with a 27% lower risk of colorectal cancer (HR 0.73; 95% CI 0.65, 0.81) comparing the highest adherence category (5–6 for men/6–7 for women) to the lowest category (0–2 for men/0–3 for women) [2]. Using data from the Vitamins and Lifestyle (VITAL) study, a cohort study of dietary supplements and cancer risk, Hastert and White showed that adherence to these recommendations in adults followed for an average of 7.6 years was associated with a lower risk of colorectal cancer. Each 1-point increase in the score conferred a significant 34 to 58% lower risk for adhering to ≥ 1 recommendation or 5–6 recommendations respectively, compared to nonadherence. Corresponding results were 26 to 55% lower risk in women and 39 to 59% lower risk in men [41•]. Nomura et al. and Makarem et al. also calculated recommendation adherence scores in smaller cohort studies but found no association with colorectal cancer risk [42, 43]. None of the four studies examined risk by anatomic location of the cancer. It should be noted that these studies did not examine diet separately but as one of several recommendations, which also covered body weight and physical activity.

Dietary Inflammatory Index (DII)

The DII is a literature-derived nutrient-based index developed to summarize the association between dietary factors and inflammation biomarkers [19]. Details of the development of the DII have been described elsewhere [19]. Briefly, a systematic review of the literature on the relation between 45 dietary factors (mostly nutrients) and inflammation biomarkers was conducted through 2010, and 1943 articles were identified and scored. In scoring the articles, one of three possible values was assigned to each article based on the effect of the particular dietary factor on an inflammation biomarker: +1 if pro-inflammatory, 0 if no change in levels of the inflammation biomarker, and −1 if anti-inflammatory. These scores were then summed across all 45 dietary factors to constitute inflammatory effect scores (or literature-derived weights) to use in weighting actual dietary intake data in the process of calculating DII scores. The 45 DII components include 35 nutrients, green tea or black tea, garlic, onion, turmeric, thyme or oregano, hot pepper, rosemary, ginger, eugenol, and saffron. Higher (more positive) DII scores indicate pro-inflammatory diets, while lower (more negative) scores indicate anti-inflammatory diets [19]. Nutritional supplements influence DII scores given that the index consists mostly of nutrients; therefore, the source of dietary data for DII calculation (food sources and/or nutritional supplements) is important. Seven studies (four cohort [46,47,48, 54] and three case-control [44, 49, 50]) examined the association of the DII with colorectal cancer risk during the period covered by this review. Overall, results were consistent in that higher (more pro-inflammatory) DII scores were associated with higher risk of colorectal cancer, although there were differences based on whether the DII was calculated from food sources or from a combination of food and supplements. Tabung et al. used diet plus supplement data from the WHI cohort (postmenopausal women) to calculate DII scores and found a positive association between the DII and colorectal cancer risk after an average 11.3 years of follow-up (HR 1.22; 95% CI 1.05, 1.43) [47]. Wirth et al. also calculated DII scores from diet plus supplement data among men and women in the NIH-AARP study but did not find an association with colorectal cancer risk in women (HR 1.12; 95% CI 0.95, 1.31). They did, however, report a 44% higher risk in men and a 40% higher risk in men and women combined, comparing extreme DII quintiles [46].

Studies that have derived separate DII scores with and without inclusion of supplements have found positive associations for DII scores with supplements and no association for DII scores from diet-only sources [48, 54]. For example, in the Iowa Women’s Health Study, Shivappa et al. found a 20% higher risk of colorectal cancer (HR 1.20; 95% CI 1.01, 1.43) that became nonsignificant with the exclusion of supplements (HR 1.12; 95% CI 0.90, 1.38) [48]. However, in a large study conducted using data from the Multiethnic Cohort that included 190,963 men and women with 4388 colorectal cancer cases diagnosed in > 20 years of follow-up, Harmon et al. calculated the DII from diet-only sources and found significant associations as follows: 21% higher risk of colorectal cancer in men and women combined, 20% higher risk of colon cancer, 22% higher risk of rectal cancer, 28% higher colorectal cancer risk in men, and 16% higher colorectal cancer risk in women. Exclusion of cases diagnosed within 3 years from baseline did not materially change the results [45•]. Findings from the three case-control studies were consistent in that higher DII scores were associated with higher odds of colorectal cancer [44, 49, 50].

Other Dietary Patterns Derived Using A Priori Approaches

Additional dietary patterns derived using a priori approaches were the Healthy Nordic food index [39] and some vegetarian dietary patterns [51]. The healthy Nordic food index was based on traditional Scandinavian foods chosen a priori based on expected health benefits. These included fish, cabbage, rye bread, oatmeal, apples or pears, and root vegetables, for a maximum of 6 points. The scoring method used for deriving Mediterranean dietary pattern scores was applied [52], where 1 point was given for an intake equal to or greater than the sex-specific median for each food [39]. Kyro et al. included 55,880 participants (29,216 women and 26,664 men), from the Danish Diet, Cancer and Health prospective cohort study, and reported an inverse association (35% lower colorectal cancer risk) in women but not in men, comparing participants in the highest index category (5–6 points) with the lowest category (0–1 point). They did not observe significant associations for proximal colon, distal colon, or rectal cancer risk [39].
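The median-based scoring described for the healthy Nordic food index can be sketched as follows, assuming hypothetical participant records and intake values. Only the rule itself is taken from the text: 1 point for each of the six foods whose intake is at or above the sex-specific median, for a maximum of 6 points.

```python
from statistics import median

NORDIC_FOODS = ["fish", "cabbage", "rye_bread", "oatmeal", "apples_pears", "root_vegetables"]

def nordic_index(participants):
    """Score each participant 0..6 against the medians of their own sex."""
    medians = {}
    for sex in {p["sex"] for p in participants}:
        group = [p for p in participants if p["sex"] == sex]
        medians[sex] = {food: median(p[food] for p in group) for food in NORDIC_FOODS}
    return [
        sum(1 for food in NORDIC_FOODS if p[food] >= medians[p["sex"]][food])
        for p in participants
    ]

def person(sex, amount):
    """Hypothetical record with the same intake (g/day) for every index food."""
    return {"sex": sex, **{food: amount for food in NORDIC_FOODS}}

# Three women and two men with illustrative intakes.
cohort = [person("F", 1), person("F", 2), person("F", 3), person("M", 10), person("M", 20)]

index_scores = nordic_index(cohort)
```

Because the cut-points are sex-specific, a man and a woman with identical intakes can receive different scores.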

Orlich et al. categorized diet into four vegetarian dietary patterns (vegan, lacto-ovo vegetarian, pescovegetarian, and semivegetarian) and a nonvegetarian dietary pattern using data from the Adventist Health Study-2, which has a substantial proportion of vegetarians. The different vegetarian patterns were defined a priori according to the absence of intake of particular animal foods [51]. They found a 21% lower risk of colorectal cancer comparing all vegetarians to nonvegetarians. This association was driven largely by the pescovegetarian pattern, which showed a 42% lower risk (HR 0.58; 95% CI 0.40, 0.84). Pescovegetarians consumed fish ≥ 1 time/month but all other meats < 1 time/month. None of the other vegetarian patterns was significantly associated with colorectal cancer risk [51].

Summary of Findings from Index-Based or A Priori Dietary Patterns

Findings were remarkably consistent across the dietary indices, with greater adherence to recommendations or higher index scores associated with lower risk of developing colorectal cancer. However, findings differed by sex, anatomic subsite, study design, and region where the cohort was located. Associations were more consistently significant and stronger in men than women. Among the studies that reported results by subsite, more studies observed a significant relationship for colon cancer than rectal cancer. Findings also differed by study design, with a higher proportion of case-control studies reporting significant findings and with larger effect sizes compared to prospective cohort studies.

It is important to note the geographic region in which the indices were applied; the HEI, AHEI, RFS, and DASH scores were applied in cohort studies within the USA, whereas the healthy Nordic food index was applied in a European population. The DII, Mediterranean dietary pattern scores, and WCRF/AICR scores have been applied to populations in both the USA and Europe. Three cohort studies investigated the WCRF/AICR score in the USA, including one study that reported a significant 58% lower CRC risk, and one cohort study in Europe reported a 27% lower risk comparing the highest versus lowest score categories. The four studies that applied the DII in the USA were all cohort studies and reported between 16 and 40% higher risk of colorectal cancer with higher (more pro-inflammatory) scores, while three case-control studies in Europe and South Korea reported higher ORs, from 1.55 to 2.16. Most of these studies calculated DII scores from diet plus supplements. For the Mediterranean dietary pattern score, the three cohort studies in the USA found an 11 to 28% lower risk of colorectal cancer, and the three case-control studies in Europe found larger risk reductions, ranging from 13 to 54% lower odds of colorectal cancer, comparing the highest to the lowest score category. The differences in effect sizes between regions for the same study design or between study designs make direct comparisons of effect sizes challenging. Despite these differences, the consistently significant findings across regions, using different dietary indices, indicate that greater adherence to a “healthy” dietary pattern is associated with lower risk of colorectal cancer even though different geographical regions consume different foods in different amounts.

Supplemental Table 2: Findings from Empirically Derived or A Posteriori Dietary Patterns

We identified 25 studies (11 cohort and 14 case-control) published between January 2000 and February 2017 that used a posteriori or data-driven approaches to derive dietary patterns and evaluate the association between the patterns and risk of developing colorectal cancer. Findings from these studies are summarized in Supplemental Table 2. Two major dietary patterns emerged from our analysis of the foods comprising the dietary patterns derived using the exploratory factor analysis method employed in most of the studies: a “healthy” pattern and an “unhealthy” pattern. Major food components of these two patterns are summarized in Table 2 across ten studies from five world regions: North America, South America, Europe, Asia, and the Middle East. The major food groups in the healthy pattern included fruits and vegetables, nuts and legumes, milk and other dairy products, and some fish/seafood and poultry (Table 2). Overall, findings indicated that higher intake of the healthy pattern was associated with lower risk of colorectal cancer. However, there were differences between prospective and case-control studies, and between prospective studies with large and small numbers of colorectal cancer cases.

Table 2 Summary of major food groups common in most PCA-derived dietary patterns across the world

“Healthy” Dietary Pattern

Nearly all (10 out of 12) case-control studies that derived the healthy pattern reported an inverse association with colorectal cancer, with odds ratios corresponding to 45 to 84% lower odds of colorectal cancer. Most of these studies did not conduct separate analyses by anatomic location of colorectal cancer. The two case-control studies that did not find an association were a hospital-based study by Tayyem et al. in Jordan (280 CRC cases and 281 matched controls; OR, 0.93; 95% CI; 0.56, 1.53) [65] and a community-based study by Kurotani et al. in Japan (800 CRC cases and 775 matched controls; OR, 0.79; 95% CI; 0.58, 1.08) [46] (Supplemental Table 2).

Of the ten prospective studies that used PCA-derived dietary patterns, four reported inverse associations between higher intake of the healthy pattern and colorectal cancer risk [55, 66–68], ranging from 14 to 24% lower risk. In studies that stratified analyses by sex, the inverse association was stronger in men than in women. Nearly all the prospective studies that reported null findings had a small number of colorectal cancer cases (ranging from 172 to 460) [60, 62, 65, 69]. However, two fairly large studies in Europe and Singapore also reported null findings [59, 70] (Supplemental Table 2).

“Unhealthy” Dietary Pattern

The unhealthy dietary pattern was characterized by high intake of red and processed meat, sugar-sweetened beverages, refined grains and desserts, and potatoes (Table 2). Overall, results showed that higher intake of the unhealthy dietary pattern was associated with significantly higher risk of colorectal cancer. The trend of results for the unhealthy pattern was similar (though opposite) to that described for the healthy pattern. However, there were differences between prospective and case-control studies.

Eleven of the 13 case-control studies that derived an unhealthy pattern reported positive associations with colorectal cancer, with odds ratios corresponding to 46 to 162% higher odds. Satia et al. included 636 colorectal cancer cases and 1042 matched controls but found no association in either European-Americans or African-Americans [71]. Also, Kurotani et al. found no association between the unhealthy pattern and colorectal cancer (800 cases and 775 matched controls; OR, 0.99; 95% CI; 0.73, 1.34) [63] (Supplemental Table 2).

Of the 11 prospective cohort studies that derived an unhealthy pattern, six reported positive associations with colorectal cancer risk. Three of the six studies reported significant HRs ranging from 31 to 48% higher risk [55, 59, 66], while the other three reported HRs ranging from 1.27 to 1.46 that did not attain statistical significance [62, 69, 72]. Five prospective cohort studies found no association between the unhealthy dietary pattern and colorectal cancer risk [60, 68, 70, 73] (Supplemental Table 2).

Dietary Patterns Labeled with Regional (or Country-Specific) Names

Several studies also identified regional dietary patterns. The traditional Japanese pattern comprised (pickled) vegetables, soy and soy products, fish, roe, rice, miso soup, seaweeds, and green tea [62, 68], whereas the traditional Korean pattern was high in vegetables, tubers, fish, seaweeds, mushrooms, soy and soy products, and seasonings [61]. Many foods are thus common to both dietary patterns, and both patterns are similar to the fruit and vegetable pattern. Indeed, the fruit and vegetables pattern in the study by Park et al. was the traditional Korean pattern. Unlike its Japanese counterpart, the traditional Korean pattern was significantly associated with 65% lower odds of colorectal cancer in a case-control study (OR, 0.35; 95% CI; 0.27, 0.46) [61]; the traditional Japanese pattern was not associated with risk in two prospective studies [62, 68].

In a small Argentinian case-control study (41 cases and 95 matched controls), Pou et al. identified a Southern Cone pattern characterized by high intake of red meat, wine, and starchy vegetables. This pattern was associated with a 48% higher odds of colorectal cancer [57]. In another small case-control study in Iran, Azizi et al. identified an Iranian pattern high in refined grains (particularly rice and flat bread), fried chicken, red and processed meat, black tea, and carbonated beverages. This pattern was associated with a 46% higher odds of colorectal cancer [64]. In summary, the patterns labeled with regional names were similar (in food components and in their association with colorectal cancer development) to one of the two global dietary patterns: the “healthy” and the “unhealthy” dietary patterns.

Dietary Patterns Derived Using the Cluster Analysis Approach

Two studies derived dietary patterns using the cluster analysis approach: a case-control study in France [74] and a prospective cohort study in the U.S. [75]. Five clusters were derived in the French study. Cluster 1 participants (the reference cluster) had low intake of eggs, bread, starchy foods, wine, processed meat, pork, lamb, and beef, and high intake of coffee. Cluster 2 participants had high intakes of these foods. When the other clusters were compared to cluster 1, only cluster 2 showed a suggestion of an association: cluster 2 participants had a nonsignificant 50% higher odds of colorectal cancer (OR, 1.5; 95% CI; 0.9, 2.5) [74].

Using data from the NIH-AARP cohort, Wirfalt et al. derived four clusters in men and three in women. Comparing the fruit and vegetables cluster to the “many foods” cluster, they found a significant 15% lower risk of colorectal cancer in men (OR, 0.85; 95% CI; 0.76, 0.94) and a nonsignificant 10% lower risk in women (OR, 0.90; 95% CI; 0.77, 1.06) [75] (Supplemental Table 2). Results from these two studies align with the results for the global “healthy” pattern and the “unhealthy” pattern identified using factor analysis (PCA).
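The percentages quoted alongside the ratio estimates in this section follow from simple arithmetic: a ratio below 1 corresponds to (1 − ratio) × 100% lower odds or risk, a ratio above 1 to (ratio − 1) × 100% higher odds or risk, and a 95% confidence interval that excludes 1 is conventionally read as statistically significant. The short sketch below illustrates this for the cluster analysis estimates reported above; the function name `describe_ratio` is a hypothetical helper introduced here for illustration and is not taken from any of the cited studies.

```python
def describe_ratio(estimate, ci_low, ci_high):
    """Summarize a ratio estimate (OR, HR, or RR) as a percent change.

    Returns (percent, direction, significant), where `significant` is True
    when the confidence interval excludes the null value of 1.
    Hypothetical helper for illustration only.
    """
    percent = round(abs(estimate - 1.0) * 100)
    direction = "lower" if estimate < 1.0 else "higher"
    significant = ci_low > 1.0 or ci_high < 1.0
    return percent, direction, significant


# Estimates as reported in the text for the two cluster analysis studies.
reported = {
    "French study, cluster 2 vs. cluster 1": (1.5, 0.9, 2.5),
    "NIH-AARP, men": (0.85, 0.76, 0.94),
    "NIH-AARP, women": (0.90, 0.77, 1.06),
}

for label, (estimate, low, high) in reported.items():
    percent, direction, significant = describe_ratio(estimate, low, high)
    flag = "significant" if significant else "nonsignificant"
    print(f"{label}: {percent}% {direction} ({flag})")
```

The same conversion applies to the other odds ratios and hazard ratios summarized in this review, although it says nothing about the precision or validity of the underlying estimates.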

Summary of Findings from Empirically Derived or A Posteriori Dietary Patterns

Two major dietary patterns–a healthy and an unhealthy pattern–emerged from the PCA method used in 23 of the 25 studies that derived patterns empirically in five different world regions (Table 2). Similar patterns emerged despite variability introduced by regional differences in types and availability of foods and differences related to several important but arbitrary decisions that researchers make, including the consolidation of food items into food groups, the number of factors to extract, and even the labeling of the components. Findings indicated that higher intake of the healthy pattern was associated with lower risk of colorectal cancer while a higher intake of the unhealthy pattern was associated with a higher risk of colorectal cancer.

It is notable that for both dietary patterns, associations were stronger in men than in women, and more case-control studies reported significant associations (which were also of larger magnitude) than cohort studies. Interestingly, findings were consistently significant across regions irrespective of study design. Among the few studies that reported results for subsites, the results from prospective studies were inconsistent for risk of colon cancer and rectal cancer whereas results from case-control studies were more consistently significant for both colon cancer and rectal cancer risk.

Discussion

In this review, a synthesis of the food group components of both the index-based or a priori dietary patterns and the empirically derived or a posteriori dietary patterns revealed two distinct global dietary patterns associated with colorectal cancer risk. The healthy pattern was characterized by consistently high intake of fruits and vegetables and by higher intakes of one or more of the following foods: whole grains, nuts and legumes, fish and other seafood, and milk and other dairy products. In contrast, the unhealthy dietary pattern was characterized by high intakes of red and processed meat, sugar-sweetened beverages, refined grains, and desserts and potatoes. These two dietary patterns were remarkably evident in both a priori and a posteriori methods of deriving dietary patterns, i.e., higher dietary quality scores or higher adherence to dietary recommendations or guidelines correlated with higher intake of the healthy pattern, whereas lower adherence or lower scores were concordant with higher intake of the unhealthy dietary pattern. The healthy pattern was low in components commonly found in the unhealthy pattern and was associated with lower risk of developing colorectal cancer. Conversely, the unhealthy pattern was low in foods comprising the healthy pattern and was associated with higher risk of colorectal cancer.

These findings were consistent despite the wide variety of specific food types across different regions, an indication of the high reproducibility of these patterns; reproducibility is usually a concern in dietary pattern research, especially for a posteriori patterns. However, findings differed by anatomic location of the cancer, with stronger associations for colon cancer located in the distal colon; by sex, with more consistently significant and stronger associations in men than in women; and by study design, with a higher proportion of case-control studies reporting significant findings compared to prospective cohort studies.

Twenty-two (45%) of the 49 included studies conducted analysis by subsite location of the cancer, with most of the studies presenting results for overall colon and rectum. Generally, associations were stronger for colon cancer than rectal cancer. In the nine studies that reported results separately for proximal colon and distal colon, associations were generally stronger for cancers located in the distal colon than in the proximal colon. The differences in risk by anatomic subsite also seemed to differ by sex; that is, whereas the significant associations for colon cancer were observed in both men and women, the associations for rectal cancer were mostly significant in men. Therefore, sex and subsite differences are important factors to consider in the design of future studies.

Though biological mechanisms linking dietary patterns and colorectal cancer development have not been elucidated, there are several potential mechanisms that may underlie the protective association of the healthy pattern or the harmful association of the unhealthy pattern. For example, the healthy pattern contains many nutrients with beneficial effects for colorectal cancer prevention, such as dietary fiber, calcium, and vitamin D [4]. Other potential mechanisms include inflammation, insulin response, and the microbiota. Antioxidant-rich foods including fruits and vegetables have shown an ability to lower levels of inflammation biomarkers [76] and also prevent oxidative DNA damage [77]. One advantage of the empirical hypothesis-oriented dietary patterns is their focus on a specific biological pathway linking diet and disease outcomes. Dietary indices have been developed with inflammation [19, 26, 78] and insulin response [27] as the central theme in their development. Association of these indices with disease incidence indicates that inflammation or insulin response, respectively, may be mediating the development of the disease, e.g., dietary inflammatory potential has been found to be associated with colorectal cancer risk in several studies [45, 79]. Furthermore, a high and sustained pro-inflammatory potential of the diet or a hyperinsulinemic dietary pattern may compromise the host-microbiota mutualism, favoring the proliferation of toxic bacteria that have been suggested to promote colorectal carcinogenesis [80].

It is not clear why findings were more consistently significant or stronger in men than in women. Though most risk factors for colorectal cancer are common between men and women, the pattern of risk differs. For example, higher adiposity strongly predisposes men to higher risk of rectal cancer compared to women [81,82,83]. Also, early life obesity seems to be a more important risk factor for colorectal cancer in women, whereas for men, adult weight gain rather than early life obesity predominates [84]. This pattern may be due to differences in sex hormones, given that in men and postmenopausal women, estrogen is produced mainly in fat tissue [85]. In women, a high estrogen-to-testosterone ratio is protective against colorectal cancer risk, but in men it has an adverse effect [86, 87]. Furthermore, many studies identified sex-specific dietary patterns [31, 58, 66, 88], which could indicate that there are differences by sex in the intake of some foods. Indeed, alcohol featured prominently in the patterns specific to men and is an established risk factor for colorectal cancer, especially in men [4].

Findings also differed relative to study design and methodology. For both a priori and a posteriori dietary patterns, a higher proportion of case-control studies reported statistically significant findings than cohort studies. This could partly be due to differential dietary recall by case-status in case-control studies. Reverse causation is also a possible reason for the difference by study design. Nonspecific gastrointestinal symptoms in the 1 to 3 years before and after diagnosis of colorectal cancer may lead to changes in dietary intake among cases. While it is difficult to control for this bias in case-control studies, most cohort studies included a 2- to 3-year lag between dietary assessment and colorectal cancer diagnosis to limit potential reverse causation.

The number of available colorectal cancer cases or length of follow-up may partly explain the difference in findings among cohort studies. For example, Wu et al. did not find a significant association between dietary patterns and colorectal cancer risk, using 561 cases in the HPFS after 14 years of follow-up [69]. However, after 26 years of follow-up and more than twice the number of cases, Mehta et al. reported significant associations between the Western and prudent dietary patterns and colorectal cancer risk in the HPFS [55]. Though there were differences among studies in the number of covariates adjusted for in the analyses and even in the categorization of the same variables between studies, these differences were not unique to case-control or cohort studies and may therefore not explain the difference in findings by study design. Indeed, most studies, irrespective of study design, adjusted for age, sex (when appropriate), education, BMI (in main analyses or in sensitivity analyses when perceived as an intermediate), total energy intake, physical activity, smoking, family history of colorectal cancer, and alcohol intake. Most case-control studies matched cases to controls by important demographic variables such as sex and age. In addition, there was no discernible pattern in findings between hospital-based and population-based case-control studies.

This review is the first to conduct a critical synthesis of different dietary patterns across different regions of the world in relation to (colorectal) cancer risk. Another strength is the inclusion of more studies compared to previous reviews and meta-analyses, which provided a more diverse study population for analysis. However, potential limitations should be considered. The high diversity in study populations across the different world regions means different culinary preferences, e.g., high intake of tofu in Asian populations compared to other regions; nevertheless, the consistency in findings for the healthy or unhealthy patterns across different populations was reassuring. While residual confounding cannot be excluded, several studies adjusted for multiple lifestyle factors such as smoking, obesity, physical activity, and aspirin/NSAID use, among others, in multivariable-adjusted models. While our findings may inform future research direction, quantitative summary estimates of the associations within subgroups, e.g., defined by sex, anatomic subsite, geographic region, or age group (given recent reports of a rising incidence of colorectal cancer among people < 50 years old [89]), may provide further support to the current study findings.

Conclusion

It is notable that the number of food groups, the intake quantity, and the exact type of foods in each food group differed between populations within the same region and differed even more between regions, yet the two global dietary patterns (healthy and unhealthy patterns) remained consistent across regions, an indication of the high reproducibility of the patterns derived from the empirical (data-driven) methods. Also, despite the a priori assignment of food group components in index-based patterns, the remarkable similarity in the major food groups comprising the index-based patterns and the two global dietary patterns identified from a posteriori patterns is an indication of the concordance of dietary patterns derived from these two main approaches. The consistency of results for colorectal cancer risk across different populations suggests that consuming a dietary pattern that is high in fruits and vegetables and low in meats and sweets is protective against colorectal cancer development and that consumption of such a dietary pattern is more important than the specific differences in foods available in different regions.

However, important questions remain about the biological mechanisms underlying differences in colorectal cancer risk by sex, the timing of exposure to different dietary patterns during the life course (early life versus later in life [90]), and the interaction (or joint influence) of dietary patterns with the microbiome or with other lifestyle factors such as physical activity. Further elucidating subsite differences, especially in studies with large numbers of colorectal cancer cases for each subsite, is warranted [91]. Answers to these questions could better inform the design of more effective dietary interventions for the primary prevention of colorectal cancer.