Introduction

In 2003, the World Health Organization and the Food and Agriculture Organization of the United Nations recommended an intake of at least 400 g of fruits and vegetables per day to prevent diet-related chronic diseases [1]. Their report stated that fruit and vegetable consumption convincingly decreases the risk of cardiovascular disease (CVD), whereas the inverse association with type 2 diabetes and cancer was deemed probable [1]. Recent epidemiological reviews observed a weaker, although clearly inverse, association with the risk of stroke [2] and less evident associations with coronary heart disease [3], cancer [4], and type 2 diabetes [5].

Previous studies on the association of fruit and vegetable consumption with cause-specific mortality focused on cardiovascular [6–10], stroke [11], or ischemic heart disease mortality only [12, 13] and, overall, observed inverse associations. Studies that looked at cancer mortality showed inverse [14–16] or no associations [17, 18]. One previous study examined the association with death from chronic obstructive pulmonary disease (COPD), and observed an inverse association for fruit consumption [19].

The European Prospective Investigation into Cancer and Nutrition (EPIC) is a cohort study that includes over 500,000 participants followed since recruitment between 1992 and 2000. Previous analyses within EPIC showed a non-linear inverse association between fruit and vegetable consumption and all-cause mortality [20]. The present study aimed to identify the causes of death that account for this reduction in risk of death. By assessing which specific causes of death are (inversely) associated with fruit and vegetable consumption and examining differences between subgroups of the population, this study also aimed to hint at possible mechanisms responsible for the inverse association with mortality.

Methods

The methods of the current study are based on a previous study within EPIC [20].

Study population

EPIC is an ongoing multicenter prospective cohort study designed to investigate the relationships between diet, nutritional status, lifestyle and environmental factors and the incidence of cancer and other chronic diseases. Its rationale, study population and data collection process have been described in detail before [21, 22]. In summary, the EPIC cohort included 521,448 participants (approximately 70 % women), mostly aged between 25 and 70 years, recruited between 1992 and 2000. Participants were recruited from 23 centers in ten European countries (Denmark, France, Germany, Greece, Italy, The Netherlands, Norway, Spain, Sweden, and the United Kingdom). Most participants were recruited from the general population, except for the French (members of a teachers' health insurance program), Italian (except Florence and Varese) and Spanish cohorts (mostly blood donors), the Florence (Italy) and Utrecht (The Netherlands) cohorts (women attending mammographic screening programs) and the Oxford (UK) cohort (vegetarian and health-conscious participants). In France, Naples (Italy), Norway and Utrecht (The Netherlands), only women were recruited. At recruitment, anthropometric measurements were conducted and participants were asked to complete dietary and lifestyle questionnaires. All participants gave written informed consent and the study was approved by the relevant ethics committees in participating countries and the Internal Review Board of the International Agency for Research on Cancer.

Participants with missing data on diet (n = 6,962), mortality (n = 2,712) or all confounders (n = 65) were excluded. To minimize misreporting, participants in the lowest or highest 1 % of the distribution of the ratio of reported energy intake to required energy [23], the lowest or highest 0.5 % of the distribution of BMI, or the highest 0.5 % of the distribution of fruit or vegetable consumption were also excluded (n = 19,450). Participants with a history of cancer, myocardial infarction, stroke, angina, diabetes or any combination of these (n = 41,108) were excluded because they are at an increased risk of death and may have changed their diet prior to recruitment.
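The exclusion steps above can be checked with simple arithmetic. The following is a minimal sketch, assuming the exclusions were applied sequentially and do not overlap (the text does not say so explicitly); the variable and function names are illustrative only.

```python
# Counts as reported in the Study population section.
initial_cohort = 521_448
exclusions = {
    "missing dietary data": 6_962,
    "missing mortality data": 2_712,
    "missing all confounders": 65,
    "extreme energy ratio, BMI, or fruit/vegetable intake": 19_450,
    "history of cancer, MI, stroke, angina, or diabetes": 41_108,
}


def apply_exclusions(start, steps):
    """Subtract each exclusion in turn and return (reason, n, remaining) rows."""
    rows = []
    remaining = start
    for reason, n in steps.items():
        remaining -= n
        rows.append((reason, n, remaining))
    return rows


flow = apply_exclusions(initial_cohort, exclusions)
for reason, n, remaining in flow:
    print(f"excluded {n:>6,} ({reason}); {remaining:,} remain")

# Size of the analytic cohort after the last exclusion step.
analytic_cohort = flow[-1][2]
```

Under these assumptions the remaining count equals the 451,151 participants reported in the Results.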

Dietary assessment

At baseline, the diet of participants over the past 12 months was assessed by country-specific dietary questionnaires (DQ) designed to reflect local dietary patterns [22, 24]. Most DQ were self-administered. In Greece, Spain and Ragusa (Italy), a face-to-face DQ was used. A DQ was combined with a 7-day record in the UK and Malmö (Sweden) cohorts. Information on validity of the DQ has been published previously [25, 26]. A standardized nutrient database was used to estimate energy, alcohol, and nutrient intakes [27]. Information on lifestyle was also obtained using questionnaires. This study focuses on total fruit (fresh fruits as well as dried or canned fruits, excluding olives, nuts and seeds), total vegetable, and combined fruit and vegetable consumption. Legumes, potatoes, and other tubers were not counted as vegetables. Fruit and vegetable juices were excluded. Quantification of dietary consumption was performed using pictures (showing increasing amounts of preselected portions to the participant), household measures (e.g., pictures making use of glasses, bowls etc. and a ruler, which are also shown to the participant) and/or standard units (for foods consumed in small quantities). The amount consumed per item was calculated, taking into account the method of preparation and the edible part consumed [28].

Outcome assessment

Information on vital status of participants was retrieved from population registries, boards of health and death indices in Denmark, Italy (except Naples), The Netherlands, Norway, Spain, Sweden, and the United Kingdom. In France, Germany, Greece, and Naples this information was obtained by follow-up mailings and subsequent inquiries to regional registries, health departments and physicians. The end of follow-up varied between centers, ranging between 2006 and 2010. All causes of death reported in death certificates were recorded according to the tenth revision of the International Classification of Diseases (ICD) and were classified as ‘immediate’, ‘antecedent’ or ‘underlying’. If multiple causes were recorded, the cause of death that was used in this study was the underlying (if present), antecedent (if no underlying cause was given) or immediate (if no underlying or antecedent cause was given).
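The selection rule for the cause of death can be expressed as a simple fallback chain. This is an illustrative sketch only; the function name and the representation of a missing cause as `None` or an empty string are assumptions, and the ICD-10 codes in the usage note are examples rather than study data.

```python
from typing import Optional


def select_cause_of_death(underlying: Optional[str],
                          antecedent: Optional[str],
                          immediate: Optional[str]) -> Optional[str]:
    """Return the first available cause in the order
    underlying -> antecedent -> immediate, or None if all are missing."""
    for code in (underlying, antecedent, immediate):
        if code:  # skips None and empty strings
            return code
    return None
```

For example, `select_cause_of_death(None, "I10", "J18")` returns `"I10"`, because no underlying cause is recorded and the antecedent cause takes precedence over the immediate one.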

The ICD includes 22 different ‘chapters’, each consisting of multiple ‘blocks’. These blocks each include multiple related diseases. All causes of death, be it a separate code, block or chapter, with more than 250 cases were included in the analysis. Additionally, malignant neoplasms (ICD: C00–C97) were classified as strongly related to smoking (oral cavity and pharynx [C00–C14, excluding C07 and C08 for parotis and other salivary glands], oesophagus [C15], stomach [C16], liver [C22], pancreas [C25], aerodigestive tract [C32–C34], kidney [C64], bladder [C67]), alcohol-related (oral cavity and pharynx [excluding C07, C08 and C11 for parotis, other salivary glands, and nasopharynx], oesophagus, colorectum [C18–C20], liver, larynx [C32], breast [C50]) [29], BMI-related (oesophagus, colorectum, gallbladder and biliary tract [C23–C24], pancreas, breast, uterus [C54], kidney) [29] and physical activity-related neoplasms (colorectum, post-menopausal breast, uterus) [29].
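The lifestyle-related grouping of malignant neoplasms can be written as a lookup on the three-character ICD-10 category. The sketch below is an illustration of the lists given above, not the study's own code; the function names, the group labels and the `postmenopausal` flag (needed because only post-menopausal breast cancer counts as physical activity-related) are assumptions.

```python
def site_number(icd10_code):
    """Return the two-digit site of a C-chapter code such as 'C34.1', else None."""
    code = icd10_code.strip().upper()
    if len(code) >= 3 and code[0] == "C" and code[1:3].isdigit():
        return int(code[1:3])
    return None


ORAL_PHARYNX = set(range(0, 15))  # C00-C14

# Site numbers per group, following the lists in the text.
SMOKING = (ORAL_PHARYNX - {7, 8}) | {15, 16, 22, 25, 32, 33, 34, 64, 67}
ALCOHOL = (ORAL_PHARYNX - {7, 8, 11}) | {15, 18, 19, 20, 22, 32, 50}
BMI = {15, 18, 19, 20, 23, 24, 25, 50, 54, 64}
PHYSICAL_ACTIVITY = {18, 19, 20, 54}  # breast (C50) is handled separately


def lifestyle_groups(icd10_code, postmenopausal=False):
    """Return the set of lifestyle-related groups a malignant neoplasm falls in."""
    site = site_number(icd10_code)
    groups = set()
    if site is None or site > 97:  # not a malignant neoplasm (C00-C97)
        return groups
    if site in SMOKING:
        groups.add("smoking")
    if site in ALCOHOL:
        groups.add("alcohol")
    if site in BMI:
        groups.add("bmi")
    if site in PHYSICAL_ACTIVITY or (site == 50 and postmenopausal):
        groups.add("physical_activity")
    return groups
```

A neoplasm can belong to several groups at once: `lifestyle_groups("C15")` yields smoking, alcohol and BMI, whereas a code outside all lists yields an empty set and would count as non-lifestyle-related.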

Statistical analysis

Hazard ratios (HR) with 95 % confidence intervals (95 % CI) were calculated using Cox proportional hazards models, using age as the underlying time variable. Gender, center, and age at recruitment were used as stratification variables to minimize departure from proportionality (examined with log–log plots).
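For orientation, a stratified Cox model with age as the time scale can be written in its standard textbook form as follows. The notation is an assumption introduced here for illustration: $s$ indexes a stratum (a combination of gender, center and age at recruitment), $a$ is attained age, $\mathbf{x}$ is the covariate vector and $\widehat{\mathrm{SE}}$ is the estimated standard error of a coefficient.

```latex
\begin{align}
  h_s(a \mid \mathbf{x}) &= h_{0s}(a)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right), \\
  \mathrm{HR}_j &= \exp\!\left(\hat{\beta}_j\right), \\
  95\,\%\ \mathrm{CI} &= \exp\!\left(\hat{\beta}_j \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_j)\right).
\end{align}
```

Each stratum has its own baseline hazard $h_{0s}(a)$, while the coefficients $\boldsymbol{\beta}$ are shared across strata, which is why stratification can absorb non-proportionality in the stratifying variables.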

Consumption of fruit and vegetable was modeled using EPIC-wide quartiles and continuously, using increments of 100 g/day for fruit and vegetable consumption separately and 200 g/day for the combined consumption. Tests for trend were performed using quartile medians modeled continuously. Vegetable consumption was additionally stratified by mode of preparation (raw or cooked). Preventable proportions (PP) were calculated to estimate the proportion of deaths that could be prevented if all participants consuming less than 400 g of fruits and vegetables per day shifted their intake to at least 400 g/day [30].
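The exposure coding described above (quartiles, a trend variable built from quartile medians, and per-increment scaling) can be sketched with the Python standard library. This is a simplified illustration: the function names are hypothetical, the intake values in the usage note are invented, and the quantile method used by `statistics.quantiles` may differ from the one used in the study.

```python
from statistics import median, quantiles


def quartile_cutpoints(values):
    """Three cut points that split the values into four groups."""
    return quantiles(values, n=4)


def assign_quartile(value, cutpoints):
    """Quartile number 1-4: one plus the number of cut points the value exceeds."""
    return 1 + sum(value > c for c in cutpoints)


def trend_scores(values):
    """Replace each value by the median of its quartile (used for trend tests)."""
    cuts = quartile_cutpoints(values)
    groups = [assign_quartile(v, cuts) for v in values]
    medians = {
        q: median(v for v, g in zip(values, groups) if g == q)
        for q in set(groups)
    }
    return [medians[g] for g in groups]


def per_increment(grams_per_day, increment=100.0):
    """Express intake in units of the chosen increment (100 or 200 g/day)."""
    return grams_per_day / increment
```

With invented intakes of 100 to 800 g/day in steps of 100, `trend_scores` assigns 150, 350, 550 and 750 g/day to the four quartiles; entering that score as a single continuous term gives the test for trend, while `per_increment(intake, 200)` rescales combined intake so that a model coefficient refers to a 200 g/day increase.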

Analyses were adjusted for physical activity according to the Cambridge Physical Activity Index (CPAI) (inactive, moderately inactive, moderately active, active) [31], education (no education/primary school, technical/professional school, secondary school, university), smoking status at baseline (never, former, current), red meat (g/day) and processed meat consumption (g/day). Restricted cubic splines (RCS) with four knots (at the 5th, 35th, 65th, and 95th percentile) were fitted for number of cigarettes smoked per day, lifetime duration of smoking in years, years since stopped smoking, baseline alcohol consumption (g/day), and body mass index (BMI) (kg/m2) to model non-linear relations between covariates and mortality. Because of a moderate correlation (r = 0.28), models for vegetable and fruit consumption were mutually adjusted. Missing indicator variables were used for smoking (9,444 missing values), education (15,586 missing values) and physical activity (42,243 missing values, including the entire Norway cohort), as exclusion of these participants did not materially change the results.
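A restricted cubic spline with four knots adds two non-linear terms to the linear term of a covariate and is constrained to be linear beyond the outer knots. The sketch below uses the common truncated-power parameterisation; this is an assumption, since the text does not state which basis was used, and the function names and example knot values are illustrative.

```python
from statistics import quantiles


def percentile_knots(values, percentiles=(5, 35, 65, 95)):
    """Knot locations at the given percentiles of the observed values."""
    cuts = quantiles(values, n=100)  # 99 cut points; cuts[p - 1] is percentile p
    return [cuts[p - 1] for p in percentiles]


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots, so four knots
    give one linear and two non-linear terms.
    """
    t = sorted(knots)
    denom = t[-1] - t[-2]

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    basis = [x]
    for j in range(len(t) - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / denom
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / denom)
        basis.append(term)
    return basis
```

Each covariate modeled this way (for example BMI) would contribute the columns returned by `rcs_basis` to the design matrix. Below the first knot the non-linear terms are zero, and above the last knot their cubic and quadratic parts cancel, which keeps the fitted curve linear in the tails.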

Consistency of associations within ‘blocks’ and ‘chapters’ as defined by the ICD and between lifestyle-related and non-lifestyle-related neoplasms was examined using a joint Cox model in an augmented dataset [32]. Different baseline hazard functions were calculated for each disease (or block) and associations were directly compared between diseases. Models with and without an interaction term for diseases and quartiles of fruit and vegetable consumption were compared using a likelihood ratio test.

Associations with mortality may differ between participants with a different a priori risk of death. Therefore, participants were cross-classified according to quartiles of fruit and vegetable consumption and categories of gender, smoking status (current, former, and never smokers), alcohol consumption (low, moderately low, moderately high, high, defined as >0–<3, 3–<12, 12–30, >30 g/day for women and >0–<6, 6–<24, 24–60, >60 g/day for men), BMI (<25, 25–30, >30 kg/m2) and physical activity (inactive, moderately inactive, moderately active, active). A likelihood ratio test was used to compare models with and without an interaction term. To ensure enough participants in the analyses, only causes of death with at least 1,000 cases were included. Differences in associations between cases with different follow-up periods were also examined, to assess the effect of different induction periods and to rule out reverse causality. Models with and without an interaction term between quartiles of fruit and vegetable consumption and time-dependent covariates for tertiles of follow-up were compared using a likelihood ratio test.
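The sex-specific alcohol categories can be written as a small lookup. This is a sketch under two assumptions: the top band for men is taken to start above 60 g/day, so that the four bands are contiguous, and participants reporting no alcohol are placed in a separate "non-consumer" group, which the text does not describe. The names are illustrative.

```python
# Upper bounds (g/day) of the low, moderately low and moderately high bands.
CUTPOINTS = {
    "female": (3.0, 12.0, 30.0),
    "male": (6.0, 24.0, 60.0),  # top band assumed to start above 60 g/day
}


def alcohol_category(grams_per_day, sex):
    """Classify baseline alcohol intake into the sex-specific bands."""
    if grams_per_day <= 0:
        return "non-consumer"  # assumption: handled outside the four bands
    low, moderate_low, moderate_high = CUTPOINTS[sex]
    if grams_per_day < low:
        return "low"
    if grams_per_day < moderate_low:
        return "moderately low"
    if grams_per_day <= moderate_high:
        return "moderately high"
    return "high"
```

For instance, 30 g/day is "moderately high" for a woman, while 45 g/day is still "moderately high" for a man.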

To correct for over- and underestimation of dietary intakes from the DQ, continuous associations were calibrated using a fixed-effects linear calibration model [33]. In the calibration model, the 24-h dietary recall values, from a random 8 % sample of each center’s participants [34], were regressed on the DQ values. All variables included in the models for cause-specific mortality were included as covariates. Gender- and center-specific calibration models were used to obtain predicted values of consumption (calibrated values) for all participants. Models for cause-specific mortality were then applied using either the calibrated or the observed consumption values. A standard error of the calibrated coefficient was calculated using consecutive bootstrap sampling. Non-consumers were kept in the regression.
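The core of the calibration step is a regression of the 24-h recall values on the DQ values, followed by prediction for all participants. The sketch below is deliberately simplified: it fits a single ordinary least-squares line for one gender- and center-specific stratum and omits the additional covariates and the bootstrap standard errors described above. The function names and the numbers in the usage note are invented for illustration.

```python
def fit_calibration_line(dq_values, recall_values):
    """Least-squares fit of recall = intercept + slope * DQ for one stratum."""
    n = len(dq_values)
    mean_dq = sum(dq_values) / n
    mean_recall = sum(recall_values) / n
    sxx = sum((x - mean_dq) ** 2 for x in dq_values)
    sxy = sum((x - mean_dq) * (y - mean_recall)
              for x, y in zip(dq_values, recall_values))
    slope = sxy / sxx
    intercept = mean_recall - slope * mean_dq
    return intercept, slope


def calibrate(dq_values, intercept, slope):
    """Predicted (calibrated) intake for every participant in the stratum."""
    return [intercept + slope * x for x in dq_values]


# Invented calibration subsample: DQ vs. 24-h recall intake in g/day.
intercept, slope = fit_calibration_line([100, 200, 300, 400],
                                        [150, 200, 250, 300])
calibrated = calibrate([0, 600], intercept, slope)
```

In this toy example the slope is 0.5 and the intercept 100 g/day, so a DQ value of 600 g/day is calibrated down to 400 g/day, and a non-consumer on the DQ (0 g/day) still receives a calibrated value because non-consumers are kept in the regression.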

Results

The median reported consumption of fruit and vegetable combined was 388 g/day. High consumption of fruit and vegetable was associated with a high proportion of women and never smokers, low consumption of processed meat, high consumption of red meat and high energy intake (Table 1). After a follow-up of approximately 13 years, 25,682 participants (56 % women) were reported as deceased among all 451,151 participants. Out of all deaths with a reported cause (n = 20,737), major causes were neoplasms (n = 10,627) and diseases of the circulatory system (n = 5,125) (Tables 2, 3). Concordant associations were observed between models with and without energy adjustment (data not shown).

Table 1 Baseline characteristics according to quartiles of the combined consumption of fruit and vegetable
Table 2 Fruit: Hazard ratios (HR) for cause-specific mortality according to quartiles and observed (as derived from the DQ) and calibrated continuous increase of fruit consumption
Table 3 Vegetable: Hazard ratios (HR) for cause-specific mortality according to quartiles and observed (as derived from the DQ) and calibrated continuous increase of vegetable consumption

Combined consumption of fruit and vegetable

No heterogeneity in the association between fruit and vegetable consumption and mortality was identified within chapters or blocks, as defined by the ICD. Participants consuming more than 569 g of fruits and vegetables per day had lower risks of death from diseases of the circulatory (HR 0.85, 95 % CI 0.77–0.93), respiratory (HR 0.73, 95 % CI 0.59–0.91) and digestive system (HR 0.60, 95 % CI 0.46–0.79) when compared with participants consuming less than 249 g per day (Online Resource 1). PP for these diseases were 6.5, 7.4 and 14.9 %. Other and unknown causes of death also showed inverse associations with fruit and vegetable consumption. A positive association was observed with diseases of the nervous system (HR for highest quartile 1.54, 95 % CI 1.09–2.18, PP −6.9 %).

An inverse association with mortality from diseases of the respiratory system was only seen in men (Online Resource 2) and an inverse association with mortality from neoplasms was observed for participants with high alcohol consumption (Online Resource 3). The inverse association with mortality from ischemic heart disease was only apparent in participants with low alcohol consumption and in inactive participants (Online Resource 4). No clear differences in associations were observed between cases with different smoking status, BMI or length of follow-up time (data not shown).

Fruit consumption versus vegetable consumption

An inverse association with diseases of the digestive system was seen for both fruit and vegetable consumption, but a lower risk of death from diseases of the circulatory and respiratory system was only observed for high vegetable consumption (Tables 2, 3 and Online Resources 5 & 6). In contrast, a positive association with diseases of the nervous system was only observed for fruit consumption. No clear, consistent associations were observed with lifestyle-related and non-lifestyle-related neoplasms for fruit or vegetable consumption (Tables 4, 5).

Table 4 Fruit: Hazard ratios for mortality from lifestyle-related and non-lifestyle-related malignant neoplasms, according to quartiles and observed (as derived from the DQ) and calibrated continuous increase of consumption
Table 5 Vegetable: Hazard ratios for mortality from lifestyle-related and non-lifestyle-related malignant neoplasms, according to quartiles and observed (as derived from the DQ) and calibrated continuous increase of consumption

When comparing associations within the chapter of neoplasms, an inverse association with malignant neoplasms of the respiratory and intrathoracic organs and a positive association with malignant neoplasms of the male genital organs were observed for fruit consumption. Risks of death from all other blocks of neoplasms were not associated with fruit consumption (P for heterogeneity 0.03). Apart from an inverse association with urinary neoplasms, no associations with neoplasms were observed for vegetable consumption (P for heterogeneity 0.64).

When comparing raw and cooked vegetables, associations were more pronounced for consumption of raw vegetables (Online Resources 7 & 8). Risks of death from diseases of the circulatory and digestive system were inversely associated with consumption of both raw and cooked vegetables. Inverse associations with risks of death from diseases of the respiratory system, neoplasms and mental and behavioral disorders were seen for raw vegetable consumption, whereas a reduced risk of death from other causes was observed for cooked vegetable consumption.

Discussion

In this prospective study, participants with a high fruit and vegetable consumption had a lower risk of death from diseases of the circulatory, respiratory and digestive systems than participants with a low consumption. Furthermore, high raw vegetable consumption was associated with a lower risk of death from neoplasms and mental and behavioral disorders, whereas high fruit consumption was associated with a higher risk of death from diseases of the nervous system.

To our knowledge, this is the first study to examine associations between consumption of fruits and vegetables and a wide range of causes of death. Although associations observed in the current study agree with previously reported inverse associations with CVD [6–13, 16–18] and COPD mortality [19], and a less evident association with cancer mortality [14–18], previously reported associations were slightly stronger. This may be partly due to the smaller number of participants in previous studies (most included fewer than 60,000), resulting in less precise risk estimates.

The major strength of the EPIC study is the large number of participants and its long follow-up, resulting in a large number of deaths. This allowed analyses to distinguish between many different causes of death and to examine whether associations were consistent between different categories based on gender or lifestyle. However, dietary and lifestyle variables were assessed only at baseline and a longer follow-up therefore also means a higher chance of changes in diet. In addition, because induction periods vary between different diseases, a single measurement may not always represent the most relevant exposure. Although associations may have been missed because of this, observed associations indicate that a single measurement sufficiently covered the relevant exposure for some major causes of death.

The large amount of information on dietary and lifestyle variables enabled correction for confounding, although residual confounding could still be present due to measurement or classification errors in included confounders or their possible insufficient coverage. Systematic over- and underestimation in the DQ could be partly corrected for by calibration using a 24-h dietary recall, available in a random sample of the cohort [33, 34]. Although associations were generally stronger after calibration, it should be noted that measurement error may still be present because the error structure in the dietary recall is not completely independent from that in the DQ [35]. Associations should be interpreted with caution, considering the number of tests that were performed.

The observed inverse association with diseases of the circulatory system (mainly ischemic heart diseases and stroke) agrees with current knowledge and has been described within EPIC before [13, 20]. Fruit and vegetable consumption is known to reduce blood pressure, an important risk factor for CVD [2, 3]. Inverse associations with other CVD risk factors (e.g., plasma lipid levels, diabetes and obesity) are not evident, but hypothesized due to the presence of fiber, folate, potassium and antioxidants [2, 3]. Stronger associations with mortality from ischemic heart disease were observed in low alcohol consumers and (moderately) inactive participants. This did not correspond with the hypothesis that participants with more oxidative stress benefit more from dietary antioxidants. We have no explanation for these observations, but residual confounding by an unknown factor or a chance finding cannot be ruled out.

The association with mortality from diseases of the respiratory system (mainly influenza, pneumonia and obstructive diseases) has rarely been studied. The inverse association seemed to exist in men only, although it is unclear why. Evidence exists for a lower risk of respiratory obstructive disease with fruit and vegetable consumption, possibly through anti-oxidative and anti-inflammatory properties of micronutrients [36, 37]. An observed inverse association between fruit and vegetable consumption and upper respiratory tract infections [38] may also hint at anti-inflammatory properties.

The most common cause of death from diseases of the digestive system in this study was alcoholic liver disease (24 %). The inverse association with fruit and vegetable consumption might be explained by the role of oxidative stress and inflammation (generally as a result of malnutrition) in the pathogenesis of alcoholic liver disease [39]. Additionally, dietary antioxidants have been associated with overall improved liver health [40] and lower risks of intestinal disease [41].

The presence of dietary fiber may explain part of the inverse associations observed in this study, as supported by the similarities in associations with a previous study within EPIC that observed inverse associations between fiber from vegetables (and to a lesser extent from fruits) and circulatory, respiratory and digestive disease death [42].

A positive association for fruit and vegetable consumption was observed with diseases of the nervous system (mainly motor neuron, Alzheimer’s and Parkinson’s diseases). When comparing fruit and vegetable consumption, the association was only seen for fruit consumption. Although chronic moderate exposure to pesticides may increase the risk of Parkinson’s disease (but not of other neurodegenerative diseases) [43], this observation contradicts the belief that fruit and vegetable consumption has a beneficial effect (through anti-oxidative and anti-inflammatory properties) on neurodegenerative diseases [44]. Although unlikely, participants may have changed their diets because of their disease, resulting in a false-positive association. Also, a chance finding cannot be excluded.

Stronger inverse associations were observed for consumption of raw vegetables when compared with cooked vegetables, as was observed in the study on all-cause mortality [20]. Differences in availability of (anti-oxidative) micronutrients, digestive enzymes, or structure and digestibility of the vegetables may account for this observation [45]. In addition to its inverse association with raw vegetable consumption, the risk of death from neoplasms was also inversely associated with combined consumption of fruit and vegetable in participants with high alcohol consumption. This may be explained by the increase in oxidative stress that is associated with alcohol consumption [46].

Main causes of mortality from mental and behavioral disorders were dementia and disorders due to the use of alcohol. Raw vegetable consumption was also inversely associated with diseases of the digestive system (which includes alcoholic liver disease) and this association may share underlying mechanisms with the association with mortality from mental and behavioral disorders. An inverse association between fruit and vegetable consumption and dementia has also been observed previously [47], although no difference was observed between raw and cooked vegetable consumption. Inverse associations were attributed to presence of antioxidants, vitamin B, phytoestrogens and fiber.

In conclusion, this study showed that the lower risk of death associated with a higher consumption of fruits and vegetables may be derived from inverse associations with diseases of the circulatory, respiratory and digestive systems. The reduction in mortality may depend on lifestyle factors and the mode of preparation of vegetables, as associations were most pronounced for raw vegetable consumption.