Introduction

Many thousands of different flavonoids and phenolic acids are found in plants, with major dietary sources including fruit, vegetables, tea, chocolate, and soy. Flavonoids are classified into the main subgroups of flavonols, flavones, flavanols, flavanones, isoflavones, lignans, anthocyanins and proanthocyanidins; phenolic acids are assigned to two main subgroups, the hydroxybenzoic and the hydroxycinnamic acids [21]. Some polyphenols are found in a rather wide variety of foods (such as kaempferol, which occurs in many vegetables), whereas others are limited to only a few kinds of food (such as apigenin, found in parsley and celery).

Polyphenols have long been suggested to be beneficial for maintaining health. Meanwhile, experimental studies confirmed several biological effects. Amongst their health-promoting properties are antioxidant, antiviral, anti-inflammatory, and anti-cancer activities. Such chemopreventive agents can be effective at different stages of the carcinogenic process, both by blocking initiation and by suppressing the later stages involving promotion, progression, angiogenesis, invasion and metastasis. Several reviews have summarized the potential chemopreventive mechanisms for a number of polyphenolics [12, 23, 35, 42].

Based on the available experimental evidence, the conduct of observational studies (case–control studies and cohort studies) in humans is the next logical and necessary step towards confirmation of biological effects in vivo. For the ultimate proof of effect, intervention studies are necessary. Observational studies trying to explore the association between dietary factors and disease risk often assess the habitual dietary intake by means of food frequency questionnaires (FFQ). These measurements, however, suffer from imprecision [19], even more so when it comes to dietary components which are provided by only few kinds of foods that may be consumed only occasionally or only as ingredients (e.g. herbs). The problem is potentiated if bioavailability is moderate to low, as shown for various secondary plant products, or if metabolic activity of the intestinal microflora is required to form bioavailable metabolites (e.g. plant lignans). As a further limitation, estimates of dietary intake are hampered by incomplete or missing data in food composition tables.

The use of biomarkers of intake of such compounds should overcome some of these methodological problems in nutritional epidemiology. Analytical data obtained from measurements in human specimens are more precise, at least for the time point when the biological material was collected, and the measurement error is independent of that contained in questionnaire-based dietary intake data [16, 19]. Clearly, the validity and reproducibility of analytical assays for the measurement of nutritional biomarkers need to be demonstrated in advance of their use. However, the concept of epidemiological validity also includes the investigation of the stability of the marker, e.g. during storage of the biological specimens, and of the intra- and inter-subject variation [38].

Background: overview on bioavailability and metabolism of polyphenols

Polyphenols markedly differ from one another in their bioavailability and intestinal metabolism. Current evidence from bioavailability studies suggests that the bioavailability varies from about 0.3 to 43% (based on urinary excretion) of the dose administered (Table 1), reaching plasma concentrations of 0.02–4.0 µmol/l at an intake level of 50 mg aglycone equivalents [22]. Bioavailability is determined by different factors, including the sugar moiety of the compound and its further metabolism by the gut microflora [21]. Isoflavones and gallic acid are polyphenols that are absorbed to the highest extent, followed by the catechins, flavanones and quercetin glucosides, although the kinetics described differs considerably. The proanthocyanidins, the galloylated tea catechins and the anthocyanins are much less absorbed. So far, data on the bioavailability of phenolic acids are limited. For cinnamic acid, a Na+-dependent carrier-mediated transport process has been described [42]. In contrast to the initial assumption of passive diffusion of the aglycone as the major transport form, carrier-mediated transport processes may be common in the absorption of polyphenols [42]. The most prominent example is the active transport of quercetin glucoside.

Table 1 Pharmacokinetic data from 97 studies concerning the bioavailability of polyphenols (according to Manach et al. [22])

Flavonoids undergo extensive first-pass phase II metabolism in the intestinal epithelial cells and the liver, being substrates for methylation, sulfation, and glucuronidation. For many of the polyphenols, the plasma half-life is short, and baseline levels are reached within 24 h [20]. It can be assumed that a steady-state level of plasma polyphenols is only achievable through regular consumption of foods containing them. The plasma concentrations found in free-living subjects (fasting status) are even lower than the above-mentioned range measured after dietary intervention (see, for example, values in Table 2). Due to the rapid metabolism and excretion, urinary samples, ideally from 24-h sampling, could be the preferred specimen (higher absolute amounts of polyphenols after analytical extraction and enrichment) as compared to plasma samples. In epidemiological studies that included sampling of biological material for nutritional biomarker measurement in their study design, however, usually one blood sample (serum, plasma, white blood cells, red blood cells) and/or one spot urine sample, all of limited volume, were obtained at recruitment.
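The consequence of a short half-life for single-sample designs can be illustrated with a simple decay calculation. The following is a minimal sketch only: it assumes single-compartment first-order elimination, and the half-lives used (2, 5 and 11 h) are hypothetical illustration values, not data taken from Table 1.

```python
def fraction_remaining(hours: float, half_life_h: float) -> float:
    """Fraction of the peak concentration left after `hours`.

    Assumes first-order (exponential) elimination, which is a simplification.
    """
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_h)


# Hypothetical half-lives (h); chosen for illustration only.
for t_half in (2.0, 5.0, 11.0):
    pct = 100.0 * fraction_remaining(24.0, t_half)
    print(f"t1/2 = {t_half:4.1f} h: {pct:.2f}% of peak left after 24 h")
```

Under these assumptions, a compound with a half-life of a few hours has essentially returned to baseline by the next morning, which is consistent with the need for regular intake to maintain measurable fasting concentrations.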

Table 2 Intra- and inter-individual variation in plasma concentrations of selected flavonoids and phenolic acids

Laboratory techniques for the analysis of polyphenols in biological specimens

Two reviews provide an overview of the analytical methods used to determine the structure and content of the flavonoids and phenolic acids contained in foods and in food-based matrices [24, 32]. In principle, these methods can be applied to human specimens with small modifications. Thus, only a brief overview of techniques and specific issues for the analysis of polyphenolic compounds in blood and urine is given in the following section.

Hydrolysis In foods, flavonoids are usually glycosylated and phenolic acids are ester-bound. Hydrolysis (acidic or enzymatic) is frequently used to simplify the analytical procedure, and the respective aglycones and free acids are subsequently detected and quantified. Since flavonoids in human specimens are likewise found largely in methylated, sulphated, and glucuronidated form, enzymatic hydrolysis by means of sulfatase and β-glucuronidase is used [1, 2, 8, 40], unless a study aims at determining the exact metabolite structures. Phenolic acids undergo further metabolism and degradation, although part of the native compounds is often excreted unmodified via urine.

Clean-up procedures For human specimens, such as plasma, serum, or urinary samples, solid phase extraction (SPE)-columns provide the most convenient solution for removing matrix compounds that would otherwise disturb the analysis [1, 2, 8]. However, some techniques, such as immunoassays, may require only minor sample preparation [40].

Separation and detection systems The two major separation techniques for the quantification of polyphenolics are HPLC and GC–MS, although the use of LC–MS/MS is becoming increasingly common [1, 8, 24, 32]. For flavonoids, HPLC is the method of choice; the coupled detection systems include diode array detectors, mass-selective detectors as well as electrochemical or fluorometric detectors. Phenolic acids are often quantified by means of GC after derivatisation. Identification of compounds by means of mass fragmentation is regarded as the gold standard. However, a single mass-selective detector often fails to meet the requirements for sensitivity. Thus, HPLC–ESI–MS–MS systems and similarly coupled devices have also been used for the analysis of polyphenols [8, 15].

The availability of antibodies for isoflavones and lignans has allowed for the development of antibody-based assays with a high degree of sensitivity [40]. For the purpose of quantification, the fluorescence emitted is recorded by means of a plate reader, with the option of time-resolved measurement. To the best of our knowledge, the scientific literature contains no report on use of the metabonomics techniques to characterize clusters of polyphenolic compounds that can potentially be used as biomarkers.

Analytical validation

Analytical validity focuses on the ability of a test to measure accurately and reliably the biomarker of interest. The components of analytical validity are sensitivity and specificity, and test reliability [38].

Specificity When using mass-selective detection methods, the characteristic mass fragments are used to confirm the identity of the compounds. This differs from other methods, such as HPLC with UV/VIS detection, electrochemical detection (ECD), or fluorescence detection, in which peak identity cannot be confirmed with certainty [24, 32]. Antibody-based assays are also subject to failures in specificity due to cross-reactions with other matrix compounds. For all these methods, confirmation of their results by use of MS-based techniques is necessary. However, MS-based systems may have problems at concentrations close to the detection limit, at which the characteristic mass fragments may be absent.

Sensitivity Sensitivity is a major issue with regard to analytical methods for determining polyphenols in biological specimens. Particularly in epidemiological studies, in which the available sample volumes are usually very small, the sensitivity of a method can be decisive for whether it can be used or not (along with other factors such as analytical time and costs). Detection limits very close to 1 nmol/l of polyphenols have been reported for techniques involving HPLC–ECD, HPLC–MS–MS, LC–MS/MS, and TR-FIA (immunoassay) [2, 8, 40]. The sensitivity reported for HPLC–MS, LC–MS, GC–MS and HPLC with use of fluorometric detection is slightly lower [1, 24, 32].

Laboratory validity and reliability For each well-developed method, satisfactory figures concerning the analytical precision and accuracy are available. Precision is often described as the coefficient of variation (CV) between repeated measurements of the same samples. Accuracy, i.e. the deviation from the “true” concentration, is indicated as the percentage recovery of standard substances added to a sample (“spiked” sample). The best results are obtained with MS-based techniques (CV <5–7%, recovery 90–105%) [1, 8, 15]; the immunoassays are located at the lower end of the scale, with coefficients of variation close to or above 10% and recovery rates frequently <90% [40]. However, working at the detection limit of a method is always a challenge, and data on the validity of the methods are usually obtained at concentrations clearly above the detection limit.
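The two figures of merit named above can be computed as in the following minimal sketch. The function names and all concentrations are hypothetical and serve illustration only; the sample standard deviation (n − 1 denominator) is assumed, which individual laboratories may handle differently.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean of zero: CV is undefined")
    return 100.0 * stdev(values) / m


def percent_recovery(spiked, unspiked, added):
    """Percentage of a known added amount of standard that is found again."""
    if added <= 0:
        raise ValueError("added amount must be positive")
    return 100.0 * (spiked - unspiked) / added


# Hypothetical repeated measurements of one plasma sample (nmol/l).
replicates = [48.0, 50.0, 52.0]
print(f"CV = {coefficient_of_variation(replicates):.1f}%")

# Hypothetical spiking experiment: 50 nmol/l of standard added.
print(f"recovery = {percent_recovery(95.0, 50.0, 50.0):.0f}%")
```

Intra-assay CVs would be obtained from replicates within one batch, and inter-assay CVs from the same quality-control sample measured across batches.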

Epidemiological validation

Epidemiologic validation aims at characterizing the variability of a certain biomarker within the population. The main components of biomarker variability relevant for the conduct and interpretation of epidemiological studies are the biological variability both within a subject (intra-subject variation) and between subjects (inter-subject variation), variability due to measurement error (see above), and variability due to random error [38, 39]. Repeat samples from the study subjects over a longer time period (weeks, up to years) are necessary to describe intra- and inter-subject variation. In addition, potentially relevant information on the study participants that may influence inter-subject variation, and information on the conditions under which samples have been collected, stored, and analyzed (e.g. fasting status, season of blood sampling, batch of the assay, differential handling of specimens from cases and controls, knowledge about the case–control status, etc.) should be collected [33]. Such information can possibly be used for adjustment or stratification in the statistical analysis in order to minimize measurement error. When studying the relationship of a certain biomarker with a health outcome, intra-group variability (e.g. within cases or within controls) should be minimized to allow for the identification of inter-group differences (e.g. difference between cases and controls), if they exist [38]. For illustration purposes, an example based on our own laboratory data is given below.

Example: intra- and inter-subject variation in plasma concentrations of selected polyphenolic compounds

Enterolactone

Plasma enterolactone concentrations were measured by means of a time-resolved fluorescence immunoassay. Concerning the analytical validity, intra- and inter-assay coefficients of variation ranged between 3.1–6.1 and 6.1–8.6%, respectively. Using two plasma samples that were analyzed in each batch for quality control, a CV <10% was obtained [29]. For the purpose of assessing intra- and inter-subject variability, plasma enterolactone concentrations were analyzed in samples obtained in a small intervention study administering flaxseed over seven consecutive days (for further details see [7]). Three fasting blood samples were taken: 1 week before starting the intervention, on the day when the intervention started, and 2 weeks after the intervention phase. The average inter-individual CV in fasting plasma enterolactone concentrations was 97%. The CVs describing the variation within subjects are listed in Table 2. With the exception of subject B, the CV was below 50%, in three subjects even below 10%. These data indicate that a kind of steady-state plasma concentration can be reached that describes a subject’s exposure to enterolignans. However, it is well described that the use of antibiotics drastically diminishes enterolactone concentrations over a longer time period [17]; this additional information has to be obtained from the subjects in order to be considered in the statistical analysis.

Other flavonoids and phenolic acids

The above-mentioned study also measured plasma concentrations of a series of flavonoids and phenolic acids. A detailed description of the HPLC–ECD based method (including enzymatic hydrolysis) and the results of the analytic validity are given elsewhere [2]. Briefly, recovery of added standard compounds varied between 81 and 106%. Intra- and inter-assay coefficients of variation ranged between 1.0–6.5 and 1.5–9.6%, respectively.

When applying this method to three fasting blood samples per subject, the inter-individual coefficients of variation (CV) ranged between 52% (quercetin) and 265% (p-coumaric acid). The CVs describing the within-subject variation are listed in Table 2. For each compound, mean intra-subject variation was distinctly below the CV for inter-subject variation. Intra-subject variation was lowest for the flavonols quercetin and kaempferol, followed by isorhamnetin, the flavone luteolin, and the hydroxycinnamic acids ferulic acid and caffeic acid. An intra-subject CV above 100% was observed for the flavanones (hesperetin), some hydroxybenzoic acids (vanillic and salicylic acid), and p-coumaric acid. Thus, several compounds are possible candidates for biomarker use. In the case of flavonoids and phenolic acids, there are also good examples of further factors that affect the validity of the measurements and should be carefully considered in the study design and statistical evaluation. Season is expected to distinctly affect the intake of certain foods rich in specific phenolic compounds; thus, the extent of variation of biomarker concentrations by season needs to be investigated, and season of blood (or urine) collection should be considered in the statistical analysis. In addition, many polyphenolic compounds are subject to oxidative damage during sample handling and possibly also during sample storage. Addition of antioxidants is recommended to increase the stability of polyphenols during sample preparation and extraction [2].
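The intra- and inter-subject CVs discussed in this example can be derived from repeated measurements as sketched below. This is an illustration under stated assumptions, not the procedure used for Table 2: the concentrations are invented, and the inter-subject CV is taken here as the CV of the subject means, whereas an alternative would be to average the between-subject CVs obtained at each sampling time point.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample SD / mean)."""
    return 100.0 * stdev(values) / mean(values)


def intra_subject_cvs(data):
    """CV of the repeated measurements, separately for each subject."""
    return {subject: cv_percent(vals) for subject, vals in data.items()}


def inter_subject_cv(data):
    """CV across subjects, based on each subject's mean concentration."""
    return cv_percent([mean(vals) for vals in data.values()])


# Hypothetical fasting plasma concentrations (nmol/l), three samples each.
plasma = {
    "A": [9.0, 10.0, 11.0],
    "B": [2.0, 10.0, 18.0],
    "C": [38.0, 40.0, 42.0],
}

for subject, cv in intra_subject_cvs(plasma).items():
    print(f"subject {subject}: intra-subject CV = {cv:.1f}%")
print(f"inter-subject CV = {inter_subject_cv(plasma):.1f}%")
```

A compound is a more promising biomarker candidate when the intra-subject CVs are small relative to the inter-subject CV, as a single sample then ranks subjects more reliably.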

Relationship of biomarkers to dietary intake

Integration of biomarker measurements in bioavailability studies and other short-term interventional studies

Several short-term intervention studies demonstrated the applicability of biomarkers of polyphenolic compounds and confirmed the direct relationship between dietary intake and biomarker concentrations. A recent report summarized all scientific studies (n = 97) that have been conducted so far on the bioavailability of polyphenols [22]. Most of the studies concerned only one or few compounds within a given subclass of polyphenols. Kinetic data from this report are summarized in Table 1 (according to ref. [22]). Relatively low plasma concentrations were obtained in each case, even after the administration of polyphenol preparations or polyphenol-rich food corresponding to 50 mg aglycone equivalents. The large differences in bioavailability between the various compounds and classes of polyphenols are striking.

Also, a review of short-term intervention studies (n = 93) in which polyphenols were administered (either as isolated compounds or in the form of foods or food extracts) to human subjects was published recently [41]. The studies included in this review examined the biological effects of polyphenols. Measurement of biomarkers of dietary intake was used to describe the internal dose of the compounds under investigation and partly also to control for adherence to the study protocol.

Integration of biomarker measurements in cross-sectional studies

Various studies have investigated the suitability of fasting plasma or urinary concentrations of polyphenols (mainly flavonols, flavanones or isoflavones) as biomarkers of polyphenol intake [2, 4, 6, 26, 27, 30, 31, 37]. Usually, plasma samples were taken after overnight fast and urine was collected as spot urine or over 24 h. The results of these studies suggest that the biomarker concentrations reasonably reflected short-term intake of the polyphenols under investigation, although one study failed to support this conclusion [6]. A fairly high variation in plasma polyphenol concentrations of free-living subjects following their habitual diet was described [30]. Statistically significant correlations between estimates of the dietary intake before blood sampling and fasting plasma concentrations of polyphenols (quercetin, kaempferol, naringenin, hesperetin) were reported with correlation coefficients of 0.30–0.46 and 0.42–0.64, respectively, when considering the diet over 1 week or 1 day before blood sampling [30]. However, it has to be pointed out that the validity of such correlations may be limited by the lack of precision in the estimates of dietary polyphenol intake. Due to the short half-life time of most polyphenols, steady-state plasma concentrations can only be achieved if the compounds are consumed regularly, a precondition most likely to be fulfilled by compounds such as kaempferol that are widely distributed in plant foods. Urinary excretion of polyphenols in subjects on habitual diets was also shown to be significantly correlated with estimates of short-term intake of fruits and vegetables, the correlation coefficients for selected flavonols and flavanones ranging from 0.28 to 0.38 [26]. The urinary polyphenol excretion rates show a high degree of variability just as described for the plasma concentrations. 
To give an example of polyphenol excretion in 24-h urine samples of subjects on a habitual diet, Nielsen and coworkers [26] reported average (SD) concentrations of quercetin, kaempferol, naringenin, phloretin, and total flavonoids as being 25 (23), 50 (32), 701 (659), 76 (110), 1638 (1316) µg/24 h, respectively.

Plasma and urinary polyphenol concentrations are not expected to reflect long-term or habitual dietary intake, although this has not been investigated extensively. One study reported correlation coefficients between 0.24 and 0.74 for plasma isoflavone concentrations and the dietary intake estimated from FFQ data [37].

Integration of biomarkers into studies of diet and cancer risk

In large-scale epidemiological (etiological) studies on disease-related effects of dietary polyphenolic compounds, little use has been made of biomarker measurements. Hertog and colleagues [10] were the first to analyze commonly consumed foods in terms of their flavonol and flavone content by means of HPLC; their work provided the basis for intake estimations of dietary flavonols and flavones. In the following years, several studies on associations with the risk of cardiovascular disease or cancer of different sites were conducted, some with promising results [11]. Except for studies on phytoestrogens (isoflavones and lignans), the available literature provides only a few reports in which biomarker measurements of polyphenolic compounds were applied (Table 3). In all four identified studies, measurements were conducted in spot urine samples, analyzing for flavanones (citrus fruit) [3], catechins (tea polyphenols) [34, 43] or a summary measure of phenolic compounds [45]. The investigations were performed in the Shanghai Breast Cancer Study (case–control study) [3, 45], and in the prospective Shanghai Cohort Study [34, 43].

Table 3 Epidemiologic studies of the risk of cancer using biomarkers of dietary polyphenol intake (excluding isoflavones and lignans)

Biomarkers of phytoestrogen intake, however, were measured in a series of studies using both plasma and urinary samples. In particular, the association with the risk of breast and prostate cancer was investigated. Concerning breast cancer, the identified scientific studies are summarized in Table 4, including six case–control studies and an equal number of case–control studies nested in cohort studies. The major reason for the frequent measurement of biomarkers of phytoestrogen intake in epidemiological studies was the availability of immunoassays that are appropriate for studies with large sample numbers and require only small sample volumes. Sophisticated and time-consuming methods often do not allow for analyzing the number of samples that would be required to ensure sufficient statistical power. However, phytoestrogen analysis was performed by means of an isotope dilution liquid chromatography/tandem mass-spectrometry method in the work of Grace et al. [9] and Verheus et al. [36] (Table 4). Due to the concomitant measurement of several phytoestrogens in the same sample and the high analytic validity, this sophisticated method became time- and cost-efficient as compared to other available methods.

Table 4 Epidemiological studies of the risk of breast cancer using biomarkers of dietary isoflavone intake and/or mammalian lignans

In view of the usually rather low sample volumes available in epidemiologic studies, the analysis of polyphenols is in most cases restricted to one or few compounds. However, analytical procedures that permit a variety of polyphenols to be determined in a single run were published recently [1, 2, 8, 15].

Many of the studies listed in Tables 3 and 4 were designed as case–control studies. An effect of diagnosis and treatment of cancer on the biomarker levels seems possible and could lead to biased results. Therefore, prospective cohort studies with biological material obtained years before onset of disease are the studies of choice for the application of biomarker measurements. In almost all large cohort studies around the world, sampling of blood or urine was performed only once, in most instances at recruitment of the study participants. However, biomarker measurement in repeated samples over time would distinctly decrease measurement error in exposure data. This is especially true for dietary compounds with a possibly high variability in intake, such as several flavonoids and phenolic acids.
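The gain from repeated sampling can be made concrete under the classical measurement error model. The sketch below is an illustration only: it assumes additive, non-differential within-subject error and a linear exposure-outcome relationship, and the variance components are hypothetical values, not estimates from the studies cited.

```python
def reliability(between_var: float, within_var: float, k: int = 1) -> float:
    """Reliability of the mean of k repeated measurements per subject.

    Ratio of between-subject variance to total variance of the k-sample mean.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if between_var <= 0 or within_var < 0:
        raise ValueError("variances must be positive")
    return between_var / (between_var + within_var / k)


def attenuated_slope(true_slope, between_var, within_var, k=1):
    """Expected observed regression slope under classical measurement error."""
    return true_slope * reliability(between_var, within_var, k)


# Hypothetical case: within-subject variance equal to between-subject variance.
for k in (1, 2, 3, 5):
    lam = reliability(1.0, 1.0, k)
    print(f"{k} sample(s) per subject: reliability = {lam:.2f}")
```

In this hypothetical case a single sample halves the expected slope, whereas the mean of three samples retains three quarters of it, which illustrates why compounds with high intra-subject variability benefit most from repeated collection.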

Conclusion

Biomarkers are promising in that they provide a more accurate and objective measure of the dietary intake of polyphenolic compounds than estimates based on current or habitual dietary intake. However, due to the short half-life of these substances, fairly regular dietary intake is necessary to achieve a kind of steady-state level in biological specimens. Whether this is achieved depends upon the dietary habits of the investigated population; for example, it should hold for isoflavone intake in Asian populations and possibly for lignan or flavonol (kaempferol) intake in Western populations. For the use of biomarkers in epidemiological studies, it is essential that their analytical and epidemiological validity have been investigated in depth. It should be kept in mind that in most cohort studies biological specimens are collected only once, mostly at recruitment. Thus, the biomarker approach will not provide a solution to all problems encountered with (long-term) dietary intake calculations. A combination of methods will probably be the most valuable choice.