1 Introduction

Molecular genetic technology allows the characterization of single nucleotide polymorphisms (SNPs), the most abundant form of genetic variation in the human genome. Studies relating SNPs to human disease can help to identify precise molecular markers for diagnostic and prognostic purposes and, possibly, novel therapeutic targets (Havill and Dyer 2005; Suh and Vijg 2005).

The polymorphism at codon 72 (Arg72Pro; dbSNP ID: rs1042522) of the TP53 gene is an important and widely studied SNP that may be associated with cancer risk (De Moura Gallo et al. 2005; Olivier et al. 2010). Codon 72 lies in exon 4 and has either the sequence CCC, which encodes proline (Pro), a small, non-polar amino acid residue, or CGC, which encodes arginine (Arg), a large, polar amino acid residue (Wang et al. 1999; Whibley et al. 2009). Both variants have been considered wild type, and the substitution is a non-conservative change (Thomas et al. 1999; Pietsch et al. 2006). The arginine and proline variants of the Arg72Pro SNP have been reported to be related to the occurrence of different types of cancer (Donehower 2005; Pietsch et al. 2006). The proline variant has been reported to increase the risk of thyroid (Granja et al. 2004), gastric (Yi and Lee 2006), bladder (Pandith et al. 2010) and cervical cancer (Ye et al. 2010), whereas other studies have reported that the arginine variant increases the risk of other cancer types, including breast (Kalemi et al. 2005) and colon and rectal cancer (Perez et al. 2006).

Only a few authors have studied the distribution of the Arg72Pro SNP in the general population. Interestingly, one study reported that the frequency of the arginine allele depends on latitude and is tightly associated with cold winter temperature (Shi et al. 2009). A recent study of 52 populations worldwide found that the proline allele was more prevalent among African populations and the arginine allele among European populations (Sucheston et al. 2011). An Indian study investigated the status of the Arg72Pro SNP in different ethnic groups of North India. One of the groups studied (the Ladakhi population), which was geographically isolated from the other groups and of oriental descent, had a frequency of Arg/Arg significantly different from all the other groups (which presented higher frequencies of proline-carrying genotypes). The authors suggest that these differences are related to racial differences and geographical isolation (Chosdol et al. 2002). A European study of a Danish cohort found higher frequencies of genotypes carrying the arginine allele (Arg/Arg and Arg/Pro). This study also reported an important association between this SNP and increased survival after diagnosis of cancer or other life-threatening diseases, as well as longevity. Moreover, the authors suggest that this SNP is an important gain-of-function genetic variant affecting the entire person (Orsted et al. 2007). Describing the genotypic and allelic distribution of TP53 may allow the identification of groups more susceptible to the development of chronic diseases. The present study evaluated the frequency of the Arg72Pro SNP of the TP53 gene in the 1982 birth cohort of Pelotas (Brazil) and its association with epidemiological parameters of this cohort, such as current, demographic and birth variables.

2 Materials and methods

2.1 Study population

Pelotas is a city in southern Brazil with a population of 350000 inhabitants. In 1982, all the city’s hospitals were visited daily and the hospital births were identified. The 5914 live births (representing over 99% of all deliveries) whose mothers lived in the urban area of the city were examined and their mothers interviewed. The cohort has been followed up several times and the study methodology has been described previously (Victora and Barros 2006; Barros et al. 2008). From October 2004 to August 2005, all households located in the urban area of the city were visited in search of subjects belonging to the cohort. For those who had not been located and were not known to have died, the last known address and existing databases (including universities, secondary schools and telephone directories) were used for another attempt. Subjects answered a questionnaire on sociodemographic, health and behavioural variables and were invited to visit the research laboratory and give a blood sample (Victora and Barros 2006; Barros et al. 2008; Nazmi et al. 2008, 2009). DNA was extracted as previously described (Miller et al. 1988), diluted into three aliquots and stored at −112°F (−80°C). These DNA samples are of satisfactory quality and quantity for genetic studies. This study was approved by the Research Ethics Committee of the Federal University of Pelotas.

2.2 Genotyping

The Arg72Pro SNP of the TP53 gene was genotyped by PCR-RFLP using previously described primers (Lin et al. 2008). Each PCR reaction contained 5 pmol of each primer, 10 ng of genomic DNA, 6 μl of GoTaq® qPCR Master Mix (Promega, USA) and DNase/RNase-free H2O to a total reaction volume of 12 μl. The reaction was performed under the following conditions: 94°C for 5 min; 35 cycles of 94°C for 1 min, 57°C for 1 min and 72°C for 1 min; and 72°C for 5 min. The amplified PCR product of 199 bp was genotyped by digestion with 2 units of the restriction enzyme BstUI (New England Biolabs, MA) and incubation at 60°C for 3 h. DNA fragments were electrophoresed through a 2.5% agarose gel and stained with GelRed™ (Biotium Inc., CA). The Pro72Pro genotype yielded a single fragment of 199 bp, the heterozygous genotype yielded three bands of 199, 113 and 86 bp, and the Arg72Arg genotype yielded two bands of 113 and 86 bp. All assays were conducted blinded.
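The band patterns described above can be expressed as a simple decision rule. The following is a minimal sketch in Python, not the procedure used in this study (gels were read by eye); the function name `call_genotype` and the 5 bp size tolerance are assumptions introduced only for illustration.

```python
# Illustrative sketch: call the Arg72Pro genotype from BstUI digest band sizes (bp).
# Band sizes follow the text: the uncut 199 bp product marks the Pro allele,
# the 113 + 86 bp pair marks the Arg allele. The tolerance is a hypothetical value.

UNCUT_BP = 199
CUT_BP = (113, 86)


def call_genotype(bands, tolerance=5):
    """Return 'Pro/Pro', 'Arg/Pro', 'Arg/Arg', or None for an unexpected pattern."""

    def has(size):
        return any(abs(band - size) <= tolerance for band in bands)

    expected = (UNCUT_BP,) + CUT_BP
    # Reject empty lanes and lanes containing a band that matches no expected size.
    if not bands or not all(
        any(abs(band - size) <= tolerance for size in expected) for band in bands
    ):
        return None

    uncut = has(UNCUT_BP)
    cut = [has(size) for size in CUT_BP]
    if uncut and all(cut):
        return "Arg/Pro"   # both alleles present
    if uncut and not any(cut):
        return "Pro/Pro"   # product not digested
    if not uncut and all(cut):
        return "Arg/Arg"   # product fully digested
    return None            # e.g. only one of the two cut fragments is visible


print(call_genotype([199, 113, 86]))
```

Patterns that do not match one of the three expected profiles return `None`, so that such samples would be repeated rather than forced into a genotype.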

2.3 Sequencing

The Arg72Pro genotypes were confirmed by sequencing with both of the previously described primers (Lin et al. 2008). Before sequencing, PCR products were purified using the GFX PCR DNA and Gel Band purification kit according to the manufacturer’s instructions (GE Healthcare, USA). Sequencing was performed on a MegaBACE 1000 DNA sequencer (GE Healthcare, USA) using the DYEnamic ET terminator technology. Chromatograms were assembled and analysed using the ContigExpress® module of the Vector NTI 10.0 suite (Invitrogen, USA).

2.4 Statistical analyses

The chi-square (χ²) test was used to evaluate the association between the Arg72Pro genotypes and the current and gestational variables of the 1982 cohort: demographic, socioeconomic, anthropometric, physical activity, family history of cancer, and variables related to pregnancy and birth. Skin colour was self-reported by the members of the 1982 birth cohort and classified as white, black or other, the last category including mulatto, yellow, indigenous and dark. Socioeconomic status was defined following the Brazil Economic Classification Criterion, which assigns points for household characteristics, the level of education of the household head and the purchasing power of individuals. Class A, the highest, has 35 to 46 points; class B has 23 to 34 points; class C has 14 to 22 points; class D has 8 to 13 points; and class E, the lowest, has 0 to 7 points. Sedentary behaviour was assessed at the 2004–2005 follow-up of the 1982 birth cohort using the International Physical Activity Questionnaire (IPAQ). Subjects with less than 150 min of weekly physical activity were considered sedentary. Small for gestational age (SGA) was defined as follows: children whose birth weight for gestational age and sex was below the 10th percentile of the Williams reference population were considered small for their gestational age. Hardy–Weinberg equilibrium was also evaluated. Poisson regression was used in the multivariate analysis to control for the confounding effect of skin colour on all the variables. All data analyses were performed using Stata 11.0.
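The cut-offs described above for socioeconomic class and sedentary status can be written as two small classification rules. The following Python sketch is illustrative only (the analyses in this study were run in Stata); the function names `socioeconomic_class` and `is_sedentary` are assumptions, while the point ranges and the 150 min threshold are those stated in the text.

```python
# Illustrative sketch of the classification rules stated in the text.
# Point ranges follow the Brazil Economic Classification Criterion as described above.

CLASS_RANGES = [
    ("A", 35, 46),  # highest class
    ("B", 23, 34),
    ("C", 14, 22),
    ("D", 8, 13),
    ("E", 0, 7),    # lowest class
]

SEDENTARY_THRESHOLD_MIN = 150  # weekly minutes of physical activity


def socioeconomic_class(points):
    """Map a household score (0 to 46 points) to class A to E."""
    for label, low, high in CLASS_RANGES:
        if low <= points <= high:
            return label
    raise ValueError(f"score out of range: {points}")


def is_sedentary(weekly_activity_min):
    """A subject is sedentary when weekly activity is below 150 min."""
    return weekly_activity_min < SEDENTARY_THRESHOLD_MIN


print(socioeconomic_class(25), is_sedentary(120))
```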

3 Results

In the 2004–2005 follow-up visits, 4297 subjects were interviewed. Together with the 282 cohort members known to have died, they represented a follow-up rate of 77.4%. Among the 4297 subjects interviewed, 3914 went to the laboratory to donate a blood sample. The Arg72Pro SNP of the TP53 gene was genotyped in 3794 individuals.
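The follow-up rate above counts both the subjects interviewed and those known to have died, relative to the 5914 live births of the original cohort. A short Python check of this arithmetic, in which the function name `follow_up_rate` is introduced only for illustration:

```python
# Arithmetic check of the reported follow-up rate, using the figures in the text.

def follow_up_rate(interviewed, known_deaths, cohort_size):
    """Percentage of the original cohort that was traced (interviewed or known dead)."""
    return 100.0 * (interviewed + known_deaths) / cohort_size


rate = follow_up_rate(interviewed=4297, known_deaths=282, cohort_size=5914)
print(f"{rate:.1f}%")  # 77.4%
```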

The genotype distribution in the 1982 birth cohort was in Hardy–Weinberg equilibrium. Table 1 shows the frequency distributions of the Arg72Pro genotypes and alleles in the 1982 birth cohort and according to skin colour. The genotype frequencies were 46.9% Arg/Arg, 42.2% Arg/Pro and only 10.9% Pro/Pro, while the allele frequencies were 0.68 for arginine and 0.32 for proline. With respect to skin colour, black subjects were less likely to carry an arginine allele, whereas the proportion of proline homozygotes was higher among these subjects.
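The allele frequencies above follow from the genotype frequencies by allele counting: each homozygote contributes two copies of its allele and each heterozygote one copy of each. The Python sketch below reproduces this calculation from the rounded percentages in the text and lists the genotype proportions expected under Hardy–Weinberg equilibrium. It is not the equilibrium test performed in this study, which would require the genotype counts of Table 1; the function names are assumptions.

```python
# Illustrative sketch: allele frequencies and Hardy-Weinberg expectations
# computed from the rounded genotype frequencies reported in the text.

def allele_frequencies(f_arg_arg, f_arg_pro, f_pro_pro):
    """Return (Arg, Pro) allele frequencies from genotype frequencies."""
    total = f_arg_arg + f_arg_pro + f_pro_pro
    p_arg = (f_arg_arg + 0.5 * f_arg_pro) / total
    return p_arg, 1.0 - p_arg


def hardy_weinberg_expected(p_arg):
    """Expected (Arg/Arg, Arg/Pro, Pro/Pro) proportions: p^2, 2pq, q^2."""
    q_pro = 1.0 - p_arg
    return p_arg ** 2, 2.0 * p_arg * q_pro, q_pro ** 2


p_arg, q_pro = allele_frequencies(0.469, 0.422, 0.109)
print(round(p_arg, 2), round(q_pro, 2))  # 0.68 0.32

expected = hardy_weinberg_expected(p_arg)
print([round(x, 3) for x in expected])  # [0.462, 0.435, 0.102]
```

The expected proportions (about 46.2%, 43.5% and 10.2%) are close to the observed 46.9%, 42.2% and 10.9%.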

Table 1 Frequency distribution of genotypes and alleles of Arg72Pro among the skin colour of 1982 birth cohort

Table 2 shows the frequency distribution of the 1982 birth cohort variables according to skin colour. Schooling and socioeconomic status were associated with skin colour. The white group had a higher frequency of subjects with 12 or more years of schooling than the black or other groups (P < 0.001). Likewise, the white group had a higher frequency of subjects in the higher socioeconomic classes than the black or other groups (P < 0.001). Height, body mass index (BMI) and total sedentary status were also associated with skin colour (P = 0.019, P = 0.008 and P < 0.001, respectively). Family history of cancer showed no association with skin colour. Among the variables related to pregnancy and birth, maternal smoking in pregnancy was associated with skin colour (P < 0.001), and for birth weight, the lower weight group was more frequent among black subjects (P = 0.011).

Table 2 Frequency distributions of the 1982 birth cohort variables among skin colour

The analysis of the Arg72Pro SNP of the TP53 gene against the variables of the 1982 birth cohort, controlled for skin colour, is presented in table 3. The main findings regarding the influence of this SNP on the characteristics of the 1982 cohort were related to family history of cancer and birth weight.

Table 3 Prevalence ratio adjusted for skin colour

In the cohort, 6.99% had a family history of cancer. An association was found between the arginine allele and individuals without a family history of cancer, with a prevalence ratio of 0.94 (95% confidence interval: 0.90–0.99) and a P-value of 0.028. Among individuals without a family history of cancer, 89.4% were arginine carriers. For birth weight, an association was found between birth weight below 2500 g and the arginine allele, with a prevalence ratio of 1.06 (95% confidence interval: 1.03–1.10) and P < 0.001. In the low birth weight group, 93.2% were arginine carriers. However, the genotype analysis of this variable showed a prevalence ratio of 0.95 (95% confidence interval: 0.91–1.00) and a P-value of 0.052.

4 Discussion

Arg72Pro is the most extensively studied SNP of the TP53 gene and has been associated with different types of cancer (Pietsch et al. 2006), with increased survival after diagnosis of cancer or other life-threatening diseases and with longevity (Orsted et al. 2007), as well as with human reproductive parameters (Hu 2009). In the 1982 birth cohort, 6.99% of the population has a first-degree family member with a history of cancer, the most prevalent cancer types being breast, prostate, lung, stomach, cervical, brain and pharynx cancer and leukemia. We found an association between individuals without a family history of cancer and the arginine allele, with about 89% of these individuals carrying the arginine allele. Dumont et al. (2003) found that the arginine variant of the Arg72Pro SNP is more efficient at inducing apoptosis than the proline variant. Recent studies reported that carriage of the arginine allele is associated with a decreased risk of esophageal cancer (Jiang et al. 2011) and of bladder cancer (Jiang et al. 2010), acting as a protective factor in both cases.

Concerning skin colour, our study reveals a clear association. Whereas the arginine allele was more frequent among white individuals, the proline allele was more prevalent among individuals with black skin colour. Studies that have genotyped the Arg72Pro SNP in populations of diverse countries have found different frequencies of this polymorphism, showing that this SNP could depend on the characteristics of each population. One study discussed the genetic adaptations that have occurred in response to environmental pressures since our ancestors left Africa about 100000 years ago. The authors state that when humans left the African continent and started to explore northward to high-latitude regions, the arginine allele of the Arg72Pro SNP was selected by winter temperature and enriched in the populations living in the north (Shi and Su 2011). Recently, a study that genotyped the Arg72Pro SNP in 971 individuals from 52 populations worldwide noted a difference in allele frequencies among populations and found that the ancestral C allele, encoding the Pro variant, predominates among Africans, whereas the G allele, encoding the Arg variant, is more common among Europeans (Sucheston et al. 2011). A study from north India identified a higher frequency of the Arg/Arg genotype in the Ladakhi population, which is geographically isolated from the other groups genotyped and is of Oriental descent. In contrast, the other Indian groups presented higher frequencies of genotypes carrying the proline allele (Arg/Pro and Pro/Pro). The authors concluded that the difference in prevalence of these variants could be due to racial differences and geographic isolation (Chosdol et al. 2002). Interestingly, a European study of a Danish cohort reported higher frequencies of arginine-carrying genotypes (Arg/Arg and Arg/Pro). This study also reported that Arg/Pro and Pro/Pro carriers have reduced mortality, which could result from a generally slower aging process. The authors concluded that this SNP leads to increased longevity and to increased survival after diagnosis of cancer or other life-threatening diseases, making it an important variant affecting various aspects of human development (Orsted et al. 2007). Other studies have also associated longevity with the Arg72Pro SNP. A study from Italy and Sardinia genotyped young people and centenarians for the Arg72Pro SNP and observed that the proline allele was slightly more frequent in centenarians than in young people (Bonafe et al. 1999). Another study also suggests a role for this SNP in longevity, stating that, despite having an increased cancer incidence, individuals carrying the Pro/Pro genotype exhibited enhanced survival compared with Arg/Arg and Arg/Pro individuals (Donehower 2005). Together, these studies underline the importance of epidemiological studies at the molecular level.

Another important feature relating the p53 protein to skin colour is melanogenesis. Studies show that p53 initiates the activation of melanogenesis via the POMC (pro-opiomelanocortin) gene in keratinocytes in response to ultraviolet (UV) radiation. The POMC gene product is cleaved, generating α-MSH (α-melanocyte stimulating hormone), which activates the MC1R (melanocortin 1 receptor) in melanocytes (Cui et al. 2007). Another study indicates that the proline allele might provide a protective effect by potently stimulating POMC expression in response to UV radiation (Miller and Tsao 2010). A recent study also demonstrated differences in gene expression between ethnic groups (Sharma et al. 2011).

Regarding the gestational variables, we found an association between birth weight and the arginine allele, with the low birth weight group showing the higher frequency of the arginine allele, about 93%. The genotype analysis showed a P-value of 0.052 in the same group. Previous authors report that the p53 protein has a role in development, in which the right balance of p53 levels is essential, and that a malfunction of this regulation may lead to embryonic lethality or malformations (Choi and Donehower 1999; Danilova et al. 2008). Birth weight has also been associated with obesity in adulthood, which can lead to health problems (Leong et al. 2003). Therefore, considering the role of p53 in embryonic development and the relevance of birth weight to human development, the analysis of a SNP in the p53 gene and its role in birth weight is of importance. A recent study examined a cohort of pregnant women and newborns from places with different levels of air pollution. Besides air pollution, the authors also studied polymorphisms in genes related to DNA repair, oxidative stress, xenobiotic metabolism and immune function, including the Arg72Pro SNP, for which they found no association (Rossner et al. 2010).

In summary, we observed important associations between the Arg72Pro SNP and epidemiological variables of the 1982 birth cohort. Mainly, skin colour was associated with the genotypes and both alleles of the Arg72Pro SNP, and absence of a family history of cancer was associated with the arginine allele of the same SNP. In addition, low birth weight was related to the arginine allele of the Arg72Pro SNP.

Some of the statistically significant associations could be due to type I error introduced by multiple testing. Such error could explain the observed associations between genotype and family history of cancer and birth weight. If we had corrected the P-values of these associations for multiple testing, the null hypothesis would not have been rejected. We chose not to apply the correction for multiple testing because such corrections are conservative and could increase the chance of missing a true association. For skin colour, on the other hand, the association remained statistically significant even after adjustment for multiple comparisons. Furthermore, a similar association has been reported in other studies, which reinforces the likelihood of a true association between genotype and skin colour.
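The effect of a correction for multiple testing on a borderline P-value can be illustrated with the Bonferroni adjustment, in which each P-value is multiplied by the number of tests performed. The Python sketch below is an assumption-laden illustration: the text does not state which correction method or how many comparisons were considered, so the Bonferroni method and the value `N_TESTS = 10` are hypothetical. Only the P-value of 0.028 for family history of cancer is taken from the results.

```python
# Illustrative sketch of a Bonferroni adjustment.
# The number of tests is a hypothetical value chosen for illustration only.

ALPHA = 0.05
N_TESTS = 10  # hypothetical: the number of comparisons is not stated in the text


def bonferroni_adjust(p_value, n_tests):
    """Return the Bonferroni-adjusted P-value, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


adjusted = bonferroni_adjust(0.028, N_TESTS)  # P-value reported for family history of cancer
print(adjusted < ALPHA)  # False
```

Under this adjustment, a P-value of 0.028 exceeds the 0.05 threshold as soon as two or more tests are considered.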

The importance of gene–environment interactions has been emphasized as a means of clarifying susceptibility to many common complex diseases (Bookman et al. 2011). Thus, incorporating data on genetic variation into epidemiological research allows the genetic profile of a population to be characterized in association with robust epidemiological parameters, which can contribute to the identification of molecular markers and of susceptibility to diseases and conditions that have been widely studied in epidemiology.