Introduction

The human TP53 gene is located on chromosome 17p13.1 and spans 16–20 kilobases (kb). The gene has 11 exons that code for an mRNA of 2.2–2.5 kb, which is translated into a protein of 393 amino acids with a mass of approximately 53 kDa [1]. Both the exon-intron organization of the gene and the amino acid sequence of the protein are conserved across species [2]. The TP53 protein is a DNA-binding protein with transcription regulatory activity and comprises three domains: the amino-terminal domain, containing the activation domain and the proline-rich domain; the central core, containing the sequence-specific DNA-binding domain; and the multifunctional carboxy-terminal domain [3].

The majority of both sporadic and germline TP53 mutations found in cancer cells are missense mutations occurring mainly in the DNA-binding domain of the protein [4]. Most of these mutations destroy the ability of the protein to bind to its target DNA sequences and thus prevent transcriptional activation of the target genes; 98 % of the mutations reported so far in malignant tumors are clustered between exons 5 and 8 [5]. Extensive data from in vitro studies suggest that most missense mutations in TP53 can inhibit the function of the wild-type protein in a dominant negative manner, which would indicate that a heterozygous mutation in TP53 could functionally inactivate cellular TP53 and impair its regulation of downstream target genes. The TP53 protein binds DNA as a tetrameric complex, and a mutated protein within this complex may abolish the DNA-binding capacity of the entire complex. Experiments with ectopic expression of wild-type and mutant TP53 protein have demonstrated inhibition of DNA-binding activity and of transactivation of target genes [6]. However, there are conflicting data on this point, as well as general concern about the effects of ectopic expression on the results [7]. Approximately 50 % of human tumors harbor such mutations and, despite the possible dominant negative function of missense TP53 mutants, the remaining wild-type allele is mutated or lost, suggesting that complete loss of normal TP53 can promote tumorigenesis further [8]. The germline mutations of TP53 found in the Li–Fraumeni familial cancer syndrome, which includes breast cancer, have a mutational spectrum with a preponderance of transitions and few transversions; 44 % of the germline mutations are at CpG dinucleotides, a change that is frequent in mammalian genomes and may be attributed to spontaneous deamination of cytosine [9].

In view of the above background, exons 5–9 and intervening introns 5 and 7–9 of the TP53 gene were sequenced in breast cancer cases and matched controls to evaluate the mutational spectrum in the Indian population and its involvement in the development of breast cancer.

Materials and methods

A group of 182 breast cancer patients, including 9 male breast cancer cases, and 186 unrelated, healthy, age- and sex-matched controls without a family history of breast cancer or any other cancer were recruited for the study. Primary breast cancer cases were randomly selected from the patients attending Nizam’s Institute of Medical Sciences (NIMS) after confirmed diagnosis. The diagnosis of breast cancer was established by pathological examination, mammography, fine-needle aspiration cytology (FNAC), and biopsy at NIMS. Epidemiological history, such as age at onset of breast cancer, diet, socioeconomic status, occupation, reproductive history, family history, and consanguinity, was taken through personal interviews with the subjects using a specific pro forma. The patients were also screened for estrogen receptor, progesterone receptor, and HER-2/neu status. Clinical history, such as size of the tumor, presence of axillary nodes, metastasis, stage and type of the breast cancer, chemotherapeutic drugs used, and prognosis of the disease, was collected with the help of an oncologist. Five milliliters of blood was collected in EDTA vacutainers from patients and controls, from whom appropriate informed consent was obtained. The study was approved by the ethical committees of the Department of Genetics, Osmania University, Hyderabad and Nizam’s Institute of Medical Sciences, Hyderabad.

Molecular analysis

DNA was isolated as per Nurenberg et al. [10]. The aforesaid region of the TP53 gene was PCR-amplified using the following primer sets and PCR conditions:

  1. Primer set 1

    Forward 5′-TCACTTGTGCCCTGACTTTC-3′

    Reverse 5′-GGTTAAGGGTGGTTGTCAGTG-3′

  2. Primer set 2

    Forward 5′-CTGCTTGCCACAGGTCTC-3′

    Reverse 5′-GACAATGGCTCCTGGTTGTA-3′

The primers were designed using PRIMER 3 software [11] (http://primer3.sourceforge.net/). PCR was performed using 50 ng of DNA, 5 pmol of each primer, 200 μM of each of the four deoxyribonucleotides, 1 U of Taq polymerase, and 10× Taq buffer in a final reaction volume of 25 μl. PCR was performed through 35 cycles with initial denaturation at 94 °C for 2 min, denaturation at 95 °C for 1 min, annealing at 58 °C for 45 s, extension at 72 °C for 2.30 min, and final extension at 72 °C for 7 min. The amplification was checked on a 2 % agarose gel, and excess primers and dNTPs were removed from successfully amplified PCR products by ExoI/SAP cleanup. DNA sequencing was done using the BigDye Terminator v3.1 sequencing kit and a 3730 automated DNA analyzer from Applied Biosystems. Contig assembly and sequence alignment were accomplished with SeqScape v2.5 software from Applied Biosystems. Mutations were scored relative to the reference sequence (NM_000546.4), with each deviation confirmed by manual checking of electropherograms and by independent reactions using both forward and reverse primers.
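As a quick sanity check on the primer sets listed above, the following Python sketch computes the length, GC fraction, and a rough melting temperature for each primer. The sequences are taken from the text; the helper names and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) are assumptions made for illustration and are not the calculation used by PRIMER 3.

```python
# Primer sequences as listed in the text; dictionary keys are illustrative labels.
PRIMERS = {
    "set1_forward": "TCACTTGTGCCCTGACTTTC",
    "set1_reverse": "GGTTAAGGGTGGTTGTCAGTG",
    "set2_forward": "CTGCTTGCCACAGGTCTC",
    "set2_reverse": "GACAATGGCTCCTGGTTGTA",
}


def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C (Wallace rule: 2*(A+T) + 4*(G+C)).

    This is only a coarse approximation; primer design tools use
    nearest-neighbor thermodynamics instead.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: length={len(seq)}, GC={gc_fraction(seq):.2f}, "
          f"approx Tm={wallace_tm(seq)} C")
```

The Wallace estimate is crude for primers of this length, so it should be read only as a consistency check against the 58 °C annealing step, not as a replacement for the design software.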

Structural changes and involvement of CpG sites were examined using Swiss PDB and the P53 knowledge base database [12]. Multiple sequence alignments across 33 different organisms were done using CLUSTAL W [13] to determine whether each mutation lies in a conserved region. Codon usage for the synonymous mutations was determined using the “Codon Usage” tool available at www.bioinformatics.org/sms [14]. A codon that is used less frequently than expected has a ratio below 1, a codon used more frequently than expected has a ratio above 1, and a codon with no usage bias has a ratio of 1. Enhancer sites at the mutation positions were identified using RESCUE-ESE for exonic mutations [15] and the ACESCAN2 web server [16] for intronic mutations. Splice site prediction for the intronic mutations was done using www.fruitfly.org/seq-tools/splice.html.
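The observed-to-expected ratio described above can be sketched as follows. This is a minimal illustration that assumes "expected" means equal use of all synonymous codons for an amino acid (the usual relative synonymous codon usage definition); the output format of the “Codon Usage” tool itself may differ, and the codon counts below are hypothetical.

```python
PROLINE_CODONS = ("CCA", "CCC", "CCG", "CCT")


def relative_usage(counts, codons):
    """Observed count divided by the count expected under equal codon use.

    counts: mapping from codon to observed count.
    codons: the synonymous codons for one amino acid.
    A ratio below 1 means the codon is used less often than expected,
    above 1 more often, and exactly 1 means no bias.
    """
    total = sum(counts.get(codon, 0) for codon in codons)
    if total == 0:
        raise ValueError("no observations for this amino acid")
    expected = total / len(codons)
    return {codon: counts.get(codon, 0) / expected for codon in codons}


# Hypothetical proline codon counts, for illustration only.
example_counts = {"CCA": 30, "CCC": 40, "CCG": 10, "CCT": 20}
ratios = relative_usage(example_counts, PROLINE_CODONS)
for codon, ratio in ratios.items():
    print(f"{codon}: {ratio:.2f}")
```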

Statistical analysis

The results were analyzed using appropriate statistical tests in SPSS v11. Allele frequencies, Hardy-Weinberg equilibrium (HWE), the Chi-square test, and odds ratios were calculated to assess the significance of the mutations in the disease; haplotype analysis was done for all the mutations using PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/) [17]. Linkage disequilibrium (LD) was calculated with JLIN (http://www.genepi.org.au/jlin) [18].
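For readers unfamiliar with the HWE test mentioned above, a minimal Python sketch of the standard one-degree-of-freedom chi-square goodness-of-fit test is given here. It is not the SPSS or PLINK implementation, the function name is hypothetical, and the genotype counts in the example are invented for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic site.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the
    heterozygote. Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts: an excess of homozygotes.
chi2, p_value = hwe_chi_square(60, 80, 60)
print(f"chi2={chi2:.3f}, p={p_value:.4f}")
```

With very rare alleles the expected homozygote count becomes small, and an exact test is generally preferred over the chi-square approximation.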

Results

The statistical analysis of the nine mutations observed in the sequenced region of the studied samples and the corresponding electropherograms are given in Table 1 and Fig. 1. Of the nine mutations, three are intronic and six are exonic (Table 2). The Hardy-Weinberg equilibrium test was done for all the mutations in the total sample as well as in cases and controls separately. Two intronic mutations (C14181T and T14201G) and one exon 5 mutation (G13203A) showed deviation from HWE. The linkage disequilibrium plot of the observed mutations is presented in Fig. 2.

Table 1 Basic statistics for the observed mutations in the TP53 gene and association analysis with breast cancer
Fig. 1 Electropherograms of the observed mutations in the TP53 gene

Fig. 2 Linkage disequilibrium (LD) plot of the nine mutations observed in the TP53 gene

Intronic mutations

The mutations C14181T (rs12947788) and T14201G (rs12951053) in intron 7 are 20 base pairs apart and were found to be in strong LD (r² = 0.983; D′ = 1.00), as also reported previously [19, 20]. However, neither of these mutations showed a significant association with breast cancer.
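The two LD measures quoted above can be computed from haplotype and allele frequencies as in the sketch below. It uses the standard definitions of D, r², and D′ and is not the JLIN implementation; the function name and the frequencies in the example are hypothetical, chosen only to show how D′ can equal 1 while r² stays just below 1.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise linkage disequilibrium between two biallelic sites.

    p_ab: frequency of the haplotype carrying the minor allele at both sites.
    p_a, p_b: minor allele frequencies at the first and second site.
    Returns (D, r_squared, D_prime).
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' scales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, r_squared, d_prime


# Hypothetical frequencies: every copy of the first minor allele sits on a
# haplotype that also carries the second minor allele.
d, r_squared, d_prime = ld_stats(p_ab=0.20, p_a=0.20, p_b=0.21)
print(f"D={d:.3f}, r2={r_squared:.3f}, D'={d_prime:.2f}")
```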

Table 2 Classification of the observed TP53 gene mutation as per the literature source

The haplotype analysis of the intron 7 mutations (Table 3) showed an elevated frequency of the mutated TG haplotype in breast cancer patients (24 %) compared to controls (19 %), but the difference was not statistically significant. When haplotypes were compared with respect to other risk factors, premenopausal women with breast cancer had an elevated frequency of TG (26 %) compared to postmenopausal women (21 %). The frequency of the TG haplotype was slightly elevated in estrogen receptor-negative (25 %) and progesterone receptor-negative (23 %) patients compared to estrogen receptor-positive (19 %) and progesterone receptor-positive (22 %) patients. The TG haplotype was also elevated in patients with advanced-stage disease and in patients with higher body mass index. The frequency of the TG haplotype was elevated in patients with a family history of breast cancer (29 %) compared with nonfamilial cases (23 %). However, no statistically significant difference was observed for any of the categories tested.

Table 3 Haplotype frequency of intron 7 of TP53 gene and correlation with epidemiological and clinical factors of breast cancer

The T14766C (rs1800899) mutation in intron 9 was present in only three breast cancer patients and one control. Of these three patients, one was also positive for the intron 7 haplotype mutations; she had early onset of breast cancer and a bad prognosis. The other two patients were postmenopausal and had a normal prognosis. However, no statistically significant association with breast cancer was detected.

Exonic mutations

Six mutations (3 nonsynonymous and 3 synonymous) in the exonic region of TP53 were observed in 14 breast cancer patients and in none of the controls; they are presented in Table 2.
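The distinction between synonymous and nonsynonymous changes can be illustrated with a short Python sketch. The three proline codon changes are those reported in this study for codons 153, 222, and 301; the codon-level change shown for codon 175 (CGC to CAC) is an assumption added for illustration, since the text gives only the amino acid substitution. The lookup table covers just the codons needed here, and the function name is hypothetical.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TO_AA = {
    "CCA": "Pro", "CCC": "Pro", "CCG": "Pro", "CCT": "Pro",
    "CGC": "Arg", "CAC": "His",
}


def classify_change(wild_codon, mutant_codon):
    """Label a single-codon change as synonymous or nonsynonymous."""
    wild_aa = CODON_TO_AA[wild_codon]
    mutant_aa = CODON_TO_AA[mutant_codon]
    return "synonymous" if wild_aa == mutant_aa else "nonsynonymous"


changes = {
    "C13138T (codon 153)": ("CCC", "CCT"),
    "G13426A (codon 222)": ("CCG", "CCA"),
    "A14572G (codon 301)": ("CCA", "CCG"),
    # Assumed codon change for the arginine-to-histidine substitution.
    "G13203A (codon 175)": ("CGC", "CAC"),
}

for name, (wild, mutant) in changes.items():
    print(f"{name}: {wild} -> {mutant} is {classify_change(wild, mutant)}")
```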

Nonsynonymous mutation

Mutation G13203A (rs28934578) in exon 5, located at a CpG site, leads to the substitution of arginine by histidine at codon 175 and induces the formation of a hydrogen bond between histidine 175 and histidine 193. This bond forms between the side-chain nitrogen atom of histidine 175 and the main-chain oxygen atom of histidine 193 (Fig. 3; panel A). The mutation was observed in three patients (one A/A homozygote and two G/A heterozygotes). The patient with the ‘AA’ homozygous genotype had a family history of breast cancer, was premenopausal, and had advanced-stage disease with a bad prognosis. The mutation shows a statistically significant association with breast cancer (χ² = 4.044; P = 0.04).

Fig. 3 Structural changes of wild and mutant TP53 missense mutations. a G13203A, b G13229A, and c A14456G

A novel sporadic heterozygous mutation G13229A in exon 5 results in the substitution of aspartic acid with asparagine at codon 184. Asparagine at 184 forms two side chains. The nitrogen of the side chain and main chain bonds with the oxygen of cysteine at 182. The oxygen of the other side chain and main chain bonds with the side-chain nitrogen of arginine 196 (Fig. 3; panel B). The mutation was observed in only one premenopausal patient.

Another novel sporadic heterozygous mutation, A14456G in exon 8, causes an asparagine to aspartic acid substitution at codon 263, which formed two side chains. The side-chain carboxylic acid group bonds with the main-chain nitrogen and oxygen of leucine at 264, and the main-chain oxygen of residue 263 bonds with aspartic acid at position 259 (Fig. 3; panel C). The mutation was observed in one premenopausal patient.

Synonymous mutations

The three novel synonymous mutations observed in this study are all in codons encoding proline.

In one postmenopausal breast cancer patient, sporadic heterozygous mutation C13138T was observed in exon 5 at codon position ‘153.’ The mutation is present in CpG site wherein codon ‘CCC’ was converted to ‘CCT.’ However, not much difference was observed in codon usage value for both the codons (i.e., CCC = 0.32 and CCT = 0.38).

Another sporadic heterozygote mutation G13426A at CpG site in exon 6 codon position ‘222’ was observed in one postmenopausal breast cancer patient. The mutation resulting in ‘CCG’ to ‘CCA’ conversion has created an enhancer site. The codon usage for wild type was 0.08 whereas for mutant it is 0.22.

Mutation A14572G in exon 8 at codon 301 (C-terminal region) was observed in seven breast cancer patients in the heterozygous condition. Two of them had early-onset breast cancer. The mutant codon CCG has a considerably lower codon usage value than the wild-type CCA (0.08 vs. 0.22). The mutation A14572G was found to be significantly associated with breast cancer (χ² = 7.105; P = 0.007).
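The kind of allele-based association test reported above can be sketched in Python as follows. This is a minimal illustration rather than the SPSS or PLINK procedure used in the study: it assumes an uncorrected Pearson chi-square on a 2×2 table of allele counts, complete genotype data for all 182 patients and 186 controls, and a 0.5 continuity adjustment for the odds ratio when a cell is zero. The function name is hypothetical. Under these assumptions the statistic comes out close to, but not identical with, the reported value.

```python
import math


def allelic_association(case_minor, case_total, control_minor, control_total):
    """Pearson chi-square (1 df) and odds ratio on a 2x2 allele-count table.

    case_minor / control_minor: minor allele counts in cases / controls.
    case_total / control_total: total chromosomes (2 x number of individuals).
    """
    a, b = case_minor, case_total - case_minor
    c, d = control_minor, control_total - control_minor
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 df
    # Add 0.5 to every cell when any cell is zero so the odds ratio is finite.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Seven heterozygous patients out of 182 (364 chromosomes), none in 186 controls.
chi2, p_value, odds_ratio = allelic_association(7, 2 * 182, 0, 2 * 186)
minor_allele_freq_cases = 7 / (2 * 182)
print(f"chi2={chi2:.3f}, p={p_value:.4f}, OR={odds_ratio:.1f}, "
      f"MAF in patients={minor_allele_freq_cases:.3f}")
```

Because no control carries the variant, the odds ratio depends entirely on the continuity adjustment and should be interpreted with caution; an exact test is a common alternative for such sparse tables.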

Of the six exonic mutations observed in this study, five are in the DNA-binding domain of the TP53 gene. To evaluate their combined effect, a Chi-square test was done on the assumption that any mutation in the DNA-binding domain may alter gene function. A significant difference (P = 0.003) was observed in the occurrence of these mutations between patients and controls, which indicates that a mutation in the DNA-binding domain of the gene may influence cancer development.

Discussion

Somatic TP53 mutations are the most common genetic alterations in human cancer [5]. Although these mutations are found scattered throughout the gene, the majority of mutations are confined to a 200-amino acid span within the 4 conserved core domains and result in decreased DNA-binding affinity and decreased gene transactivation [21–24]. It is hypothesized that TP53 mutations may precede the development of tumors with fully malignant and invasive phenotypes [25]. Therefore, mutant TP53 has been suggested to be a biomarker predicting the risk for subsequent breast carcinogenesis [26–28].

Germline TP53 mutation also serves as a risk factor for breast carcinoma development as part of the Li–Fraumeni syndrome. Although quite rare, Li–Fraumeni is a dominantly inherited cancer syndrome that manifests itself with a high rate of early-onset breast carcinoma as well as multiple other tumor types [29]. TP53 mutations have been identified in nearly 60 % of families with this disease, suggesting that the loss of TP53 may be a critical parameter in the development of multiple carcinomas. Fibroblasts isolated from patients with Li–Fraumeni syndrome have not been reported to exhibit permanent G1 or G2 cell cycle arrest, suggesting that a loss of TP53 results in the loss of cell cycle checkpoint control, which may be responsible for the increased cellular proliferation. The latest TP53 mutation database of the International Agency for Research on Cancer (IARC) contains 17,689 somatic mutations and 225 germline mutations. Among these, 97 % of TP53 mutations are clustered in the core DNA-binding domain and >75 % of the mutations are missense mutations [30].

In the sequenced region, comprising exons 5–9 and intervening introns 5 and 7–9 and covering the DNA-binding domain and part of the C-terminal domain of the TP53 gene, we found three intronic and six exonic mutations. None of the intronic mutations observed in this study (C14181T, T14201G, and T14766C) showed a significant association with breast cancer, which is in accordance with other studies [19, 20]. In contrast, the two linked SNPs in intron 7 (C14181T, T14201G) were reported elsewhere to be associated with increased risk of invasive breast cancer [31].

Exonic mutations

Out of six coding region mutations in our data, five are in the DNA-binding domain (102–292) and one in C-terminal domain.

Nonsynonymous mutation

Mutation G13203A (rs28934578), which leads to the substitution of arginine by histidine at codon 175, has been reported previously as a hotspot mutation [5, 32]. The resulting structural change may compromise the function of the TP53 protein and contribute to breast cancer, as shown by the association of the minor allele with the disease in various studies [5, 32].

The G13229A mutation results in the substitution of aspartic acid with asparagine at codon 184. This novel mutation, which may be causal to breast cancer, was observed in one patient in this study.

The mutation A14456G (N263D) was observed in one premenopausal patient. Aspartic acid at 263 formed two side chains.

Of the three missense mutations observed in the DNA-binding domain of the TP53 gene, two (G13203A and G13229A) belong to the confirmatory class (the p53 website). The missense mutations were found in five breast cancer patients, four of whom were premenopausal, indicating that mutations in the DNA-binding domain of TP53 may confer a stronger risk of early-onset breast cancer.

Synonymous mutations

Although synonymous sites were initially thought to evolve neutrally, there is increasing evidence of selection on such sites through a variety of mechanisms, such as mRNA stability and codon usage bias. In mammals, many studies have indicated a role for mRNA stability and alternative splicing at synonymous sites [33]. A number of specific models (such as functional loss, translation efficiency, and translational robustness) describe an association between gene expression and evolutionary rates, and some of these models may be useful when interpreting selection at synonymous sites [34].

Three novel synonymous mutations (C13138T, G13426A, and A14572G) were observed in this study. Two of them (C13138T and G13426A) are in the DNA-binding domain but were each found in only one patient. The third (A14572G), in the C-terminal domain, has a minor allele frequency of 1.9 % in affected individuals and shows a strong association with the disease. The probable reason could be related to the codon usage bias of the mutated codon.

Conclusions

This study found five novel germline mutations in the TP53 gene, of which only one may confer significant risk of the disease. Taken together, the germline mutations in the DNA-binding domain of the TP53 gene were associated with an increased risk of breast cancer, and the nonsynonymous mutations in this region may plausibly also play a role in early onset and bad prognosis of the disease. The linked intron 7 mutations reported previously and in this study may modulate risk of the disease in the presence of other risk factors, but we did not find a significant association with the disease in our study. The novel synonymous mutation observed in the C-terminal domain of the gene may confer risk of developing breast cancer. Population-based studies of germline mutations in the DNA-binding domain of this gene may help in the identification of individuals and families at risk of developing cancers.