Introduction

The definition of quantitative trait loci (QTL) for livestock production traits, such as growth performance and milk production and composition, is hoped to facilitate gains in selection efficiency. Meanwhile, variation in either candidate genes for production traits or linked genetic markers has informed the basic biology of milk production and composition, and encouraged the use of genes for marker-assisted selection in livestock [17]. Lactose is the major carbohydrate component and the most important osmolyte in milk. It regulates the osmotic pressure and volume of the milk [26, 27, 29], and its production level in the epithelial cells of the mammary gland potentially affects milk volume and composition. Lactose biosynthesis is carried out by the heterodimeric complex lactose synthase (EC 2.4.1.22) [20, 24], which transfers a galactose moiety to free glucose to produce lactose (galactose β(1,4)-glucose) [22]. The β(1,4)-galactosyltransferase-I enzyme is a trans-Golgi resident, membrane-bound glycoprotein which is widely distributed among mammalian and non-mammalian vertebrates and even occurs in certain plant species [18]. Mammals have recruited this enzyme for the tissue-specific production of lactose in the mammary gland [1]. Considering the important function of the lactose synthase complex, the gene encoding β(1,4)-galactosyltransferase-I, the catalytic component of the complex (the β4galt1 gene, located on chromosome 8 in cattle), can be considered a reasonable candidate gene for milk production in dairy animals. However, in spite of its involvement in various physiological and biochemical reactions, little is known, apart from a recent report [23], about the presence of β4galt1 polymorphisms in dairy animals or about any effects that specific SNPs may have on milk production.
One indirect method for rapidly establishing sequence variation exploits the ability of the single-strand conformation polymorphism (SSCP) assay to detect small differences in sequence between short stretches of DNA. This technique has been applied to identify variation in a number of genes affecting milk production traits in sheep [13], goat [14] and cattle [9]. The aim of this study was to search for sequence variation in the β4galt1 gene using the PCR-SSCP technique followed by sequencing of the polymorphic fragments, completing preliminary work, and to determine whether any of the variants has a detectable effect on milk, lactose and total solid production in cows.

Materials and methods

Animals, milk production records and milk analysis

The animal sample consisted of 1,200 Holstein cows randomly selected from the first-calving cows of the Ghiam Dairy Co. dairy farm (Iran), which had a total of 3,500 cows. The selected cows were sired by 112 different commercial sires (semen sources) and raised under the same nutritional, environmental and management conditions. The animals were kept in free-stall housing, and the calving season was from February to March. Prepartum cows in the transition period (3–4 weeks before parturition) were housed in a separate dry lot. All diets were based on alfalfa, corn silage, and a combination of concentrates including corn, soya meal, and bone meal. Health, fertility and production records were maintained by the dairymen and veterinarians. Official milking records covering 1,200 lactations were used to supply the monthly production of each cow up to the end of its lactation period, and these were converted to an equivalent 305-day milk production. Each 305-day milk production value was therefore based on 9–10 consecutive records. Production records of the first and second lactations were considered in this study. For milk analysis, a milk sample was taken from each animal once per month throughout the lactation period, with the first sample taken at least 15 days after parturition to exclude the risk of contamination with colostrum. As the animals were milked three times daily at constant intervals, the influence of milking time on milk composition was avoided by combining 10 ml from each milking session within a day. The milk constituents (lactose, protein, fat and total solids (TS)) were determined with an automatic milk analyzer (Milko Scan 4000, Foss Company, USA).
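The text does not state how the monthly test-day records were converted to a 305-day equivalent. As an illustration only, the sketch below uses a simplified trapezoidal (test-interval style) scheme. This is an assumption and not necessarily the procedure behind the official records; the function name `yield_305` and the example test days and yields are hypothetical.

```python
def yield_305(test_days, daily_yields, horizon=305):
    """Approximate cumulative yield (kg) from calving to `horizon` days in milk.

    test_days    : days in milk at each monthly test, ascending, all <= horizon
    daily_yields : milk yield (kg/day) measured at each test
    Assumption (not from the source): yield is flat before the first test and
    after the last test, and changes linearly between consecutive tests.
    """
    if not test_days or len(test_days) != len(daily_yields):
        raise ValueError("need exactly one yield per test day")
    # Calving to first test: carry the first measurement backwards.
    total = test_days[0] * daily_yields[0]
    pairs = list(zip(test_days, daily_yields))
    # Between tests: trapezoid rule on consecutive measurements.
    for (d0, y0), (d1, y1) in zip(pairs, pairs[1:]):
        total += (d1 - d0) * (y0 + y1) / 2
    # Last test to the horizon: carry the last measurement forwards.
    total += (horizon - test_days[-1]) * daily_yields[-1]
    return total


# Hypothetical cow: ten monthly tests starting 15 days after calving.
example_days = list(range(15, 286, 30))
example_yields = [30.0] * len(example_days)
example_total = yield_305(example_days, example_yields)
```

With ten tests at a constant 30 kg/day, the sketch simply returns 305 × 30 kg, which is a quick sanity check on the interval bookkeeping.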

Blood sampling and DNA extraction

Peripheral whole blood was collected from jugular veins into tubes containing citrate as an anticoagulant. High molecular weight DNA was extracted by a modified salting-out method [5].

PCR amplification

Primer pairs (online Supplementary Table S1) targeting various coding regions of the β4galt1 gene were based on the reference GenBank sequence NW_001495470 (bases 102361–155645) and were designed using Vector NTI software v10.1 (Invitrogen). The target amplicon size was 150–300 bp, as this is the optimal size range for the SSCP assay [7]. A range of annealing temperatures (52–62°C) and MgCl2 concentrations (1–2 mM) was tested to optimize the PCR. Each 25 μl reaction contained template DNA (50 ng), primers (16 pmol each), dNTPs (0.2 mM), 1× buffer and 1 U Taq polymerase (Roche Diagnostics GmbH). The DNA was denatured at 95°C for 5 min; reactions were then cycled 30 times through 95°C/30 s, the annealing temperature (online Supplementary Table S1)/30 s and 72°C/30 s, and finally incubated at 72°C for 5 min. All PCR products were electrophoresed at 150 V for 40 min through a 2% agarose gel containing 1× TBE buffer and 0.14 mg/ml ethidium bromide to check that amplification had been successful.
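The reaction composition and cycling profile imply a final primer concentration and a minimum run time that are easy to check. The sketch below does only this arithmetic; the function names are illustrative, and ramping time between temperature steps is ignored.

```python
def primer_concentration_um(primer_pmol, reaction_volume_ul):
    """Final primer concentration in µM (1 pmol/µl is 1 µM)."""
    return primer_pmol / reaction_volume_ul


def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Total hold time of a PCR program, excluding temperature ramps."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Values taken from the protocol: 16 pmol of each primer in 25 µl,
# 5 min initial denaturation, 30 cycles of 3 x 30 s, 5 min final extension.
primer_um = primer_concentration_um(16, 25)
run_seconds = program_seconds(5 * 60, 30, (30, 30, 30), 5 * 60)
run_minutes = run_seconds / 60
```

Under these values each primer is at 0.64 µM and the program holds for 55 minutes in total.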

SSCP and sequencing

For the SSCP procedure, 4–5 μl of each PCR product was mixed with 3 volumes of denaturant solution (95% v/v formamide, 10 mM NaOH, 0.05% w/v xylene cyanol and 0.05% w/v bromophenol blue), heated to 95°C for 5 min, chilled on ice, and loaded onto nondenaturing polyacrylamide gels (20 cm × 20 cm; T%: 8–12%; C%: 2.5%) containing variable concentrations of TBE (0.5× and 1×) and glycerol (0, 0.05% and 0.1% v/v), as described in online Supplementary Table S1. Electrophoresis was carried out for between four and nine hours at 25, 35 or 45 W constant power, with the temperature held at 8, 13, 15 or 20°C (online Supplementary Table S1), on a DCode™ Universal Mutation Detection System (Bio-Rad, Hercules, CA, USA). Signal detection was by silver staining (PlusOne™ DNA Silver Staining method, Amersham Bioscience, Uppsala, Sweden). Two DNA samples of each template displaying polymorphism were re-amplified and separated by agarose gel electrophoresis, and the amplicons were excised from the gel and purified using a PCR Purification Kit (Qiagen, Germany). The purified PCR products were sequenced in both directions by the Sanger method (Macrogen, Korea) using the appropriate PCR primers. Both the PCR and the sequencing were repeated once.
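The gel composition is given as T% (total monomer, acrylamide plus bisacrylamide, in g per 100 ml) and C% (the bisacrylamide share of total monomer by weight). A minimal sketch of the conversion to weighed amounts follows; the 30 ml gel volume is a hypothetical value chosen for illustration and does not come from the protocol.

```python
def gel_monomers(t_percent, c_percent, volume_ml):
    """Return (acrylamide_g, bisacrylamide_g) for a polyacrylamide gel.

    T% = 100 * (acrylamide + bis) [g] / volume [ml]
    C% = 100 * bis [g] / (acrylamide + bis) [g]
    """
    if not (0 < t_percent < 100 and 0 <= c_percent <= 100 and volume_ml > 0):
        raise ValueError("percentages or volume out of range")
    total_g = t_percent / 100 * volume_ml
    bis_g = c_percent / 100 * total_g
    return total_g - bis_g, bis_g


# Lower end of the stated range (T = 8%, C = 2.5%) in a hypothetical 30 ml gel.
acrylamide_g, bis_g = gel_monomers(8, 2.5, 30)
```

For this example the total monomer is 2.4 g, of which 0.06 g is bisacrylamide and 2.34 g is acrylamide.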

Statistical analysis

The data file contained 1,200 different animals (the offspring of 112 different sires and 1,200 different dams), the milk production trait records of the first and second lactations, and the β4galt1 genotype of each animal. The model $y_{ijk} = \mu + g_i + s_j + (gs)_{ij} + e_{ijk}$ was used, in which $y_{ijk}$ was the record of the trait (milk, lactose, fat, protein or total solid production per 305 days) of cow $k$ with genotype $i$ sired by sire $j$, $\mu$ was the population mean of the trait, $g_i$ was the effect of genotype $i$, $s_j$ was the effect of sire $j$, $(gs)_{ij}$ was the effect of the interaction between genotype $i$ and sire $j$, and $e_{ijk}$ was the residual effect (error). The effect of age at first calving was not included in the model because the ages at first calving were almost the same, deviating by about one month.
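To make the terms of the model concrete, the sketch below estimates μ, the genotype effects, the sire effects and the interaction from a tiny balanced data set using unweighted cell means with sum-to-zero constraints. It is a didactic stand-in, not the SPSS procedure used in the study: it assumes every genotype × sire cell is filled, which would not hold for 112 sires, and the function name and toy records are hypothetical.

```python
from statistics import mean


def two_way_effects(records):
    """Estimate mu, g_i, s_j and (gs)_ij from (genotype, sire, y) records.

    Uses unweighted cell means with sum-to-zero constraints, so every
    genotype x sire combination must contain at least one record.
    """
    cells = {}
    for genotype, sire, y in records:
        cells.setdefault((genotype, sire), []).append(y)
    genotypes = sorted({g for g, _ in cells})
    sires = sorted({s for _, s in cells})
    missing = [(g, s) for g in genotypes for s in sires if (g, s) not in cells]
    if missing:
        raise ValueError(f"empty genotype x sire cells: {missing}")
    cell_mean = {key: mean(values) for key, values in cells.items()}
    mu = mean(cell_mean[g, s] for g in genotypes for s in sires)
    g_eff = {g: mean(cell_mean[g, s] for s in sires) - mu for g in genotypes}
    s_eff = {s: mean(cell_mean[g, s] for g in genotypes) - mu for s in sires}
    gs_eff = {(g, s): cell_mean[g, s] - mu - g_eff[g] - s_eff[s]
              for g in genotypes for s in sires}
    return mu, g_eff, s_eff, gs_eff


# Hypothetical toy records: (genotype, sire, trait value in arbitrary units).
toy_records = [
    ("G1", "S1", 106), ("G1", "S1", 108),
    ("G1", "S2", 102), ("G1", "S2", 104),
    ("G2", "S1", 96), ("G2", "S1", 98),
    ("G2", "S2", 92), ("G2", "S2", 94),
]
mu, g_eff, s_eff, gs_eff = two_way_effects(toy_records)
```

The toy data were built to be purely additive, so the estimated interaction terms are zero; real data with unequal or empty cells call for a least-squares or mixed-model fit instead.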

All identified genotypes were considered in the statistical analysis except genotypes G14, G16, G17 and G18 (online Supplementary Table S2), which were observed in only one or two cows. For each of the milk production traits (milk, lactose, protein, fat and total solid production), 1,180–1,200 valid records from the 1,200 studied animals were included in the statistical analysis for each lactation. Statistical analysis was performed using the General Linear Model procedure of the SPSS 10 statistical software. Variance components were estimated by the restricted maximum likelihood method, yielding the phenotypic, genetic, environmental and sire-related variances.

The Hardy–Weinberg equilibrium of the total population for the β4galt1 genotypes was tested by χ² analysis (Statistica software, StatSoft, Inc., Tulsa, OK, USA).
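For readers unfamiliar with the test, the sketch below computes the Hardy–Weinberg χ² statistic for a single biallelic locus from observed genotype counts. It is a simplified illustration with hypothetical counts; the study's own test was run in Statistica across the observed genotypes and is not reproduced here.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions at one biallelic locus.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote. With two alleles the test has 1 degree of freedom, so values
    above 3.841 reject equilibrium at the 5% level.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1 - p                          # frequency of allele b
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts: the first set matches expectation exactly,
# the second has a deficit of heterozygotes.
chi2_balanced = hwe_chi_square(25, 50, 25)
chi2_deficit = hwe_chi_square(30, 40, 30)
```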

Results

The optimized SSCP-PCR conditions for each of the seven β4galt1 fragments are summarized in online Supplementary Table S1. The SSCP analysis of exons I to VI and flanking regions of the β4galt1 gene showed the following conformation patterns for the amplified fragments (Fig. 1): two patterns (A and B) in the fragment containing exon I and its 5′ upstream region and in the fragment containing exon VI and its 3′ flanking sequence, and three patterns (A, B and C) in exons II, III and V. Exon IV was monomorphic. Sequencing of the polymorphic amplicons identified nine polymorphic nucleotide sites (T224A, T29554C, G29693C, G41860A, T42030A, G52294A, G52295A, A52320C and G52906A), of which five were transitions and four were transversions (online Supplementary Table S3). The sequences of the most frequent patterns (A) corresponded to the GenBank entry NW_001495470. In aggregate, sixteen alleles were identified (online Supplementary Table S3).
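The transition/transversion split can be checked mechanically from the substitution labels: a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The short sketch below applies this rule to the nine sites listed above; the helper name is illustrative.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Labels as reported: reference base, position, alternative base.
SNPS = ["T224A", "T29554C", "G29693C", "G41860A", "T42030A",
        "G52294A", "G52295A", "A52320C", "G52906A"]

classes = {snp: classify_substitution(snp[0], snp[-1]) for snp in SNPS}
n_transitions = sum(kind == "transition" for kind in classes.values())
n_transversions = len(SNPS) - n_transitions
```

Applied to these labels, the rule gives five transitions (T29554C, G41860A, G52294A, G52295A, G52906A) and four transversions (T224A, G29693C, T42030A, A52320C), in agreement with the counts stated in the text.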

Fig. 1

SSCP patterns of the 5′ flanking region + exon I, exon II, exon III, exon IV, exon V, and exon VI + 3′ flanking region of the Iranian Holstein β4galt1 gene, separated by native PAGE under non-denaturing conditions. Exons are represented by black boxes, and 5′ and 3′ flanking regions by white boxes. Nucleotide polymorphisms obtained from fragment sequencing are shown. The reference sequence corresponds to the A patterns.

In exon I, the T224A transversion altered the second ATG codon in the exon to AAG, predicting a Met14Lys substitution in the encoded enzyme. The two substitutions in exon II were a T29554C transition and a G29693C transversion, predicting a Met174Thr and a Gln220His change, respectively. In exon III, a silent G41860A transition in a Glu codon and a T42030A transversion (Phe280Tyr) were observed (online Supplementary Table S3). In exon V, two heterozygous G → A substitutions were found in adjacent positions of the same codon. Depending on how these mutations are distributed between the two chromosomes, this polymorphism reflects either a GGG/GAA or a GAG/GGA heterozygote. Each involves a change from Gly to Glu at position 340. A silent A52320C transversion (Arg349) in exon V and a silent G52906A transition (Pro389) in exon VI were also found. The SSCP results obtained here for 1,200 cows are identical to those recently published for 400 animals [23].
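The codon-level consequences described above can be verified against the standard genetic code. The sketch below uses only the six codons involved (a deliberately partial table) and shows that both possible phasings of the exon V heterozygote carry one Gly allele and one Glu allele; the helper name is illustrative.

```python
# Partial standard genetic code: only the codons discussed in the text.
CODON_TO_AA = {
    "ATG": "Met", "AAG": "Lys",
    "GGG": "Gly", "GGA": "Gly",
    "GAG": "Glu", "GAA": "Glu",
}


def residues(codon_pair):
    """Translate the codon carried by each of the two chromosomes."""
    return tuple(CODON_TO_AA[codon] for codon in codon_pair)


# Exon I: the second ATG codon changed to AAG.
exon1_change = residues(("ATG", "AAG"))

# Exon V: two phasings of the adjacent G -> A substitutions in codon 340.
phase_cis = residues(("GGG", "GAA"))     # both A alleles on one chromosome
phase_trans = residues(("GAG", "GGA"))   # one A allele on each chromosome
```

In the second phasing, GGA is a synonymous Gly codon, so the amino acid change comes entirely from the GAG allele; either way the animal is a Gly/Glu heterozygote at residue 340.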

Altogether, 18 different β4galt1 genotypes were uncovered. Each was denoted by a comma-separated string giving the genotype at each locus. For example, an animal with genotype (1/1) at loci 1, 2, 4, 6 and 7, and (1/2) at loci 3, 5 and 8 has a genotype denominated (1/1,1/1,1/1,1/1,1/2,1/1,1/1,1/2) (G13, online Supplementary Table S2). The mean production traits of each of the identified genotypes are reported for the first and second lactations in online Supplementary Tables S4 and S5, respectively. Milk, lactose, protein and total solid production in the first and second lactations differed significantly among cows carrying different genotypes of the β4galt1 gene (P < 0.001). The results of the variance component analysis, that is, the fractions of the phenotypic differences among the studied cows that are due to differences in their β4galt1 genotypes, sires and environment, are summarized in Table 1. These results showed that the major component of the phenotypic variance of milk, lactose, protein and total solid production in both the first and second lactations was due to the β4galt1 genotype, whereas the major component of the phenotypic variance of fat production was not.
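The fractions reported in the variance component analysis are ratios of each component to the phenotypic variance. The sketch below shows the arithmetic under the assumption that the phenotypic variance is the sum of the genotype, sire, genotype-sire and environmental components; the numerical values are hypothetical and are not taken from the study's results.

```python
def variance_fractions(var_genotype, var_sire, var_interaction, var_environment):
    """Express each variance component as a fraction of phenotypic variance.

    Assumes the phenotypic variance is the sum of the four components.
    """
    components = {
        "genotype": var_genotype,
        "sire": var_sire,
        "genotype_x_sire": var_interaction,
        "environment": var_environment,
    }
    if any(value < 0 for value in components.values()):
        raise ValueError("variance components must be non-negative")
    var_phenotypic = sum(components.values())
    if var_phenotypic == 0:
        raise ValueError("phenotypic variance is zero")
    return {name: value / var_phenotypic for name, value in components.items()}


# Hypothetical components in arbitrary squared units.
fractions = variance_fractions(6.0, 1.0, 1.0, 2.0)
largest_component = max(fractions, key=fractions.get)
```

In this invented example the genotype component accounts for 0.6 of the phenotypic variance, which is the sense in which a component is described as the "major" one.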

Table 1 Results of variance component analysis reporting the fractions of the phenotypic differences among the studied cows that are due to differences in their β4galt1 genotypes ($\sigma_G^2/\sigma_P^2$), sires ($\sigma_S^2/\sigma_P^2$), genotype-sire interaction ($\sigma_{G,S}^2/\sigma_P^2$) and environment ($\sigma_E^2/\sigma_P^2$)

Cows with genotypes G4, G5 and G10 produced significantly less milk, lactose, protein and total solids in the first and second lactations (P < 0.002); these genotypes were not associated with fat production or with the relative fat, lactose or protein content of the milk. Cows with genotypes G2 and G7 produced significantly more milk, lactose, protein and total solids (P < 0.002), and cows with genotype G11 produced more milk and lactose, but these genotypes did not significantly affect fat production (online Supplementary Tables S4 and S5).

It appears that a predicted Lys14 (genotype G10) negatively affects milk yield and lactose, protein and total solid production (P < 0.001). A predicted His220 (genotypes G7, G2, G12) and a predicted Thr174 (genotype G5) affected milk production traits (milk, lactose, protein and TS production) positively and negatively, respectively (P < 0.005). Carriers of the silent G → A substitution at the wobble position of the Glu223 codon in exon III (genotype G11) had significantly higher milk, lactose, protein and total solid production than non-carriers (P < 0.001). A predicted Glu340 (genotypes G4, G9) had a negative effect on milk yield and on lactose, protein and total solid production (P < 0.002).

The total population was in Hardy–Weinberg equilibrium (χ² = 0.954; df = 8; P > 0.05).

Discussion

The most frequent genotype in this sample of Holstein cows was G1, which completely matches the Hereford breed β4galt1 gene sequence deposited in the GenBank database. Thus, the sequence of this gene appears to be well conserved.

Variance component analysis by restricted maximum likelihood (Table 1) showed that the β4galt1 genotype was the major source of the differences in milk, lactose, protein and total solid production among the studied cows.

In the mouse, two distinct isoforms (“short” and “long”, which differ by an N-terminal sequence of 13 residues) of the β4galt1 gene product have been reported [25]. The transcription start sites which are responsible for these transcripts have been defined [24, 25], and similar variants have been identified in the bovine [21] and human [15, 16] homologues. The amount of β4GALT1 enzyme in the lactating mammary gland increases during the lactation period to meet the demand for lactose synthesis. The most important mechanism by which this increase is ensured is the switch from the long to the short variant, which has the effect of raising the level of β4galt1 transcript [25]. Thus, allelic variation which affects the critical ATG start codon can be expected to disturb this mechanism and prevent the synthesis of higher levels of lactose. It is likely that the negative impact on milk production traits of the exon I T224A transversion (allele 2, predicting a Met14Lys substitution; genotype G10) reported in this study can be explained by this mechanism.

The bovine β4GALT1 protein consists of four domains: the cytoplasmic domain (residues 8–24), the transmembrane domain (residues 25–44), the stem region (residues 45–145) and the catalytic domain (residues 146–402) [19]. The polymorphisms in exon II (allele 2 at each locus, predicting the Met174Thr substitution in genotypes G5 and G8 and the Gln220His substitution in genotypes G2, G7, G12, G14 and G16) are located in the catalytic domain, and their effect on milk production traits may be explained by a change in some of the catalytic properties of the enzyme. Confirmation of this hypothesis clearly requires experimental biochemical evidence.

The Phe280 residue, together with Tyr286, Gln288, Tyr289, Phe360 and Ile363, is involved in the interactions between β4GALT1 and α-lactalbumin that form the lactose synthase enzyme [19]. Therefore, the T42030A transversion affecting Phe280 (allele 2, predicting a Phe280Tyr substitution; genotype G3, for instance) may alter this interaction and thereby the properties of the lactose synthase complex. Cows with genotype G3 had higher milk and total solid production in the first and second lactations; however, the differences were not significant (P = 0.32 and 0.21 for the first and second lactations, respectively).

As the polymorphisms were identified by sequencing purified PCR fragments, it was not possible to determine whether the SNP pair in exon V is a GGG/GAA or a GAG/GGA heterozygote. Each involves a Gly340Glu change (genotypes G4, G9), and this may well alter the structural and catalytic properties of the enzyme, since Glu carries an ionisable side chain with a pKa of 4.1. The difference in milk and lactose productivity between the G4 and the G1 genotypes, and the negative effect of the former on milk, lactose, protein and TS yields, could therefore be related to changes in the activity or other characteristics of β4GALT1 induced by this amino acid substitution. Two productive cows carried a non-synonymous G29693C substitution predicting a Gln220His change together with a synonymous A52320C substitution in the Arg349 codon. This genotype (G14) was not included in the statistical analysis because of its low frequency. The high production of milk and lactose in the cows with this genotype may be related to the predicted Gln220His replacement, because this allele showed significant positive effects on milk production traits in cows with genotype G2, whereas the synonymous substitution in the Arg349 codon showed no significant effect on milk production traits (genotype G15).

Despite being synonymous, the G41860A transition in exon III (Glu223 codon) appears to significantly increase the milk, lactose, protein and total solid production of animals carrying genotype G11. Although synonymous changes leave the final protein unchanged, they can affect the phenotype as a consequence of post-transcriptional events [2, 8, 28, 33].

The above discussion outlines the potential molecular basis by which the uncovered SNPs could affect milk production traits, but we cannot conclude that the differences in milk production traits among cows carrying different β4galt1 genotypes are exclusively related to the sequence polymorphism of the gene. Such a conclusion requires further studies and observations; nevertheless, this study found strong evidence of an association between the β4galt1 gene and milk production traits.

The general consensus has been that lactose acts primarily as an osmolyte in milk, so that the effect of increasing the quantity of lactose produced is to draw more water into the milk. Thus, the higher the synthesis of lactose, the greater the volume of milk. This process leaves the total amount of other constituents, such as protein and solids, unchanged, so although milk yield is increased, the concentrations of total solids and protein are decreased. However, in this study, cows which produced significantly more lactose not only produced a greater volume of milk, but also yielded more protein and total solids. It has been reported that the level of lactose synthesis influences milk production traits (milk volume and protein, total solid and SNF production) in other ways than merely via its osmolytic action [23].

We therefore suggest that the β4galt1 gene, in addition to determining the level of lactose synthesis, also directly or indirectly affects the synthesis of other milk constituents, although the mechanism by which this operates is unclear.

Although SSCP does not identify every sequence change, it was nevertheless able to detect variants in most of the exons of the β4galt1 gene. Our results therefore confirm that the technique is an efficient indirect screening method for sequence variation, as has been suggested by others for cattle [4, 6, 9–11, 31], sheep [3, 13], goat [12, 14, 32] and pig [30] sequences.

Conclusions

We have reported here several single nucleotide polymorphisms and genotypes of the bovine β4galt1 gene, together with evidence of an association between these genotypes and milk production traits including milk, lactose, protein and total solids production. The identified SNPs lend themselves readily to further evaluation for use in marker-assisted selection in dairy breeding. We are currently studying the generality of these findings across other dairy species.