Introduction

Phytic acid is an anti-nutritional compound found in the seeds of cereals and legumes. It chelates essential minerals such as iron, zinc, magnesium, calcium and potassium as it passes through the digestive system of monogastric consumers such as humans and pigs. There is considerable evidence that dietary phytate reduces the bioavailability of dietary minerals in humans: substituting low phytic acid (LPA) grain in a maize-based diet is associated with a substantial increase in the absorption of zinc (Adams et al. 2002; Hambidge et al. 2004), iron (Mendoza et al. 1998, 2001; Hurrell et al. 2003), calcium (Hambidge et al. 2005) and magnesium (Bohn et al. 2004). In animal feeding studies using low phytate maize, the amount of bioavailable phosphorus increased two- to fivefold (Douglas et al. 2000; Li et al. 2000; Spencer et al. 2000a, b; Yan et al. 2000, 2003; Veum et al. 2001; Peter and Baker 2002), which helps reduce unwanted phosphorus in livestock manure (Ertl et al. 1998; Veum et al. 2001).

Temperate maize lines that contain lower levels of phytic acid have been developed, but these are not adapted to local tropical and subtropical conditions. There is a need, therefore, to introgress the lpa1-1 gene into locally adapted lines to improve their nutritional benefit. The low phytic acid (lpa1-1) mutation, which shows a 66% reduction in phytic acid phosphorus, was generated by ethylmethane sulphonate (EMS)-induced pollen mutagenesis (Raboy 2000). The lpa1-1 mutant was first developed by Raboy et al. (2000) using a synthetic population (early ACR) as the seed parent and mutagenised pollen from inbred lines A632 and Mo17. The lpa1-1 gene has been mapped to the distal region of chromosome 1 and is reported to be due to a single recessive mutation (Raboy et al. 2000). Shi et al. (2007) sequenced the lpa1-1 gene (5,149 bp) and reported that the lpa1-1 phenotype is probably due to an amino acid change from alanine to valine within the myo-inositol-1-phosphate synthase. The co-dominant lpa1-1 single nucleotide polymorphism (SNP) marker can distinguish homozygous dominant, homozygous recessive and heterozygous genotypes using PCR and high-resolution melt (HRM) analysis. The primers for the SNP marker were designed from the gene sequence published by Shi et al. (2007). The marker was validated by forward and reverse sequencing of parental PCR products (150 bp), which revealed a single base change from C to T and confirmed the amino acid change (Naidoo 2010).

Marker-assisted selection (MAS), which entails the use of molecular markers in the selection process of plant breeding programmes, has been shown to be cost-effective (Abalo et al. 2009) and is most useful for the introgression of recessive traits. SNP markers are biallelic, abundant and evenly distributed in genomes, have relatively low mutation rates and are relatively easy to detect. Although developing SNP detection assays is costly (Batley et al. 2003; Kim and Misra 2007), advances in technology leading to specific low-density SNP chips and other affordable technologies, such as HRM, mean that SNPs are increasingly used in plant studies (Gupta et al. 2001; Rafalski 2002; Barker and Edwards 2009). HRM is the most inexpensive, simple and rapid of these technologies for detecting SNPs (Gundry et al. 2003; Wittwer et al. 2003).

In HRM analysis, the polymerase chain reaction (PCR) amplification and melt analysis can be performed in one closed-tube reaction (Montgomery et al. 2007) with no post-PCR analyses required (Ririe et al. 1997; Montgomery et al. 2007). The PCR products are differentiated from each other by their melting curve profiles (Ririe et al. 1997), which allows heterozygous and homozygous genotypes to be identified (Gundry et al. 2003; Wittwer et al. 2003; Reed and Wittwer 2004; Montgomery et al. 2007). The shape and position of the melting curves depend on the GC/AT ratio, length and sequence of the PCR product and can be used to differentiate PCR products (Ririe et al. 1997).

The main aim of marker-assisted backcrossing (MABC) is to introgress a trait of interest into the genetic background of a recipient genotype by recurrent backcrossing, while recovering the recurrent parent genome as rapidly and completely as possible. In each backcross generation there is variation around the mean of a trait, which can be exploited to gain an advantage in each generation. Amplified fragment length polymorphism (AFLP) is a universal, multi-locus marker technique that can be applied to genomes of any source or complexity and shows high levels of polymorphism, effective multiplex ratio, marker index and genotype index. A number of studies have used AFLPs in backcross breeding programmes to determine the percentage of recurrent parent genome, including studies on cotton (Zhong et al. 2002), rice (Chen et al. 2000, 2001; Zhou et al. 2003; Gopalkrishnan et al. 2008), soybean (VanToai et al. 1996) and beans (Muñoz et al. 2004).

A backcross breeding programme was developed at the African Centre for Crop Improvement (ACCI) in South Africa to introgress the lpa1-1 gene into a wild-type tropical and subtropical adapted maize inbred line using molecular markers for both foreground and background selection. The objectives of this study were to use the lpa1-1 SNP marker for foreground selection, to detect homozygous recessive and heterozygous individuals in the BC2F1 population using HRM analysis, and to use AFLP markers for background selection, to determine the amount of recurrent parent (P 16) genome present in each BC2F1 line.

Materials and methods

Germplasm

Two inbred maize lines were used in the backcross breeding programme to introgress the lpa1-1 gene into tropical and subtropical adapted germplasm. The normal (wild-type) tropical locally adapted line P 16 and the temperate lpa1-1 source CM 32 were used in this study. The normal line was crossed with the LPA line to produce the F1 generation. The F1 was backcrossed to the recurrent parent (P 16) to produce the BC1F1 generation. The BC1F1 generation was planted in pots in the greenhouse and backcrossed to the recurrent parent, with no selection for the lpa1-1 gene, to generate the BC2F1 generation.

Crude DNA extraction

Two hundred and fifty BC2F1 progeny were grown in pots in a greenhouse and their leaves were sampled for crude DNA extraction. The crude DNA extraction method of Edwards et al. (1991) was used to extract DNA for the lpa1-1 SNP marker screening. Leaf samples for PCR analysis were collected by using the lid of a sterile 1.5 ml microcentrifuge tube to punch out a disc of leaf material into the tube. The tissue was macerated with extraction buffer (200 mM Tris–HCl pH 7.5, 250 mM NaCl, 25 mM EDTA, 0.5% (w/v) SDS) and vortexed for a few seconds. The sample was heated for 10 min at 65°C and centrifuged for 2 min at 12,000 × g. The supernatant was removed, and an equal volume of ice-cold isopropanol was added and mixed before incubation at −20°C for 30 min. The sample was centrifuged for 5 min at 4°C at 12,000 × g and the pellet was dried before being re-suspended overnight in 20 μl of TE buffer (100 mM Tris–HCl, 1 mM EDTA, pH 8).

PCR and HRM analysis

The Rotor-Gene 6000 real-time rotary analyser (Corbett Research, Australia) was used for the PCR amplification and HRM analysis. The PCR amplifications were performed in 20 μl reaction volumes consisting of approximately 15 ng of genomic DNA template, SensiMixdT for the PCR reaction components (Celtic Diagnostics, Cape Town), 1× SYBR Green I dye and 200 nM each of the forward and reverse PCR primers. The sequences of the forward and reverse primers for the SNP marker are listed in Table 1. The PCR amplification was initiated with a 10 min hold at 95°C as an initial denaturation step, followed by 40 cycles of 95°C for 10 s, 55°C for 15 s and 72°C for 20 s. A negative control with no DNA was included in each set of PCR reactions to check that the PCR reagents were free of contamination. The HRM analysis was performed automatically after the final extension step of the PCR and was programmed to ramp the temperature from 72 to 95°C in increments of 0.1°C per step. The Rotor-Gene 6000 real-time rotary analyser created the melting curves, and Rotor-Gene 6000 software version 1.7 (Corbett Research, Australia) was used to discriminate genotypes by difference plots.
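How a difference plot is derived from melt curves can be sketched as follows. This is a minimal illustration and not the algorithm of the Rotor-Gene software: the fluorescence readings are invented, and the function names `normalise` and `difference_curve` are hypothetical. Each curve is scaled to 0–100% and the curve of the reference genotype is subtracted, so that the reference becomes a flat baseline and other genotypes deviate from it.

```python
def normalise(fluorescence):
    """Scale a melt curve so that its maximum is 100 and its minimum is 0."""
    lo, hi = min(fluorescence), max(fluorescence)
    if hi == lo:
        raise ValueError("a flat curve cannot be normalised")
    return [100.0 * (f - lo) / (hi - lo) for f in fluorescence]


def difference_curve(sample, reference):
    """Subtract the normalised reference curve from the normalised sample."""
    if len(sample) != len(reference):
        raise ValueError("curves must share the same temperature points")
    return [s - r for s, r in zip(normalise(sample), normalise(reference))]


# Hypothetical raw fluorescence readings taken at the same temperature steps.
reference = [90.0, 88.0, 80.0, 50.0, 15.0, 10.0]   # genotype chosen as baseline
same_type = [45.0, 44.0, 40.0, 25.0, 7.5, 5.0]     # same shape, half the signal
other_type = [90.0, 85.0, 70.0, 40.0, 14.0, 10.0]  # melts at a lower temperature

flat = difference_curve(same_type, reference)      # stays on the baseline
shifted = difference_curve(other_type, reference)  # dips below the baseline
```

Normalising before subtracting removes differences in signal strength between tubes, so that only differences in curve shape remain.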

Table 1 Sequences of adapters and primers used for the AFLP process and SNP analysis

Classification of BC2 lines

The samples were classified as heterozygous or homozygous recessive genotypes after analysis with the lpa1-1 SNP marker, based on the R² values from the difference plots. The parental line P 16 was used as the wild-type standard genotype, CM 32 as the LPA recessive standard genotype and the F1 cross between the two parental lines as the heterozygote genotype. The R² threshold was set at 80%. The R² values provide an estimate of the confidence of the marker assay. Heterozygous and homozygous recessive lines with R² ≥ 80% were selected for fingerprinting with AFLPs to determine the amount of recurrent parent genome present. Their leaf material was sampled and ground in liquid nitrogen for high-quality genomic DNA extraction using the Wizard genomic DNA purification kit (Promega, Whitehead Scientific, Cape Town, South Africa), as higher quality DNA is required for AFLP analysis. The DNA was screened on a 1% (w/v) agarose gel in 1× TAE buffer (40 mM Tris–acetate, 1 mM EDTA, pH 8.0) and visualized under UV light after staining with ethidium bromide (0.5 µg ml−1).
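The selection rule described above can be sketched as a simple filter. The plant IDs and confidence values below are invented for illustration, and the function name `select_for_fingerprinting` is hypothetical; only the 80% threshold and the two carrier genotype classes come from the text.

```python
THRESHOLD = 80.0  # confidence cut-off (%) stated in the text


def select_for_fingerprinting(calls, threshold=THRESHOLD):
    """Return IDs of plants that carry lpa1-1 and have a confident call."""
    carriers = {"heterozygous", "homozygous recessive"}
    return [
        plant_id
        for plant_id, genotype, confidence in calls
        if genotype in carriers and confidence >= threshold
    ]


# Hypothetical calls: (plant ID, genotype, confidence %).
calls = [
    ("A", "heterozygous", 95.2),
    ("B", "homozygous recessive", 82.8),
    ("C", "wild-type", 99.0),       # confident, but does not carry lpa1-1
    ("D", "heterozygous", 71.4),    # carrier, but below the threshold
]
selected = select_for_fingerprinting(calls)
```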

AFLP process

The three main steps in the AFLP process included template preparation (restriction digestion of genomic DNA and ligation of oligonucleotide adaptors), fragment amplification (preselective and selective amplification) and identification of amplified products by capillary electrophoresis. The AFLP reactions were performed using the AFLP plant kit modules according to the manufacturer’s recommendations (Applied Biosystems, South Africa).

Genomic DNA was double-digested by two restriction enzymes, EcoRI (rare cutter) and MseI (frequent cutter). Double-stranded oligonucleotide adapters, with overhangs compatible with either 5′- or 3′-end generated fragments produced during the restriction digestion, were ligated to the DNA fragments. The ligated DNA fragments were pre-amplified by PCR using primers complementary to the adapter and restriction site sequence with only one additional selective nucleotide at their 3′ end (Eco-A, Mse-C). Ten microlitres of the pre-amplification reaction was screened by electrophoresis on a 1.5% (w/v) agarose gel in 1× TAE buffer (40 mM Tris–acetate, 1 mM EDTA, pH 8.0) and visualized under UV light after staining with ethidium bromide (0.5 µg ml−1), to ensure quality and quantity of pre-amplification products. Selective amplification was performed with the use of selective primers with three nucleotides for EcoRI- and MseI-specific primers according to the cycling conditions of the manufacturer. The sequences of the AFLP primers and SNP primers are listed in Table 1. The amplified fragments of the parental profiles were initially screened on the Rotor-Gene 6000. The 10 primer combinations showing the greatest differences between the parents were used for fingerprinting the BC lines. The pre-amplification and amplification PCR reactions for the BC2 lines and parents were performed in tubes and 96-well microtitre plates (Applied Biosystems, South Africa) on GeneAmp PCR system 2700 (Applied Biosystems) according to the manufacturer’s instructions.

The amplified samples were sent to the Central Analytical Facility at the University of Stellenbosch (Stellenbosch, Cape Town, South Africa) for analysis of products on the 3100 DNA sequencer. A ROX 500 (Applied Biosystems) size standard was included in each sample for sizing of fragments amplified. The F filter set was used with a 36-cm capillary for the screening of the amplified products on the 3100 DNA sequencer.

Data analysis

Only eight primer combinations were selected for scoring, owing to incomplete data and poor amplification of the other two primer combinations. Run files for the eight selected EcoRI/MseI primer combinations were viewed using Peak Scanner software v1.0 (Applied Biosystems, freeware, http://www.appliedbiosystems.com) and raw data tables of the product sizes were exported to Microsoft Excel for manual scoring. The parental amplification reactions were repeated in each sequencer run and only consistent bands were selected for scoring the individual BC2 lines. Bands were scored manually and conservatively in the size range of 100–500 bp.

Similarity matrices and dendrograms were generated for each primer combination separately and for the combined data of the six primer combinations retained for the final analysis. To construct the similarity matrices, the AFLP banding pattern of each individual was converted to a binary matrix in which bands were scored as present (1) or absent (0). NTSYS v2.1 software (Numerical Taxonomy and Multivariate Analysis for personal computers, Exeter Software, Setauket, NY, USA) was used to evaluate the genetic similarities between the maize BC2F1 lines and the parental lines (P 16 and CM 32). Pair-wise comparisons were made between the genotypes based on Dice similarity coefficients (Dice 1945). The resultant matrix was used to construct a dendrogram using the agglomerative hierarchical un-weighted pair-group method with arithmetic average (UPGMA) sub-program of NTSYS (Rohlf 1998).
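The Dice coefficient used for the pair-wise comparisons can be illustrated with a short sketch. For two binary band profiles it equals 2a / (2a + b + c), where a is the number of bands present in both profiles and b and c are the numbers of bands present in only one of them. The band profiles below are invented, and the function name `dice_similarity` is hypothetical; the sketch does not reproduce the NTSYS implementation.

```python
def dice_similarity(x, y):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 band profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same band positions")
    shared = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # a
    total = sum(x) + sum(y)                                     # 2a + b + c
    if total == 0:
        raise ValueError("neither profile contains any bands")
    return 2.0 * shared / total


# Hypothetical band profiles (1 = band present, 0 = band absent).
recurrent = [1, 1, 1, 0, 1, 1, 0, 1]
bc_line = [1, 1, 0, 0, 1, 1, 1, 1]
no_overlap = [0, 0, 0, 1, 0, 0, 1, 0]

similarity = dice_similarity(bc_line, recurrent)     # 5 shared bands of 6 + 6
identical = dice_similarity(recurrent, recurrent)    # same bands throughout
disjoint = dice_similarity(no_overlap, recurrent)    # no shared bands
```

Because shared absences do not enter the formula, a line that shares no bands with the recurrent parent scores 0, and a line that amplifies exactly the same bands scores 1.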

Results and discussion

A total of 250 BC2F1 plants were screened with the co-dominant lpa1-1 SNP marker. Of these, 28 plants (11%) carried at least one copy of the lpa1-1 gene and were therefore selected for fingerprint analysis with AFLP markers. This number of homozygous recessive and heterozygous plants was much lower than expected. It could be due to lpa1-1 being a single recessive gene and to the fact that no selection for the lpa1-1 gene was practised in the BC1 generation before advancement to the BC2 generation. Selection in the BC1 generation would have provided a higher number of homozygous recessive and heterozygous genotypes in the BC2 generation. The lpa1-1 SNP marker was able to detect homozygous dominant (wild-type/normal), homozygous recessive (LPA) and heterozygous genotypes by difference plots using HRM analysis (Fig. 1). The SNP marker was successfully validated by forward and reverse DNA sequencing in a previous study (Naidoo 2010).

Fig. 1 High-resolution melt difference plots obtained using a SNP assay for the maize lpa1-1 gene in P 16 (recurrent parent), the heterozygote (F1) and three of the selected BC2F1 lines (260, 257, 246), with CM 32 (LPA donor) normalized

In Fig. 1, the difference plot with LPA normalized, showing the normal parental inbred line (P 16), the heterozygote and three BC2F1 lines, is more informative than the melt profiles, as the curve shapes of the BC2F1 lines are more clearly defined: BC2F1 line 260 very closely resembles the CM 32 baseline, and BC2F1 lines 246 and 257 resemble the heterozygote curve shape. There were 11 homozygous recessive (LPA) and 17 heterozygous plants identified with the lpa1-1 SNP marker according to their R² values and curve shapes in the difference plots. The R² values of the heterozygotes fell in a higher and narrower range, from 90.95 to 99.59%, than those of the LPA recessives, which ranged from 82.81 to 99.58% (Table 2). HRM technology has been successfully used in other studies to detect homozygous and heterozygous genotypes based on melt curves and difference plots (Gundry et al. 2003; Wittwer et al. 2003; Reed and Wittwer 2004; Montgomery et al. 2007).

Table 2 R² values of SNP marker analysis and similarity percentage to recurrent parent (P 16) for selected 28 BC2F1 individuals with six EcoRI/MseI primer combinations of AFLP analysis

AFLP markers were chosen over other types of molecular markers to determine the amount of recurrent parent genome in the BC2 lines because of financial constraints and the availability of primers and enzymes. In the initial AFLP screening, ten EcoRI/MseI primer combinations were tested on the two parental lines of the backcross breeding programme. Only eight EcoRI/MseI primer combinations were selected for the 28 BC2F1 lines and the two parental lines. Owing to incomplete data and poor amplification, two of the primer combinations were not included in the combined data set used for the final analysis. The two excluded primer combinations (Eco-AAG Mse-CAT and Eco-ACC Mse-CAC) also showed very high similarity percentage values between P 16 and CM 32, and lower similarity values for the BC lines than for the donor parent. Each individual primer combination was analysed separately and similarity values were obtained.

For the individual primer combinations, similarity values were obtained for each of the 28 BC2F1 lines and CM 32 (Table 3). With the primer combination Eco-ACA Mse-CAT, BC line 75 showed 0% similarity to the recurrent parent because it shared no bands with P 16. The amplification reaction of BC line 75 was repeated with the same result. This could be explained by the primer combination targeting a region in which line 75 differs from the recurrent parent. With the same primer combination, two other BC lines showed 100% similarity to the recurrent parent, having amplified exactly the same bands as the recurrent parent.

Table 3 Similarity percentage values to the recurrent parent (P 16) for the individual EcoRI/MseI selective nucleotide primer combinations used for AFLP analysis of the 28 BC2F1 maize lines and CM 32 (LPA donor)

For the combined data, a total of 277 data points/bands were scored, of which 43 (16%) were non-polymorphic and 234 (84%) were polymorphic between the parental lines of the backcross breeding programme (Table 4). The number of bands scored in each primer combination ranged from 34 (Eco-ACA Mse-CAT) to 59 (Eco-AGG Mse-CAT). The primer combinations Eco-ACA Mse-CAA and Eco-ACA Mse-CAC showed the highest percentage of polymorphic bands (94%), with Eco-ACG Mse-CAA (90%) and Eco-ACA Mse-CAT (88%) also showing high polymorphic rates. Across the primer combinations, non-polymorphic bands ranged from 6 to 28% and polymorphic bands from 72 to 94%. Positive polymorphic bands (coupling markers) are defined as those present only in the recurrent parent (P 16), since the objective is to determine the amount of recurrent parent genome present, whereas negative polymorphic bands are those present only in the donor parent (CM 32). There was a total of 224 (81%) positive polymorphic bands in the combined data, with 10 (4%) negative polymorphic bands (Table 4). In all the primer combinations, a high percentage (>91%) of the polymorphic bands were positive, showing that the recurrent parent generally amplified more fragments/bands than the donor parent.
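The percentages quoted for the combined data follow directly from the band counts, as the following sketch shows. The counts are taken from the text; the variable names are illustrative only.

```python
# Band counts for the combined data, as reported in the text.
total_bands = 277
non_polymorphic = 43
positive_polymorphic = 224  # present only in the recurrent parent (P 16)
negative_polymorphic = 10   # present only in the donor parent (CM 32)

polymorphic = positive_polymorphic + negative_polymorphic


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


summary = {
    "non-polymorphic": percent(non_polymorphic, total_bands),
    "polymorphic": percent(polymorphic, total_bands),
    "positive polymorphic": percent(positive_polymorphic, total_bands),
    "negative polymorphic": percent(negative_polymorphic, total_bands),
}

# Share of the polymorphic bands that are positive (recurrent parent only).
positive_share = percent(positive_polymorphic, polymorphic)

for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
print(f"positive share of polymorphic bands: {positive_share:.1f}%")
```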

Table 4 Levels of polymorphism and non-polymorphism among six EcoRI/MseI selective nucleotide primer combinations used for AFLP analysis of the maize BC2F1 lines

There are numerous genetic diversity studies with AFLP markers on maize using different enzyme primer combinations showing differing polymorphic rates, average number of markers per primer combination and the range of number of markers amplified in each primer combination. The AFLP polymorphism rate differs due to the number of primer combinations used, the number of polymorphic bands produced and the genetic background and number of inbred maize lines used. Examples of results using different EcoRI/MseI primer combinations and bands produced include 284 polymorphic bands with 10 primer pairs (Hartings et al. 2008), 621 (59.8%) polymorphic bands with 16 primer combinations (Wu 2000), 209 polymorphic bands (41.8%) with six primer combinations (Ajmone-Marsan et al. 1998), 261 polymorphic bands (83%) with four primer combinations (Lübberstedt et al. 2000), 232 polymorphic bands with six primer combinations (Pejic et al. 1998) and 408 (81.7%) polymorphic bands with seven primer combinations (Legesse et al. 2007). These studies used more than two inbred lines at a time and none of these studies compared temperate × tropical cross material. This study found 84% polymorphic bands with only six primer combinations tested, which is relatively high considering only two inbred lines were compared. The type of restriction enzyme influences the polymorphism rate, as EcoRI restriction sites are randomly located over the chromosome while PstI is clustered at the methylation-specific sites, and therefore PstI primer combinations show a lower polymorphism ratio (Lübberstedt et al. 2000).

Both polymorphic and non-polymorphic bands were selected as data points to determine the degree of similarity between the parental lines. With the combined data, the lpa1-1 source CM 32 showed a baseline similarity of 26.33% to the recurrent parent, P 16 (unpublished data). Some similarity is to be expected, as both are maize inbred lines, although they differ in pedigree and adaptation (tropical and temperate).

All the selected BC2F1 lines were clearly differentiated from each other and from the parental controls in the combined data dendrogram, based on the Dice coefficient of similarity (Fig. 2). The combined similarity matrix generated by NTSYS successfully differentiated between all 28 BC2F1 lines and the donor parental line based on the similarity values to the recurrent parent (data not shown). A BC2 generation is generally expected to show 87.5% similarity to the recurrent parent. However, in this study, the similarity percentage values for the heterozygotes ranged from 62.12 to 92.15%, while those for the LPA homozygous recessives ranged from 79.19 to 90.32% (Table 2). This confirms the variation around the mean that can be exploited to increase the genetic gain in each BC generation and reduce the number of BC generations required to reach a high percentage of the recurrent parent genome. The BC lines were very similar to the recurrent parent and no groups were identified in the dendrogram because of the close clustering of the BC lines and parental lines.
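The 87.5% expectation follows from the standard rule that, without selection, each backcross halves the remaining donor genome, so that the expected recurrent parent fraction after n backcrosses is 1 − (1/2)^(n+1). The sketch below is illustrative only: the function name is hypothetical, and AFLP similarity is used in this study as an estimate of, not a direct measure of, the recurrent parent genome proportion.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent parent genome fraction after n backcrosses.

    n_backcrosses = 0 corresponds to the F1, which carries half of each parent.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


# Expected percentages for BC1, BC2 and BC3.
expected = {f"BC{n}": 100.0 * expected_recurrent_fraction(n) for n in (1, 2, 3)}

# Observed similarity ranges (%) reported in the text for the BC2F1 lines.
observed = {
    "heterozygous": (62.12, 92.15),
    "homozygous recessive": (79.19, 90.32),
}

# In both classes the best line exceeds the BC2 expectation.
best_exceeds_bc2 = {
    genotype: high > expected["BC2"] for genotype, (low, high) in observed.items()
}
```

The best lines in both genotype classes exceed the BC2 mean and fall between the BC2 and BC3 expectations, which is the variation that background selection exploits.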

Fig. 2 Dendrogram of the fingerprinting of 28 BC2F1 maize lines and the parental lines, P 16 (recurrent parent) and CM 32 (LPA donor), based on six primer combinations of EcoRI/MseI AFLP data, generated using NTSYS v2.1 and the Dice coefficient of similarity

For the combined data of all six primer combinations, 13 BC2F1 lines showed the highest similarity percentages and therefore contained the highest percentage of the recurrent parent genome: seven LPA recessives [235 (90.32%), 215 (86.54%), 264 (86.13%), 117 (84.26%), 256 (83.94%), 136 (83.76%), 158 (83.37%)] and six heterozygous lines [119 (92.15%), 257 (90.65%), 258 (89.80%), 246 (87.14%), 253 (85.79%), 271 (83.33%)]. These can be selected for further improvement in the breeding programme.

According to the clustering on the dendrogram, CM 32 was most closely related to three BC2F1 lines, 150 (38%), 265 (32%) and 75 (27%); these lines therefore have the lowest percentage of recurrent parent genome (P 16). Because of the high level of similarity between the BC2F1 lines, it was not possible to group the lines in the combined data dendrogram (Fig. 2). This is because lines in the BC2F1 generation are expected to carry a high percentage of the recurrent parent genome (87.5%) by this stage.

Studies on rice using AFLPs for background selection in the BC1F1 generation produced 364 polymorphic bands to obtain 81.4% of the recurrent parent genome (Jeung et al. 2005), while a BC3F2 population had 98.8% recurrent parent recovery with 129 polymorphic bands (Chen et al. 2001), and selection in the BC3F1 generation resulted in 84.2–100% recovery of the recurrent parent with 118 polymorphic bands (Zhou et al. 2003). In this study, for introgression of the lpa1-1 gene into the recurrent parent, selection was in the BC2F1 generation with 234 polymorphic bands, resulting in a maximum recovery of 92.15% of the recurrent parent genome. This was higher than the 87.5% recovery expected in the second backcross generation.

Suitable BC2F1 lines would be those that combine high R² values for the SNP marker with high similarity percentage values. The choice between heterozygous and homozygous recessive lines would also be part of the selection process, with homozygous recessive lines being preferred because both alleles of the lpa1-1 gene are present and fixed. All 28 BC2F1 lines will have to undergo further testing under field conditions to determine yield potential, adaptability to local environmental conditions and disease resistance.
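One way to apply these criteria to the 13 lines with the highest similarity values is sketched below. The line numbers and similarity percentages are those reported in the text; the ranking rule (homozygous recessive lines first, then decreasing similarity) is an assumption made for illustration. The SNP confidence values are omitted because they are not listed per line here.

```python
# (line number, genotype, % similarity to P 16) as reported in the text.
candidates = [
    (235, "LPA", 90.32), (215, "LPA", 86.54), (264, "LPA", 86.13),
    (117, "LPA", 84.26), (256, "LPA", 83.94), (136, "LPA", 83.76),
    (158, "LPA", 83.37),
    (119, "het", 92.15), (257, "het", 90.65), (258, "het", 89.80),
    (246, "het", 87.14), (253, "het", 85.79), (271, "het", 83.33),
]

EXPECTED_BC2 = 87.5  # expected recurrent parent genome (%) in a BC2 generation


def rank(lines):
    """Order lines with LPA recessives first, then by decreasing similarity."""
    return sorted(lines, key=lambda c: (c[1] != "LPA", -c[2]))


ranked = rank(candidates)

# Lines whose similarity exceeds the BC2 expectation, in ranked order.
above_expected = [line for line, _, sim in ranked if sim > EXPECTED_BC2]
```

Under this rule, line 235 ranks first. Four lines exceed the BC2 expectation, of which only line 235 is homozygous recessive.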

A promising approach to MAS for LPA mutant detection is the use of SNP markers combined with a high-throughput analysis method such as HRM analysis. This will lead to more effective breeding programmes in which MAS is used to transfer genes of interest and to accelerate the recovery of the elite parent genome in backcross breeding programmes.

Conclusions

The backcross breeding programme to introgress the lpa1-1 gene into a tropical and subtropical adapted inbred line (P 16) was successful in both the foreground and background selection phases with molecular markers. Foreground selection for the lpa1-1 gene was efficient and rapid when the lpa1-1 SNP marker was used with HRM analysis to detect the different alleles of the gene. Owing to the co-dominant nature of the lpa1-1 SNP marker, it was possible to identify the homozygous dominant (wild-type), homozygous recessive (LPA) and heterozygous lines of the BC2F1 population. The marker assay with the SNP marker and HRM proved to be effective in identifying all genotypes and would be less time-consuming than the traditional approach of detecting the lpa1-1 gene. Of the 250 BC2F1 lines screened for the lpa1-1 gene, 11 homozygous recessive (LPA) and 17 heterozygous lines were selected for background selection using AFLP markers. The AFLP technique used six EcoRI/MseI primer combinations, which produced 84% polymorphic bands between the parental inbred lines. The percentage of recurrent parent genome recovered ranged from 62.12 to 92.15%. Thirteen lines showed >83% of the recurrent parent genome, and these are recommended for further advancement and field assessment.