Introduction

Single nucleotide polymorphisms (SNPs) have been widely used in a variety of research areas, such as association studies (Ohnishi et al. 2001), biodiversity assessment (van Tienderen et al. 2002), and genetic map construction (Batley and Edwards 2007). The popularity of SNPs as valuable and efficient molecular markers has increased as these have been demonstrated to be the most abundant of all the classes of molecular markers (Gupta et al. 2001; Hayashi et al. 2004). Furthermore, recent advances in SNP detection techniques allow high-throughput assays and consequently high-density genome-wide scans (Ohnishi et al. 2001; Tsuchihashi and Dracopoli 2002).

High resolution melting (HRM) analysis has been developed to detect SNPs in small PCR amplicons because it is an easy and low-cost method (Hoffmann et al. 2007; Hung et al. 2008; Liew et al. 2004). The term ‘high resolution melting analysis’, used for the characterisation of DNA, appeared in the literature as early as 1974 (Michel et al. 1974; Steinert and Van Assel 1974). However, these analyses initially used large DNA fragments such as total mitochondrial DNA (Michel et al. 1974) and the kinetoplast genome (Steinert and Van Assel 1974), and could detect only large differences in the length and GC content of the DNA. Later melting analysis of shorter PCR amplicons focused mainly on differentiating heterozygous mutations from the homozygous wildtype (Hiratsuka et al. 2002; Toyota et al. 2000) and on the length and GC content of the molecules (Ririe et al. 1997), while the subtle melting differences between different homozygotes remained beyond the power of the analysis. Introduced in 2002 (Reed et al. 2007), contemporary HRM analysis discriminates two homozygous PCR amplicons that differ by a nucleotide variation, as well as their heterozygote (Bennett et al. 2003; Herrmann et al. 2007; Stephens et al. 2008; Zhou et al. 2005). HRM has been developed to detect single nucleotide variations of all types, with most applications in the diagnostic analysis of mutated genes causing human diseases (Stephens et al. 2008; Wittwer et al. 2003; Zhou et al. 2004). The methodology has involved fluorescently labelled or unlabelled probes that are specific to the SNP-containing sequence (Bennett et al. 2003; Zhou et al. 2005). However, it has been demonstrated more recently that the method is almost equally effective when only PCR amplicon melting is performed, negating the need for costly probes (Hoffmann et al. 2007; Krypuy et al. 2007; Liew et al. 2004).
The evolution of HRM techniques from traditional melting curve analysis has been made possible by the invention of new-generation intercalating dyes that can bind double-stranded DNA at saturating concentrations without inhibiting PCR, and by the improved optical and thermal precision of the instruments (Herrmann et al. 2006b). In addition to SNP genotyping, the application of HRM analysis has also been extended to the assessment of DNA methylation (White et al. 2007; Wojdacz and Dobrovic 2007) and microsatellites (Mackay et al. 2008).

Large numbers of SNPs have been discovered and characterised in species for which whole genome sequences are available, such as human (The-International-HapMap-Consortium 2007) and Arabidopsis (Schmid et al. 2003). In almond (Prunus dulcis), six SNPs have been reported and were used for mapping candidate genes (Silva et al. 2005). While 33,189 peach and almond SNPs in 1,156 SNP reports were recorded in the ESTree database (Lazzari et al. 2005, 2008) (http://www.itb.cnr.it/estree/snps.php), these were in silico predictions without further validation details. Many SNPs predicted in this way result from sequencing or other systematic errors and are not real SNPs, so they need to be verified manually and experimentally. As of 1 August 2008, no Prunus SNPs had been registered in the dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/), although some SNPs were converted from RFLP fragments in Prunus mume (Fang et al. 2005) and used for mapping candidate genes in almond (Silva et al. 2005). The aim of the present study was to apply the HRM approach to the development of SNP markers in 25 important almond cultivars, using existing almond and peach ESTs, and to establish a technique for SNP discovery and genotyping in almond and other species.

Materials and methods

Plant material and DNA extraction

A group of 25 almond cultivars was used as a population for SNP validation and discovery. The origin and collection of the cultivars are shown in Table 1. The leaves of these cultivars were collected from the following sources: the Claremont and Alverstoke orchards at the Waite Campus of The University of Adelaide, South Australia; Loxton Research Centre, Loxton, South Australia; and private orchards within South Australia and Victoria (Woolley et al. 2000). Collected leaves were stored at −80°C until required for DNA extraction.

Table 1 Source of the almond cultivars assayed using high resolution melting (HRM) analysis

DNA was extracted from leaf tissue using the method reported previously (Mekuria et al. 1999). Briefly, fresh young leaves were ground to a fine powder in liquid nitrogen and extracted with hot CTAB containing 2-mercaptoethanol and PVP-40T. Contaminating substances were removed with chloroform:isoamyl alcohol and the DNA was precipitated with cold isopropanol and washed in 76% ethanol containing 10 mmol/L CH3COONH4 until it turned white. The purified DNA was dissolved in TE buffer (10 mmol/L Tris–HCl, 0.1 mmol/L EDTA, pH 8.0) and RNA was removed by incubating the sample with DNase-free RNase A. Additional proteins, including RNase, were precipitated with NH4Ac, and the DNA was collected by precipitation with ethanol and dissolved in TE buffer. DNA samples with absorbance ratios above 1.7 (Sambrook et al. 2001) were used for the analysis in this experiment and stored at −20°C until needed.

SNP search in database

A total of 3,864 almond EST sequences were obtained from the NCBI GenBank EST database (http://www.ncbi.nlm.nih.gov/). The sequences were assembled using ContigExpress of the Vector NTI 10 software package (Invitrogen, Sydney, Australia) with a minimum overlap of 40 bases and 95% identity match. SNPs were chosen for HRM analysis when their redundancy score was two or higher (Batley et al. 2003). A SNP search was also conducted in the peach and almond SNP database at ESTree (http://www.itb.cnr.it/estree) to obtain additional putative SNPs, which were then used for SNP development in almond.

Primer design and melting profile analysis

Forward and reverse primers spanning at least one putative SNP were designed for HRM analysis using Primer 3 (Rozen and Skaletsky 2000). The primer pairs were designed to have an annealing temperature of 60 ± 1°C and to give an expected product size of 60–100 bp, with a few exceptions. The primers were analysed using NetPrimer to detect possible secondary structures, i.e. primer dimers, hairpins, palindromes and repeats (http://www.premierbiosoft.com/netprimer/netprimer.html, Premier Biosoft International, Palo Alto, CA), as secondary structures of primers are thought to affect PCR amplification efficiency and therefore HRM accuracy. Secondary structures of the amplicons were analysed using the DINAMelt Server (http://www.bioinfo.rpi.edu/applications/hybrid/twostate-fold.php) (Markham and Zuker 2005). The amplicons were considered appropriate for HRM analysis when the ΔG value of the calculated secondary structure was >−1 (CorbettResearch 2006).

PCR amplification, DNA melting and end point fluorescence level acquiring

PCR amplifications were performed in a total volume of 10 μL on a Rotor-Gene 6500 real-time PCR Thermocycler (Corbett Research, Sydney, Australia), and PCR reaction preparation was automated with a CAS1200 liquid handling system (Corbett Research). The reaction mixture contained 40 ng almond genomic DNA, 1× PCR buffer, 2.5 mM MgCl2, 0.2 mM dNTP, 300 nM forward and reverse primers, 1.5 μM Syto® 9 green fluorescent nucleic acid stain, and 0.5 U Platinum Taq DNA polymerase (Invitrogen). Amplification followed a touchdown PCR protocol: initial denaturation at 95°C for 2 min; then 50 cycles of denaturation at 95°C for 5 s and annealing and extension for 10 s, at 62°C in the first cycle and decreasing by 0.5°C per cycle over the next 10 cycles; and a final extension at 72°C for 2 min. Before HRM, the products were denatured at 95°C for 5 s and then annealed at 50°C for 30 s to randomly form DNA duplexes. HRM was performed as follows: pre-melt at the first appropriate temperature for 90 s, then melt over a 10°C ramp within an appropriate temperature range (Table 2), rising in 0.1°C increments every 2 s. Fluorescence data were acquired at the end of each annealing step during the PCR cycles and at each of the HRM steps, with automatic gain optimisation. The end point fluorescence level was acquired following the melting process by holding at 60°C for 5 min followed by 5 cycles of 60°C for 20 s, with fluorescence data acquired at the end of each cycle step. PCR products were separated on 1.5% agarose or 8% polyacrylamide gels and stained with Gelstar (Cambrex Bio Science Rockland, Rockland, Maine) when required.
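The cycling and melt parameters above can be laid out as explicit temperature schedules. The following is a minimal sketch, not the instrument's own program: it assumes that the annealing temperature is held constant once the 10 touchdown decrements are complete, and the 75.0°C melt start is a hypothetical placeholder for an amplicon-specific value from Table 2. The function names are illustrative.

```python
def touchdown_annealing_temps(start_c=62.0, step_c=0.5,
                              touchdown_cycles=10, total_cycles=50):
    """Annealing/extension temperature for each of the PCR cycles.

    Assumption: the temperature falls by step_c per cycle for
    touchdown_cycles cycles and is then held for the remaining cycles.
    """
    return [start_c - step_c * min(cycle, touchdown_cycles)
            for cycle in range(total_cycles)]


def hrm_ramp(start_c, span_c=10.0, increment_c=0.1, hold_s=2.0):
    """Temperature points of the melt and the time taken by the ramp."""
    n_steps = round(span_c / increment_c)
    temps = [round(start_c + i * increment_c, 1) for i in range(n_steps + 1)]
    return temps, n_steps * hold_s


annealing = touchdown_annealing_temps()
# 75.0 is a hypothetical start temperature, not a value from the paper.
ramp, ramp_seconds = hrm_ramp(start_c=75.0)
print(annealing[:12])
print(len(ramp), ramp[0], ramp[-1], ramp_seconds)
```

Under these assumptions the ramp covers 100 increments, so the melt itself takes 200 s per run regardless of where the 10°C window is placed.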

Table 2 PCR amplicons showing polymorphic HRM profiles in 25 almond cultivars

Amplification quality control and high resolution melting curve

Realtime data acquired during PCR and DNA melting were analysed using the Rotor-Gene 6500 series software (Corbett Research). For data quality control, PCR amplification was analysed through the assessment of the CT value, end point fluorescence level, and the amplification efficiency. The data from low quality amplification as indicated in any of the following three analyses were removed from HRM analysis. (1) CT value of the amplification was assessed using the Quantitation analysis module with manual setting of the threshold in the exponential phase of the run. Runs with CT value of less than 30 were considered suitable for the analysis. (2) The end point fluorescence levels were acquired and examined using the EndPoint Analysis module. Outliers having end point fluorescence less than 50% of average fluorescence of the samples were omitted from analysis. (3) Amplification efficiency was assessed using the Comparative Quantitation analysis module. The data from samples with amplification efficiency lower than 1.4 were eliminated from analysis.
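The three exclusion criteria can be expressed as a single filter. This is a minimal sketch that assumes the per-sample values have already been exported from the instrument software into plain records; the field names, the sample values and the use of the mean over all samples in the run as the "average fluorescence" are assumptions made for illustration.

```python
from statistics import mean


def passes_qc(sample, mean_endpoint, max_ct=30.0,
              min_endpoint_fraction=0.5, min_efficiency=1.4):
    """Apply the three criteria: CT, end point fluorescence, efficiency."""
    return (
        sample["ct"] < max_ct
        and sample["endpoint"] >= min_endpoint_fraction * mean_endpoint
        and sample["efficiency"] >= min_efficiency
    )


def filter_samples(samples):
    """Return the names of samples retained for HRM analysis."""
    mean_endpoint = mean(s["endpoint"] for s in samples)
    return [s["name"] for s in samples if passes_qc(s, mean_endpoint)]


# Hypothetical values, chosen so that each criterion rejects one sample.
samples = [
    {"name": "cv1", "ct": 24.1, "endpoint": 80.0, "efficiency": 1.72},
    {"name": "cv2", "ct": 31.5, "endpoint": 78.0, "efficiency": 1.70},
    {"name": "cv3", "ct": 25.0, "endpoint": 20.0, "efficiency": 1.68},
    {"name": "cv4", "ct": 23.8, "endpoint": 82.0, "efficiency": 1.31},
]
kept = filter_samples(samples)
print(kept)
```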

High resolution melting curve analysis was performed using the HRM analysis module. The melting data were normalised by adjusting the start and end fluorescence signals, respectively, of all the samples analysed to the same levels. Difference plots of the melting data were visualised by selecting a target genotype for comparison. The melting data were also analysed using the Melt analysis module. In this analysis, negative first-derivative melting curves were produced from the fluorescence versus temperature plots. Genotypes were identified by examining normalised, difference and derivative melt plots.
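The three views of the melt data (normalised, difference and negative first-derivative plots) can be illustrated with a few lines of code. This is a simplified sketch, not the algorithm of the Rotor-Gene software: it normalises on the first and last readings only, whereas instrument software typically uses user-selected pre-melt and post-melt regions, and all data points are hypothetical.

```python
def normalise(fluorescence):
    """Rescale a melt curve so the first reading is 100 and the last is 0."""
    start, end = fluorescence[0], fluorescence[-1]
    return [100.0 * (f - end) / (start - end) for f in fluorescence]


def difference(curve, reference):
    """Difference plot: a normalised curve minus the reference genotype."""
    return [c - r for c, r in zip(curve, reference)]


def negative_derivative(temps, fluorescence):
    """-dF/dT by finite differences, reported at interval midpoints."""
    mids, slopes = [], []
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        mids.append((temps[i] + temps[i + 1]) / 2)
        slopes.append(-(fluorescence[i + 1] - fluorescence[i]) / dt)
    return mids, slopes


def peak_temperature(temps, fluorescence):
    """Temperature at which the derivative plot reaches its maximum."""
    mids, slopes = negative_derivative(temps, fluorescence)
    return mids[slopes.index(max(slopes))]


# Hypothetical raw readings for two samples with different melting behaviour.
temps = [80.0, 80.5, 81.0, 81.5, 82.0]
norm_a = normalise([50.0, 48.0, 34.0, 12.0, 10.0])
norm_b = normalise([60.0, 59.0, 56.0, 40.0, 12.0])
diff_b = difference(norm_b, norm_a)  # sample a taken as the reference
print(peak_temperature(temps, norm_a), peak_temperature(temps, norm_b))
```

Normalisation removes differences in absolute signal, so that the difference and derivative plots reflect only the shape and position of the melting transition.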

DNA sequence analysis

Representative genotypes were chosen for sequencing. Fragments amplified from genomic DNA were sequenced to confirm expected SNPs or reveal new SNPs in their amplicons and flanking regions. PCR products containing the HRM amplicons were purified using a PureLink™ PCR Purification Kit following the protocol supplied by the manufacturer (Invitrogen). DNA samples were sent to the Australian Genome Research Facility (Brisbane, Australia) for sequencing where an ABI 3730xl sequencing platform and BigDye Terminator Version 3.1 (PE Applied Biosystems) were employed. The primary and secondary peak calling were made in Australian Genome Research Facility, and all sequencing traces were visually examined to ascertain the base calls made by the software. Sequences with a mixture of unequal lengths were reconstructed manually (Flot et al. 2006) or using an online program Champuru 1.0 (Flot 2007). Sequences obtained were assembled and SNPs in the contigs were identified. Intron sequences were determined by aligning genomic sequences with EST sequences using the BLAST 2 Sequences program (Tatusova and Madden 1999). Open reading frames (ORF) were identified for corresponding EST sequences using ORF Information of The Genome Database for Rosaceae (GDR) (http://www.bioinfo.wsu.edu/gdr/).

Nomenclature of SNP markers

The nomenclature of the SNP markers for almond followed the rule reported for pearl millet (Bertin et al. 2005) with modifications. Taking Xneas001a0234EU919650 (9-cis-epoxycarotenoid dioxygenase) as an example: X indicates a DNA marker; ne identifies a University of New England marker; the letter a corresponds to almond; s symbolises a SNP marker; 001 is a unique number for an EST contig; a identifies the HRM amplicon in the contig; 0234 is the identifier of a specific SNP in the contig; EU919650 is the GenBank accession number of the source sequence; and (9-cis-epoxycarotenoid dioxygenase) is the putative function of the gene, an optional functional identifier. A simplified name of the amplicon is Xneas001a, and a simplified name of the SNP is Xneas001a0234. As HRM analysis is equally effective for some insertions or deletions (INDELs) and microsatellites, the INDEL and microsatellite markers analysed in the study are named following the same rule.
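The naming rule lends itself to a small parser. The sketch below is illustrative only: the field widths (three digits for the contig, four for the SNP identifier) and the accession pattern are inferred from the example name and are assumptions, and the optional functional identifier in parentheses is not handled.

```python
import re

# Field widths are inferred from the example Xneas001a0234EU919650.
MARKER_RE = re.compile(
    r"^X(?P<lab>[a-z]{2})(?P<species>[a-z])(?P<type>[a-z])"
    r"(?P<contig>\d{3})(?P<amplicon>[a-z])"
    r"(?:(?P<snp>\d{4})(?P<accession>[A-Z]{1,2}\d{5,6})?)?$"
)


def parse_marker(name):
    """Split a marker name into its components and simplified names."""
    match = MARKER_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised marker name: {name!r}")
    fields = match.groupdict()
    amplicon_name = "X{lab}{species}{type}{contig}{amplicon}".format(**fields)
    fields["amplicon_name"] = amplicon_name
    fields["snp_name"] = amplicon_name + fields["snp"] if fields["snp"] else None
    return fields


full = parse_marker("Xneas001a0234EU919650")
short = parse_marker("Xneas001a")
print(full["amplicon_name"], full["snp_name"], full["accession"])
```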

Allele frequency estimation and SNP information content

The allele frequency was estimated according to Griffiths et al. (2008), and the formula was modified to accommodate the multi-allelic situation:

$$ p = f_{\text{A/A}} + \frac{1}{2}\sum f_{\text{A/x}} $$

where p stands for the frequency of the allele A, f A/A is the frequency of homozygous genotypes AA, f A/x is the frequency of heterozygous genotypes where allele A appears.
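As an illustration of the multi-allelic formula, the sketch below computes allele frequencies from genotype frequencies. The genotype counts are hypothetical (25 individuals, three alleles including a single base INDEL written as "-") and are not data from this study.

```python
def allele_frequency(allele, genotype_freqs):
    """p = f(A/A) + 1/2 * sum of f(A/x) over heterozygotes carrying A."""
    p = 0.0
    for (first, second), freq in genotype_freqs.items():
        if first == allele and second == allele:
            p += freq          # homozygote contributes its full frequency
        elif allele in (first, second):
            p += 0.5 * freq    # heterozygote contributes half
    return p


# Hypothetical genotype counts in a population of 25 individuals.
counts = {("T", "T"): 10, ("T", "C"): 9, ("C", "C"): 2, ("T", "-"): 4}
total = sum(counts.values())
genotype_freqs = {g: n / total for g, n in counts.items()}

p_t = allele_frequency("T", genotype_freqs)
p_c = allele_frequency("C", genotype_freqs)
p_indel = allele_frequency("-", genotype_freqs)
print(round(p_t, 2), round(p_c, 2), round(p_indel, 2))
```

Because each heterozygote contributes half of its frequency to each of its two alleles, the allele frequencies at a locus always sum to one.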

SNP polymorphic information content (PIC) was determined using the following equation (Botstein et al. 1980).

$$ \text{PIC}_{i} = 1 - \sum_{j = 1}^{n} P_{ij}^{2} $$

where PIC i is the polymorphic information content of a SNP i, Pij is the frequency of the jth pattern for SNP i and the summation extends over n patterns.
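The equation is straightforward to evaluate. The sketch below takes the patterns to be alleles, which is an assumption about how the formula was applied; with that reading, the allele frequencies reported in the Results for Xneas035a0077 (62, 26 and 12%) reproduce its reported PIC of 0.53, and a biallelic SNP with a 2% rare allele gives the reported minimum of 0.04.

```python
def pic(frequencies):
    """PIC_i = 1 - sum_j P_ij^2 for the pattern frequencies of one SNP."""
    return 1.0 - sum(p * p for p in frequencies)


# Allele frequencies reported for Xneas035a0077: T, C and a single base INDEL.
pic_three_alleles = pic([0.62, 0.26, 0.12])
# A biallelic SNP whose rare allele has a frequency of 2%.
pic_rare = pic([0.98, 0.02])
# Two equally frequent alleles give the maximum for a biallelic SNP.
pic_balanced = pic([0.5, 0.5])
print(round(pic_three_alleles, 2), round(pic_rare, 2), pic_balanced)
```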

Genetic distance and clustering

Genetic distance between almond cultivars was calculated as one minus the proportion of alleles shared (PSA), GD = 1−PSA (Bowcock et al. 1994), that was implemented by MICROSAT with 1,000 bootstraps (Minch et al. 1998). Clustering of the SNP data was undertaken by NEIGHBOR using the Neighbor-joining method, and a consensus tree from 1,000 replicates was drawn using the Consensus program of the PHYLIP software package (Felsenstein 1993), and viewed in TreeView (Page 1996).
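The distance measure can be sketched for a pair of diploid individuals as follows. This is one common formulation of the proportion of shared alleles, averaged over loci, and is an assumption for illustration; it is not taken from the MICROSAT source code, and the genotypes are hypothetical.

```python
from collections import Counter


def shared_alleles(genotype_1, genotype_2):
    """Number of alleles (0, 1 or 2) that two diploid genotypes share."""
    c1, c2 = Counter(genotype_1), Counter(genotype_2)
    return sum(min(c1[a], c2[a]) for a in c1)


def genetic_distance(individual_1, individual_2):
    """GD = 1 - PSA, with PSA averaged over all loci (two alleles each)."""
    shared = sum(shared_alleles(g1, g2)
                 for g1, g2 in zip(individual_1, individual_2))
    return 1.0 - shared / (2 * len(individual_1))


# Hypothetical genotypes at three loci for two cultivars.
cultivar_a = [("T", "T"), ("A", "T"), ("C", "C")]
cultivar_b = [("T", "C"), ("A", "T"), ("G", "G")]
gd_ab = genetic_distance(cultivar_a, cultivar_b)
gd_aa = genetic_distance(cultivar_a, cultivar_a)
print(gd_ab, gd_aa)
```

Two cultivars with identical genotypes at every locus have a distance of zero and therefore cannot be separated by the marker panel.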

Results

Identification of SNPs from the EST database

A total of 3,864 almond EST sequences were assembled into 276 contigs, and 17 EST contigs were predicted to have SNPs with a redundancy score ≥2 and adequate flanking sequences for primer design. In addition, 10 EST contigs with putative SNPs were selected from a Prunus SNP database at ESTree for HRM analysis. Altogether, 27 EST contig sequences were used for primer design to validate and/or identify SNPs by HRM analysis.

HRM analysis

Of the 27 EST contigs, HRM analysis demonstrated that 15 had polymorphic melting curves when assayed against the test population of 25 almond cultivars, while 12 were monomorphic, indicating either possible sequencing errors at the putative SNP sites of the ESTs, or that the rare SNP allele was not present in the almond population used in the present study. The polymorphisms identified by HRM analysis included SNPs, INDELs and microsatellite variations (Table 2). A total of 17 PCR amplicons derived from 15 EST contigs are listed in Table 2, as two amplicons were analysed in the sequences obtained from each of the EST contigs Xneas012 and Xneas035. Flanking sequence information and related GenBank accession numbers of the SNPs are available in the dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/). All four classes of SNPs (Liew et al. 2004) were detected by HRM analysis, and the genotypes with different SNP alleles were distinguished by distinct melting profiles. A/T transversions have been considered to be the SNP variation most difficult to resolve by melting analysis (Liew et al. 2004; SantaLucia 1998). In this study, the amplicon Xneas031b with A/T variation was distinctly differentiated (Fig. 1 g, h).

Fig. 1

High resolution melting (HRM) curve profiles of four HRM amplicons. a Xneas280a with a class 1 SNP showing two genotypes in normalised melting plot, b difference plot of a, c Xneas109a with a class 2 SNP showing three genotypes in normalised melting plot, d difference plot of c, e Xneas012b with a class 3 SNP showing three genotypes in normalised melting plot, f difference plot of e, g Xneas031b with a class 4 SNP showing three genotypes in normalised melting plot, h difference plot of g, note the curve separation of two homozygotes

Single, double and multiple SNPs were present in the amplicons designed for HRM analysis. For a single SNP amplicon, two or three genotypes were detected by the HRM analysis as shown by the normalised melting curve and difference plot (Fig. 1 a–h). Four genotypes were identified in an HRM amplicon (Xneas001a) having double SNPs (Fig. 2). The melting curve of homozygous individuals had a single melting domain, and heterozygous individuals produced two or more as shown by normalised and derivative plots (Fig. 2). The genotypes heterozygous at one SNP site produced two melting domains and a genotype heterozygous at two SNP sites produced four melting domains.

Fig. 2

High resolution melting (HRM) curve profiles of a HRM amplicon Xneas001a with double SNPs. a Normalised plot showing four genotypes in twenty-five almond cultivars, b difference plot of a, c normalised plot of a homozygous individual showing one melting domain (arrow), d derivative plot of c, e normalised plot of an individual heterozygous at one SNP site (T/C) showing two melting domains (arrows), f derivative plot of e, g normalised plot of an individual heterozygous at one SNP site (T/C) showing two melting domains (arrows), h derivative plot of g, i normalised plot of an individual heterozygous at two SNPs (T/C and T/C) showing four melting domains (arrows), j derivative plot of i

INDELs of one or more nucleotides were detected in this study. Two HRM assayed fragments were found to have INDELs in one or multiple positions. Figure 3 shows part of the sequence of Xneas012b, and its HRM profiles of the genotypes. In this amplicon, there were five positions of variations including one SNP and four nucleotide INDELs. The variations resulted in five variants of sequences. Consequently, in the 25 almond cultivars, five genotypes were identified. The heterozygous genotypes for the INDELs, i.e. genotype 4 and 5, showed significant melting domain transitions.

Fig. 3

Sequence alignment and HRM profile of an amplicon Xneas012b containing one SNP and INDELs at four nucleotide positions. a Alignment of 5 sequence variants (Roman numerals) occurred at 5 nucleotide positions (arrows), b normalised plot of 5 genotypes (Arabic numerals) consisting of different variants (Roman numerals), c and d difference and derivative plots of b

A microsatellite (Xneas124a0250) was also observed to have significant HRM differentiation among the genotypes. The HRM analysis of the tri-nucleotide microsatellite is shown in Fig. 4. A 92-bp amplicon of the microsatellite was designed from EST consensus sequences for HRM analysis. HRM analysis revealed five genotypes in the almond population (Fig. 4a–c), and three alleles at 92, 89 and 83 bp were identified by polyacrylamide gel electrophoresis (Fig. 4d) and the polymorphism was confirmed by sequencing (Fig. 4e). Similar to the HRM result of the INDEL Xneas012b, two heterozygous genotypes (i.e. 4 and 5) showed significant melting domain transition.

Fig. 4

HRM, polyacrylamide electrophoresis profile, and sequence alignment of a microsatellite amplicon Xneas124a. a HRM normalised plot of 5 genotypes (Arabic numerals), b and c difference and derivative plots of a, d a polyacrylamide gel micrograph showing five genotypes with three alleles, I = 92, II = 89 and III = 83 bp, e sequence alignment of 3 alleles (I, II and III)

Mixture HRM assay

Genotypes can be investigated by mixing equivalent amounts of the PCR products of two genotypes. The rationale is that a mixture of any two different homozygous genotypes of the same SNP will produce a melting curve corresponding to that of their heterozygous genotype. In contrast, a mixture of two identical genotypes will produce a melting curve identical to that of the genotype, whether homozygous or heterozygous. This approach can be applied to confirm or identify homozygous genotypes and their corresponding heterozygous genotypes. It is particularly practical when the HRM profile shows more melting curve types than expected, or when the HRM curves of some individual samples show a slight shift from the major types of HRM curves. As shown in Fig. 5, Nonpareil and Lauranne produced homozygous HRM curves that were not consistent with the HRM curve of the homozygous TT genotype. If these two cultivars carried a different SNP allele, the mixture of their PCR products with the PCR product of a TT individual would produce a heterozygous HRM curve. However, the mixture of equivalent amounts of Nonpareil and Lauranne PCR products with Keanes Seedling (TT) PCR product showed HRM curves the same as that of Milo, another homozygous TT individual. This result indicated that Nonpareil and Lauranne were homozygous TT as well.
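The rationale of the mixture assay can be stated as a small model of allele proportions. This is an idealised sketch, assuming that both PCR products are present in equal amounts and that a mixture melts like a heterozygote when it contains two alleles in a 1:1 ratio; the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction


def allele_fractions(*genotypes):
    """Allele fractions in an equal-amount mix of diploid PCR products."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: Fraction(n, total) for allele, n in counts.items()}


def melts_like_heterozygote(genotype_1, genotype_2):
    """True if the mix has the 1:1 two-allele ratio of a heterozygote."""
    fractions = allele_fractions(genotype_1, genotype_2)
    return len(fractions) == 2 and set(fractions.values()) == {Fraction(1, 2)}


# Two different homozygotes mixed together behave like their heterozygote.
different = melts_like_heterozygote(("T", "T"), ("C", "C"))
# An unknown sample that is really TT, mixed with a TT reference, stays TT.
same = melts_like_heterozygote(("T", "T"), ("T", "T"))
print(different, same)
```

In the case of Fig. 5, the second outcome is the one observed: the mixture with a TT reference remained homozygous, so the unknown samples carry the T allele only.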

Fig. 5

HRM analysis of mixed genotypes of SNP Xneas076a0281. a HRM profile showing Nonpareil and Lauranne with melting curve of unknown genotype (?, red). b Mixture of equivalent volumes of samples of homozygous TT (blue) and unknown genotypes (Nonpareil and Lauranne) produced a curve (pink) identical to the curve of homozygous TT (blue). This proved that the unknown genotypes of Nonpareil and Lauranne were homozygous TT

SNP validation and discovery by sequencing

The genomic DNA fragments comprising HRM amplicons were sequenced to validate the SNPs predicted from ESTs and to uncover new SNPs. In total, 100 SNPs (including single nucleotide INDELs) were recorded from 11,412 bp of genomic DNA. The average frequency of SNPs was 1:114 bp in the sequenced DNA, which represents the genic regions of the genome. The respective frequencies in coding regions, introns and un-translated regions (UTRs) were 1:157, 1:130 and 1:51. The 100 SNPs consisted of 51 transitions (51%), 44 transversions (44%) and 5 INDELs (5%), resulting in an average transition to transversion ratio of 1.16:1. The distributions of nucleotide transitions and transversions varied between different sections of the genes: transitions accounted for 57.1, 54.5 and 40.6% of SNPs, and transversions for 34.3, 45.5 and 53.1% of SNPs, in the coding regions, introns and UTRs, respectively.
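The summary statistics follow directly from the counts given above, as the short calculation below shows. All inputs are values reported in this section; only the variable names are introduced here.

```python
total_bp = 11412        # sequenced genomic DNA (bp)
total_snps = 100        # SNPs including single nucleotide INDELs
transitions, transversions, indels = 51, 44, 5

bp_per_snp = total_bp / total_snps          # average spacing between SNPs
ts_tv_ratio = transitions / transversions   # transition : transversion

print(f"1 SNP per {bp_per_snp:.0f} bp")
print(f"transition to transversion ratio {ts_tv_ratio:.2f}:1")
```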

SNP informativeness

Two alleles and two to three genotypes were observed in the almond population for most of the SNPs analysed with HRM (Fig. 6). The frequencies of the rare alleles ranged from 2 to 50% in the population. The frequencies of major homozygous and heterozygous genotypes ranged from 24 to 96% and 4 to 72%, respectively. Thirteen out of 22 (59%) SNPs analysed only by HRM did not contain homozygous genotypes of the rare alleles, and the rest of the SNPs had 1–6 such genotypes out of 25 cultivars. The SNP Xneas035a0077 had three alleles at a nucleotide position comprising a major allele T (62%), one rare allele C (26%) and a single base INDEL (12%). In the SNP Xneas031b0365, two alleles A and T were equally distributed in the population, and interestingly, the genotype distribution of this SNP was in a ratio of 1:2:1 (AA:AT:TT, P < 0.05) resembling the allele distribution in a random mating population. In addition, the PIC of the SNPs varied from 0.04 (Xneas012b1332) to 0.53 (Xneas035a0077) at an average of 0.31.

Fig. 6

Frequencies of SNP alleles and genotypes in the almond population. a Allele frequencies of the SNPs; each SNP has 2 or, rarely, 3 alleles whose frequencies together amount to 100%, b frequency distribution of rare alleles, c genotype frequencies of the SNPs; each SNP has 2 or 3 genotypes whose frequencies together amount to 100%, d frequency distribution of homozygous genotypes with two rare alleles

Application of HRM-SNP in classification of almond

Twenty-five almond cultivars were genotyped through HRM analysis and the data were used for the construction of a clustering dendrogram (Fig. 7). The results showed that the panel of 17 genic fragments including SNPs, INDELs and a microsatellite was able to resolve the genetic differences between these cultivars, except between Price and Pethick Wonder. At the hierarchy levels where a small group of cultivars was clustered, the bootstrap values were comparatively high in most of the clusters, whereas at higher hierarchy levels the bootstrap values became much lower (mostly <100). Therefore, the cluster tree can reliably group only cultivars with close relationships. The 25 almond cultivars were divided into six groups. All in One, a hybrid cultivar of almond and peach, is the only cultivar clustered as a sole member, in group I. Australian cultivars were clustered mainly in group II, with the Californian cultivar Thompson 1. The only Israeli cultivar in the study, Alnem 88, was clustered with Price, a Californian cultivar, and Pethick Wonder, an Australian cultivar. Four French cultivars grouped closely in group IV, which also included an Australian cultivar, Baxendale, showing a more distant relationship with the French cultivars. Californian cultivars fell mostly in groups V and VI, mingled with a few Australian cultivars and one French cultivar.

Fig. 7

A dendrogram showing the genetic similarity clustering of 25 almond cultivars with 17 HRM amplicons including SNPs, INDELs and a microsatellite. The genetic distance was calculated as one minus the proportion of alleles shared (PSA), and clustering was produced using Neighbor-joining method. The dendrogram was a consensus from 1,000 replicates; bootstrap values >400 are shown. 1 Australian cultivars, 2 Californian cultivars, 3 French cultivars, 4 Israeli cultivar

Discussion

In total, 17 PCR amplicons designed from 15 EST contigs or genomic DNA introns that contain SNPs, INDELs and microsatellites were investigated using HRM based on the putative SNP information obtained from the almond and Prunus EST database. Through sequencing of the HRM amplicon and flanking regions, 100 SNPs including single nucleotide INDELs were determined in the population. HRM profiles of the fragments and assay techniques were established and the resulting SNP data were used to cluster the almond cultivars.

HRM is a novel, homogeneous, closed-tube post-PCR method that can be applied to analyse genetic variations including SNPs, length polymorphisms and methylation of DNA in PCR amplicons (Herrmann et al. 2007; White et al. 2007). In comparison with conventional PCR, HRM requires only one additional step, the melting process following cycling, and one additional reagent, a generic DNA fluorescence dye. The time and cost of the analysis are therefore similar to those of conventional PCR, and the post-PCR separation required by many other assays is not needed. A HRM assay thus has the advantages of speed, simplicity and lower cost. The cost of HRM analysis has been reported as $1.50 per sample (Hung et al. 2008). The cost is estimated to be $0.70 in our experimental system because we used a smaller reaction volume (10 μL) as opposed to the 25 μL used in Hung et al. (2008) and many other publications. The time taken for the assay is less than 2 h for up to 100 samples.

HRM has been applied to detect mutations in known genes, mostly in humans (Bennett et al. 2003; Herrmann et al. 2006a; Kennerson et al. 2007; Willmore-Payne et al. 2005). The use of this approach for SNP analysis in plant species is very limited (Lehmensiek et al. 2008). This study extended the application of the HRM method to the development of SNP markers by designing HRM assays based on putative SNPs from EST databases. This approach takes advantage of the existing EST databases and avoids unnecessary sequencing of putative SNPs in amplicons with invariant HRM curves in the test population. In our experiment, homozygous and heterozygous genotypes of all four SNP classes proved to be distinguishable. In addition, the assay was able to resolve other variations including INDELs, microsatellites and complex multiple SNP amplicons.

HRM profiles of double heterozygous amplicons have been reported (Herrmann et al. 2007). In the present study, we characterised two complex HRM amplicons: Xneas001a, containing double SNPs (Fig. 2), and Xneas012b, containing multiple nucleotide variations including a SNP and INDELs at four positions, three of which are consecutive (Fig. 3). The Xneas001a amplicon showed one homozygous, two single heterozygous and one double heterozygous genotype. The double heterozygous amplicon of Xneas001a had four melting domains, compared with the three melting domains of a double heterozygous amplicon reported by Herrmann et al. (2007). This may be attributable to the larger predicted ΔT m of the two heteroduplexes of the amplicon than that of HBB c. (9C>T; 20A>T) in Herrmann et al. (2007). The predicted difference in melting temperatures (ΔT m) between the two heteroduplexes of Xneas001a double heterozygotes was 0.7°C, whereas the ΔT m of the two heteroduplexes of HBB c. (9C>T; 20A>T) was only 0.1°C. Xneas012b is another HRM amplicon involving a double heterozygous genotype in this study. The variations in this amplicon, however, are single base deletions at two positions. Two melting domains with a significant melting transition were observed (Fig. 3, I + IV). Similar HRM melting curves were obtained from the heterozygous genotype with a three-nucleotide deletion in Xneas012b (Fig. 3, III + V, and I + IV), and from two genotypes of the tri-nucleotide microsatellite Xneas124a (Fig. 4, I + III and II + III). Melting analyses of a single heterozygote with a deletion have been described previously, for deletions of 1 bp (Lipsky et al. 2001) and of up to 32 bp (Rasmussen and Werge 2007). Nevertheless, we are not aware of other melting analyses of a double heterozygous genotype with two deletion sites.

In the present study, equivalent amounts of two PCR products were mixed to determine their variation types. HRM measures DNA dissociation in solution, as do all other dissociation analyses of DNA molecules. Together with the characteristics of the DNA, the composition and concentration of ions and the volume of the solution in the assay system play important roles in the measurement. To compare melting curves between samples, homogeneous assay conditions are necessary. As in all other laboratory work, the occurrence of systematic variations is inevitable in HRM assays. These include unforeseen variations in template DNA concentration, and subtle increases in ion concentration due to sporadic evaporation in some samples. The mixture experiment described here is a simple way to examine the results generated by HRM with minimum effort, as the assayed PCR products can be reused following the initial HRM analysis. The result shown in Fig. 5 led to the proper interpretation of the two deviating melting curves of SNP Xneas076a0281. Moreover, when a class 4 (A/T) SNP is assessed, the two homozygous genotypes are very likely to show only subtle differentiation. In such circumstances, the HRM results can be verified using a mixture experiment. This novel application of mixture experiments in HRM genotype calling will be a valuable tool for trouble-shooting and for confirming genotypes.

As expected, the SNP frequencies were lowest in the coding region (1:157), moderate in the intron (1:130), and highest in the UTR (1:51). The result is consistent with the findings in other taxa (Ching et al. 2002; Salisbury et al. 2003). The average ratio of transition to transversion was 1.16:1 in almond, which is low compared with that in other species: 2.45:1 in human (Salisbury et al. 2003) and 1.53:1 in maize (Ching et al. 2002). Transition bias over transversion has been considered universal in the genome (Rosenberg et al. 2003; Wakeley 1994), although contrary results have been described in grasshopper pseudogenes (Keller et al. 2007). The occurrence of transition bias has been considered to be partially due to cytosine methylation (Shen et al. 1994). Therefore, the low transition bias may reflect low methylation levels in the almond genome, and this could be significant because of the role of methylation in epigenetics and imprinting (Martienssen and Colot 2001). Despite the overall transition bias, we observed a low transition to transversion ratio (1:0.78) in the UTR regions. As the sample of UTR sequences was small (1630 bp with 32 SNPs), this result is not yet conclusive.
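
The two summary statistics used in this paragraph can be sketched in a few lines of Python. The function names and the example allele pairs are hypothetical; only the UTR figures (32 SNPs in 1630 bp) are taken from the text. A transition is a purine-to-purine or pyrimidine-to-pyrimidine substitution, and any other substitution is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(allele1, allele2):
    """True for A/G or C/T substitutions, False for transversions."""
    pair = {allele1.upper(), allele2.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {allele1}/{allele2}")
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Transition to transversion ratio for a list of (allele1, allele2)."""
    transitions = sum(is_transition(a, b) for a, b in snps)
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("no transversions: ratio is undefined")
    return transitions / transversions


def bp_per_snp(n_snps, length_bp):
    """Average number of base pairs per SNP (the N in a 1:N frequency)."""
    return length_bp / n_snps


# UTR figures from the text: 32 SNPs in 1630 bp, i.e. about one SNP per 51 bp.
print(f"1:{bp_per_snp(32, 1630):.0f}")

# Hypothetical SNP list: two transitions and two transversions.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]))
```

Applied to the UTR data, `bp_per_snp` reproduces the 1:51 frequency reported above.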

The genotyping data from HRM analysis were used to investigate the relationships between almond cultivars. The panel of 17 genic fragments, including SNPs, INDELs and a microsatellite, discriminated all the cultivars from one another except Price and Pethick Wonder, which are very closely related (Woolley et al. 2000). The clustering tree, however, could not define the precise relationships between the groups of cultivars, as the bootstrap values were small at the higher hierarchy levels. Defining these relationships will require more SNPs or a combination of different marker types. Woolley et al. (2000) and Xie et al. (2006) fingerprinted almond cultivars from different countries and constructed dendrograms using RAPD and SSR markers. In the present study, most cultivars with common origins clustered together and showed similar relationship patterns. This agrees with the previous findings (Woolley et al. 2000; Xie et al. 2006), although our groups were smaller. The average PIC value was 0.31, which corresponds to the heterozygosity index of Nei (1987). This indicates high heterozygosity in this panel of SNPs, which makes them valuable markers for map construction in almond populations.
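
For reference, the sketch below computes the PIC and Nei's gene diversity from allele frequencies. It assumes the commonly used definitions, PIC = 1 − Σp_i² − ΣΣ 2p_i²p_j² (summed over allele pairs i < j) and H = 1 − Σp_i²; the exact formulas and software used in this study are not restated here, and the function names and example frequencies are hypothetical. Under these definitions a strictly biallelic marker cannot exceed PIC = 0.375 (at equal allele frequencies), which gives a scale against which an average of 0.31 can be read.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Nei's gene diversity: H = 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = H - sum over allele pairs i < j of 2 * p_i^2 * p_j^2."""
    h = gene_diversity(freqs)
    return h - sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))


# Hypothetical allele frequencies for biallelic markers.
for freqs in ([0.5, 0.5], [0.7, 0.3], [0.9, 0.1]):
    print(freqs, round(gene_diversity(freqs), 3), round(pic(freqs), 3))
```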

HRM has been rapidly adopted for SNP detection, mostly in diagnostic analysis and in scanning for mutations in genes causing human diseases. We have shown this approach to be equally valuable in plant SNP discovery and identification. The advantages of the technique include low cost, simplicity, high-throughput capability and accuracy. While many high-throughput SNP detection approaches, such as SNP microarrays, are cost-efficient for whole-genome scans in species for which genome-wide SNP information is available, they are expensive for assaying a small number of SNPs. We have demonstrated that HRM is a feasible means for such assays. As more DNA sequence information becomes available for species such as almond, HRM will be a valuable method for SNP detection and genotyping. It is particularly useful in plant cultivar identification, genetic mapping, QTL analysis, diagnosis of pathogenic species, and gene discovery. Furthermore, data produced by HRM analysis are portable and therefore suitable not only for inter-run genotype comparison but also for library-based database construction (Stephens et al. 2008). This feature may facilitate international collaborative efforts in SNP-based genotyping and, in turn, genetics and biodiversity studies using HRM analysis.