Introduction

Date palm (Phoenix dactylifera L.) is an important fruit crop of the family Arecaceae, mostly grown in the arid regions of Africa, the Middle East, and South Asia (Al-Farsi and Lee 2008). It is one of the oldest known fruit trees, cultivated for at least 5,000 years, and is reported to have originated in southern Iraq or the western Indian subcontinent (Zohary and Hopf 2000). The economic importance of date palm is due to its nutritionally valuable fruit, which contains 72–88% sugar as well as minerals (i.e., iron, potassium, calcium, chlorine, copper, magnesium, sulfur, and phosphorus), amino acids, and vitamins (Al-Shahib and Marshall 2003). Moreover, antioxidant and antimutagenic activities of date fruit have also been reported (Vayalil 2002). The date palm tree can grow well in deserts with harsh climatic and soil conditions where the growth of other crops could be relatively difficult. Hence, the date palm offers highly nutritious food in such areas (Al-Farsi et al. 2005). In 2006, world date production was about seven million tonnes (www.faostat.fao.org).

Recently, a number of studies have addressed issues of genetic diversity among fruit-bearing plants including date palm (Zhang et al., 2011; He et al., 2011; Tanya et al., 2011; Xie et al., 2011). Younis et al. (2008) identified sex-specific DNA markers for date palm using RAPD and ISSR techniques. Similarly, the RAPD-PCR approach has been proposed for date palm cultivar identification (Sedra et al. 1998; Al-Khalifah and Askari 2003; Abdulla and Gamal 2010). DNA polymorphism studies of selected cultivars revealed high genetic diversity in date palm (Elshibli and Korpelainen 2008; 2009a; 2009b). Marqués et al. (2008) identified a set of RNAs transcribed from the chloroplast genome that are reported to be involved in brittle leaf disease of date palm. The chloroplast is an essential organelle of photosynthetic cells. In angiosperms, chloroplast DNA (cpDNA) is a highly conserved, double-stranded, circular molecule ranging in size from 120 to 220 kb (Gao et al. 2010; Khan et al. 2010). Typical chloroplast DNA consists of large and small single-copy regions (denoted as LSC and SSC, respectively), which are separated by two inverted repeat regions (denoted as IRA and IRB; Ravi et al. 2008). The availability of complete plastid genome sequences from different clades of autotrophs has greatly clarified the organization and evolution of this interesting cellular organelle. Moreover, comparative chloroplast genomics can provide new knowledge regarding the phylogenetics of green plants. Here we report the chloroplast genome sequence of the date palm cv. ‘Aseel’ grown in Pakistan, obtained using Sanger-based and next-generation sequencing technologies. Initially, in June 2009, we submitted the sequence of the inverted repeat region of cv. ‘Aseel’ cpDNA to GenBank (accession number FJ212316). The complete sequence was submitted to GenBank in April 2010.
While this paper was in preparation, the date palm chloroplast genome sequence from another cultivar, cv. ‘Khalas’ grown in the Saudi Arabian peninsula, was published (Yang et al. 2010). Hence, the detailed comparison of the cpDNA sequences from both cultivars of date palm, as well as a comparison with shorter date palm chloroplast DNA sequences available in GenBank, is presented here. A comparison of date palm cpDNA with available monocot species has also been carried out.

Materials and Methods

Plant Material

Fresh leaves were collected from a single young date palm tree of cultivar ‘Aseel’, cultivated in the botanical garden of the University of Karachi, Karachi, Pakistan. Voucher specimens are kept at the Herbarium, Department of Botany, University of Karachi under voucher specimen number: 02 and General Herbarium number: 75539.

DNA Isolation and Sequencing

A combination of Sanger-based and next-generation sequencing strategies was used for DNA sequencing. The date palm cv. ‘Aseel’ leaves (3.0 g) were processed for isolation of total DNA (20 μg) using a modified CTAB method (Porebski et al. 1997) and the commercially available Biopsin plant genomic DNA extraction kit (Bioer Technology, Hangzhou, PR China). Initially, a primer walking strategy termed “ASAP: Amplification, sequencing & annotation of plastomes” (Dhingra and Folta 2005) was used for amplification and Sanger-based sequencing of the inverted repeat region of cpDNA. Briefly, purified date palm DNA (20 μg) was used to generate 6.0 kb amplicons with a consensus set of primers (Dhingra and Folta 2005). The 6.0 kb amplicons were then used as templates to generate 1.0 kb fragments using internal sets of primers corresponding to the 6.0 kb amplicons. Later, gap-filling primers were designed to close the gaps within the inverted repeat region (Table 1). Sanger-based sequencing of these fragments was carried out using a CEQ8000 Genetic Analyzer (Beckman Coulter Inc. USA) and an ABI3130 Analyzer (Applied Biosystem, USA). For cycle sequencing reactions, the DTCS kit (Beckman Coulter Inc. USA) and Big Dye Terminator kit (Applied Biosystem Inc, USA) were used, with conditions as recommended by the suppliers.

Table 1 Primers used for gap filling while sequencing inverted repeat (IR) regions of cpDNA date palm cv. Aseel

Complete sequencing of the date palm cpDNA was carried out by next-generation sequencing technology. For this purpose, a chloroplast-rich fraction was prepared from 10 g of date palm cv. ‘Aseel’ leaves followed by DNA purification (Triboush et al. 1998). The date palm DNA (7.0 μg) was then used for construction of paired-end libraries with insert size of 250 bp according to the protocol provided by the supplier (Illumina Inc. San Diego, USA). The massively parallel sequencing was carried out by the “sequencing by synthesis” approach using the HiSeq2000 system (Illumina Inc., San Diego, USA) in BGI, Shenzhen, China.

Genome Assembling, Annotation, and Analysis

The contiguous sequences obtained from Sanger-based sequencing were assembled using the Lasergene package version 7.1 (DNASTAR Inc., Madison, WI, USA). The sequencing data from the HiSeq2000 system was assembled using CLC Genomics Workbench version 3.5.1 (CLC bio, Denmark). The assembled sequences were combined using CLC Genomics Workbench (CLC bio, Denmark). Genome annotation was performed through the DOGMA server (Dual Organellar Genome Annotator; Wyman et al. 2004), ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/), and BLAST (Altschul et al. 1990). In addition, annotation of some tRNAs was performed using tRNAscan-SE (Lowe and Eddy 1997) and after similarity searches with other annotated plastomes. The beginnings and ends of genes were manually adjusted. Repeat analysis was performed using the REPuter program (Kurtz et al. 2001). A circular genome map of date palm cpDNA was constructed using the GenomeVx online tool (Conant and Wolfe 2008). The GeneOrder server was used for gene-order analysis (Celamkoti et al. 2004). Construction of multiple alignments and phylogenetic trees of complete cpDNA sequences was carried out by the mVISTA comparative genomics tool (Frazer et al. 2004). The maximum parsimony (MP)-based phylogenetic analysis of 25 protein-coding genes, i.e., matK, petA, petB, petD, petG, petN, psaB, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbN, psbT, rpoB, rpoC1, rpoC2, rps8, rps11, rps14, and ycf3 was done by MEGA4 (Tamura et al. 2007).

Results and Discussion

Genome Assembling and Organization

We carried out complete chloroplast genome sequencing of date palm (P. dactylifera L.) cv. ‘Aseel’ grown in Pakistan using Sanger-based and next-generation sequencing methods. Initially, 22,918 bp of the inverted repeat (IR) region were sequenced using the ASAP protocol (see Methods; Dhingra and Folta 2005; GenBank accession number FJ212316). The primers reported by Dhingra and Folta (2005) resulted in 84.0% coverage of the IR region. Subsequently, ten primers were designed (Table 1) to fill the gaps within the IR region, resulting in up to 96.5% sequence coverage of this region, i.e., 26,316 bp. The HiSeq2000 system (Illumina Inc. San Diego, USA) gave 2,197,575 high-quality paired-end reads with an average length of 73.5 bp. The Illumina reads were filtered (10% as default) so that no ambiguities remained. From these data, 267,669 reads (12.18% of all reads) were assembled into the complete chloroplast genome of date palm cv. ‘Aseel’ with an average of 124X coverage, using the cpDNA of the Saudi Arabian date palm cv. ‘Khalas’ as reference (Yang et al. 2010). The unassembled reads (87.81%) were mostly from the nuclear genome, owing to nuclear DNA contamination during chloroplast DNA isolation. ‘Chimeric’ reads consisting of parts of nuclear and chloroplast DNA may be expected, because nuclear copies of chloroplast DNA are present in all plants sequenced to date (e.g., Matsuo et al. 2005; Tuskan et al. 2006), and likewise for mitochondrial DNA (e.g., Hirai and Nakazon 1993; Tuskan et al. 2006). However, given the short read length (72 bp) of the Illumina data, it would be difficult to identify such reads without ambiguity, as no complete nuclear or mitochondrial genomes of date palm are available to date (i.e., up to March, 2011). The read-matching cut-off was set to 90–95% during reference-based assembly with the CLCBio workbench assembler. Yang et al. (2010) described difficulties in assembling ‘454’ (also called GS FLX technology, Roche Applied Science, Germany) next-generation sequencing data across mono-nucleotide stretches. The ‘454’ technology has been reported to be error prone when sequencing stretches of mono-nucleotide repeats. In contrast, cv. ‘Aseel’ was sequenced with the Illumina technology, which has the advantage of sequencing such homopolymer stretches precisely (Mardis 2008).
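The coverage figure quoted above can be checked with a back-of-the-envelope calculation. The following is a minimal sketch, assuming that mean coverage is estimated as (assembled reads × mean read length) / genome length; the function names are hypothetical and are not part of the CLC Genomics Workbench output.

```python
# Figures taken from the text; the depth formula is the usual rough estimate
# and is an assumption about how the 124X value was derived.
TOTAL_READS = 2_197_575
ASSEMBLED_READS = 267_669
MEAN_READ_LENGTH_BP = 73.5
GENOME_LENGTH_BP = 158_458


def assembled_fraction(assembled: int, total: int) -> float:
    """Fraction of all reads that were assembled onto the chloroplast reference."""
    return assembled / total


def mean_coverage(assembled: int, read_length: float, genome_length: int) -> float:
    """Approximate mean depth: total assembled bases divided by genome length."""
    return assembled * read_length / genome_length


if __name__ == "__main__":
    frac = assembled_fraction(ASSEMBLED_READS, TOTAL_READS)
    depth = mean_coverage(ASSEMBLED_READS, MEAN_READ_LENGTH_BP, GENOME_LENGTH_BP)
    print(f"assembled fraction: {frac:.2%}")
    print(f"mean coverage: {depth:.0f}X")
```

Under this assumption the estimate agrees with the reported 12.18% of reads and 124X coverage.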

The chloroplast genome sequence of date palm cv. ‘Aseel’ had a total length of 158,458 bp, with two IRs of 27,276 bp each separated by a large single-copy region of 86,195 bp and a small single-copy region of 17,711 bp. The genome was 4 bp shorter than the chloroplast genome of Saudi Arabian cv. ‘Khalas’ (total size 158,462 bp; GenBank accession GU811709). The genome contained 59% coding and 41% non-coding regions including pseudogenes, introns, and intergenic spacers. A total of 138 genes were present including pseudogenes Ψycf15, Ψycf68, and a Ψycf1 (short pseudo copy of ycf1 gene; Table 2). Out of 89 protein-coding open-reading frames, 16 genes contained introns. Among these, the clpP, rps12, and ycf3 genes contained two introns each. Date palm chloroplast DNA contained 38 genes for tRNAs (30 distinct genes), and of these, eight tRNA genes contained introns. Four rRNA genes were confined to and duplicated in the IR regions. As a whole, 20 complete genes (including the ycf15 pseudogene) and one 3′-exon of the rps12 trans-splicing protein were duplicated in the IR regions. Of these 20 genes, eight were tRNA, four were rRNA, and seven were protein-coding genes.
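The reported region sizes can be cross-checked against the total genome length. This is a minimal sketch, assuming the standard quadripartite layout in which the total equals LSC + SSC + 2 × IR; the constant and function names are illustrative only.

```python
# Region sizes as reported in the text (bp).
LSC_BP = 86_195
SSC_BP = 17_711
IR_BP = 27_276
KHALAS_TOTAL_BP = 158_462


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: two single-copy regions plus two IR copies."""
    return lsc + ssc + 2 * ir


aseel_total = plastome_length(LSC_BP, SSC_BP, IR_BP)
difference = KHALAS_TOTAL_BP - aseel_total

if __name__ == "__main__":
    print(f"'Aseel' total: {aseel_total} bp")
    print(f"shorter than 'Khalas' by: {difference} bp")
```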

Table 2 Genes in the chloroplast genome of Phoenix dactylifera L

Comparison with Saudi Arabian cv. ‘Khalas’

The chloroplasts in a plant cell are considered by some to be a genetically heterogeneous population (e.g., Bendich 1987; Johnson and Palmer 1989; Fitter et al. 1996; Wolfe and Randle 2004). Analyses of high-quality sequence reads may therefore reveal polymorphic sites in chloroplast genomes. The cpDNA sequence variations can be partitioned into intravarietal polymorphisms (intraSNPs), i.e., sequence variations within a variety (or cultivar and subspecies), and inter-subspecific polymorphisms, i.e., sequence variations between different varieties of a species. These types of variations can be further characterized when one of the alleles becomes unique to a certain variety or subspecies (Tang et al., 2004; Yang et al., 2010).

The present study provided an opportunity to shed light on inter-subspecific polymorphisms in two “ecotypes” of date palm, i.e., the cultivars ‘Khalas’ and ‘Aseel’ grown in Saudi Arabia and Pakistan, respectively. The date palm chloroplast genome of the Saudi Arabian cv. ‘Khalas’ was reported by Yang et al. (2010). They achieved a sequence draft with 1,081X coverage using GS FLX (‘454’) next-generation sequencing technology. They observed intravarietal single nucleotide polymorphisms (intraSNPs) in date palm cpDNA. We carried out detailed sequence comparison of cpDNA from the Saudi Arabian and Pakistani date palm cultivars to determine “inter-subspecific” variations.

The comparison of IR regions indicated no sequence variations. It is well established that the mutation rate in the IR region is lower than in the single-copy regions of chloroplast genomes (Wolfe et al. 1987; Maier et al. 1995). In this IR region, however, Yang et al. (2010) found one intravarietal SNP in ycf2, a T-G mutation at position 92,696 of Saudi Arabian date palm cv. ‘Khalas’. Yang et al. (2010) suggested that intravarietal SNPs of this type may represent intervarietal variation among date palm cultivars. However, we could not find this mutation in cv. ‘Aseel’.

The following polymorphic sites were detected in the non-coding sequences of the LSC region of the two cultivars. (1) At positions 9,218 and 9,221, the cv. ‘Aseel’ data showed G(kh) → A(as) and C(kh) → T(as) mutations, respectively, with >50X coverage (kh = Khalas and as = Aseel). (2) Close to these positions, a mono-nucleotide SSR (simple sequence repeat) of 17 A residues was detected at position 9,263–9,279 in cv. ‘Aseel’, compared with 15 A residues in cv. ‘Khalas’. (3) In cv. ‘Aseel’ cpDNA, the rbcL–accD intergenic spacer region contained consecutive mono-nucleotide SSRs of 14 C and 11 A residues, whereas cv. ‘Khalas’ cpDNA contains 13 C and 12 A residues. (4) Yang et al. (2010) noted a characteristic 4-bp insertion of ‘TAGA’ at position 61,482–61,485 in the accD-psaI intergenic region of cv. ‘Khalas’ that is absent from other monocots with known cpDNA sequences. The cv. ‘Aseel’ cpDNA sequence did not show this tetra-nucleotide insertion (at >100X coverage; Fig. 1). Hence, this site can be considered as a DNA marker for characterization of date palm cultivars.
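Mono-nucleotide SSRs such as the poly-A runs described above can be located with a simple pattern search. The following is a minimal sketch, not the procedure used in this study; the function name, the length threshold, and the toy sequences (including their flanking bases) are hypothetical and only mimic the 17 versus 15 residue difference.

```python
import re


def find_homopolymers(seq: str, min_len: int = 9):
    """Return (start, end, base, length) for each mono-nucleotide run of at
    least min_len bases, using 1-based inclusive coordinates."""
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start() + 1, m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequences with invented flanks; only the run lengths follow the text.
toy_aseel = "GCT" + "A" * 17 + "GGC"
toy_khalas = "GCT" + "A" * 15 + "GGC"

if __name__ == "__main__":
    print(find_homopolymers(toy_aseel))
    print(find_homopolymers(toy_khalas))
```

Comparing the reported run lengths at matching loci in two genomes would flag length polymorphisms of this kind.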

Fig. 1

Multiple alignment of the position 61,464–61,499 (date palm cv. ‘Khalas’ numbering) in the accD-psaI intergenic region of chloroplast DNA sequences from date palm cultivars ‘Khalas’ and ‘Aseel’, T. latifolia, D. elephantipes, and A. calamus showing four base-pair insertion in cv. ‘Khalas’

In the intergenic sequences of the SSC region of cv. ‘Aseel’, a mono-nucleotide SSR of nine T residues was detected at position 120,710 at 60X coverage, compared with ten T residues in cv. ‘Khalas’. Comparative analysis of chloroplast genomes of cultivars ‘Aseel’ and ‘Khalas’ identified a G–T SNP at position 21,747 in the coding region of the rpoC1 gene, which results in a degenerate codon. In cv. ‘Aseel’, the sequence data gave >100X coverage of ‘T’ at this locus. This SNP has been identified as an intravarietal SNP in cv. ‘Khalas’ (Yang et al. 2010). This comparison showed that the variation occurred in non-coding regions, except for one SNP in the rpoC1 gene.
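A pairwise comparison of this kind reduces to walking two aligned sequences and recording where they differ. The sketch below assumes the two genomes have already been aligned with '-' as the gap character; the function name and the short toy alignment are hypothetical and do not reproduce real date palm sequence.

```python
def list_differences(ref_aln: str, query_aln: str):
    """List differences between two aligned sequences.

    Returns tuples (ref_position, kind, ref_base, query_base), where
    ref_position is 1-based in ungapped reference coordinates. For an
    insertion relative to the reference, the position of the preceding
    reference base is reported.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        if r == "-":
            kind = "insertion"
        elif q == "-":
            kind = "deletion"
        else:
            kind = "substitution"
        diffs.append((ref_pos, kind, r, q))
    return diffs


# Toy alignment: one substitution and a 4-bp 'TAGA' present only in the reference.
toy_ref = "ACGTTAGAGC"
toy_query = "ACAT----GC"

if __name__ == "__main__":
    for diff in list_differences(toy_ref, toy_query):
        print(diff)
```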

Comparison with Further GenBank Entries

A search in GenBank revealed 27 more short chloroplast DNA sequences of date palm, many of them analyzed in the frame of phylogenetic studies and many of them as yet unpublished. Comparing these sequences to our genome sequence revealed a surprisingly high number of polymorphisms (Table 3). However, the sequences at all these positions are identical in cv. ‘Aseel’ and ‘Khalas’. Among the polymorphisms, there are quite large indels (e.g., 51 and 53 bp), some of them shared among voucher specimens (e.g., the 12 bp deletion in nos. 14, 15, and 18 in Table 3). A number of polymorphisms in the usually highly conserved 16S ribosomal RNA gene were also observed (nos. 3–8 in Table 3). A surprisingly high number of polymorphisms (given the fewer polymorphisms that distinguish ‘Khalas’ and ‘Aseel’) resulted in amino acid changes in proteins (nos. 21–28 in Table 3), including a 17 amino acid deletion in the ndhI gene.

Table 3 A comparison of GenBank date palm cpDNA entries with cv. ‘Aseel’ cp genome sequence

It would be worth investigating and confirming these polymorphisms in a wider array of date palm accessions. We cannot exclude, however, the possibility that some polymorphisms may turn out as sequencing errors, or due to poor sequence quality (no. 28 in Table 3). Our detailed comparison of the two complete genomes, sequenced with high coverage, suggests this. On the other hand, studies with nuclear DNA markers have revealed high diversity among date palm cultivars (e.g., Elshibli and Korpelainen 2008, 2009a, 2009b), and it would not be uncommon then to find high levels of polymorphisms in its chloroplast DNA as well.

Comparison with Other Monocot Species

Currently, chloroplast genome sequences from six monocot families (i.e., Dioscoreaceae, Acoraceae, Orchidaceae, Araceae, Typhaceae, and Poaceae) are available in nucleotide databases. We compared date palm cpDNA sequences with ten species from six monocot families: one species each from Dioscoreaceae (Hansen et al. 2007), Orchidaceae (Chang et al. 2006), Araceae (Mardanov et al. 2008), and Typhaceae (Guisinger et al. 2010); two species from Acoraceae (Goremykin et al. 2005); and four species from Poaceae (Maier et al. 1995; Ogihara et al. 2000; Masood et al. 2004; Wu et al. 2009). Analysis showed that the total size of the date palm cpDNA was larger than the others, with the exception of Lemna minor (Araceae) and Typha latifolia (Typhaceae) (Table 4). The ‘AT’ and ‘GC’ percentages of the date palm genome are close to those of the other monocots (Table 4). Multiple alignments of full-length cpDNA sequences from 11 monocot species followed by phylogenetic tree construction using the mVISTA server (Frazer et al. 2004; Brudno et al. 2003) revealed a grouping of date palm cpDNA sequences with T. latifolia, Dioscorea elephantipes and Phalaenopsis aphrodite (Fig. 2a). However, the most closely related sequence was T. latifolia cpDNA (Guisinger et al. 2010). Furthermore, a maximum parsimony tree based on 25 chloroplast protein-coding genes found in seven monocots (the sequence alignment that was used for phylogenetic analysis comprised 21,255 characters) showed that date palm and T. latifolia form a single clade with high bootstrap values, i.e., ≥95% (Fig. 2b).
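Base composition values like the ‘AT’ and ‘GC’ percentages compared above are straightforward to compute from a sequence. The following is a minimal sketch under the assumption that ambiguous bases (e.g., N) are excluded from the denominator; the function name and the toy inputs are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def at_percent(seq: str) -> float:
    """AT content is the complement of GC content over the same bases."""
    return 100.0 - gc_percent(seq)


if __name__ == "__main__":
    toy = "ATGCGCNN"  # toy input, not real cpDNA
    print(f"GC: {gc_percent(toy):.1f}%  AT: {at_percent(toy):.1f}%")
```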

Table 4 Comparison of the main features of date palm chloroplast genome with representative species of six monocots
Fig. 2

(a) Phylogenetic tree of chloroplast genome sequences from ten monocot species including date palm (P. dactylifera L.) based on full-length multiple alignments. (b) Maximum-parsimony-based phylogenetic tree, derived from 25 concatenated chloroplast protein-coding gene sequences from representative species of different monocot families. Numbers at nodes indicate maximum parsimony bootstrap values

The chloroplast genomes of date palm and T. latifolia have the same gene contents. Unlike P. aphrodite, the date palm cpDNA contained a full set of ndh genes. Moreover, like L. minor and P. aphrodite, the rps12 gene in date palm cpDNA was uniquely divided into a 5′-exon located in the LSC region and two 3′-exons located in duplications within IRs. The monocot family Poaceae has lost three genes (i.e., accD, ycf1, and ycf2) and several introns within the clpP and rpoC1 genes; however, these are present in date palm and T. latifolia (Guisinger et al. 2010).

Conservation of gene order, or synteny, between the date palm and T. latifolia plastomes was shown by the plot generated by GeneOrder 3.0 (Celamkoti et al. 2004) (Fig. 3a). However, due to a specific inversion within the LSC of family Poaceae, the order of some genes of Zea mays (Maier et al. 1995) was inverted in comparison to date palm (Figs. 3b and 4).
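The idea behind a gene-order dot plot can be illustrated with plain lists of gene names. This is a simplified sketch and not the GeneOrder 3.0 algorithm, which scores protein similarity; here genes are matched by name only, each name is assumed to occur once per genome, and the gene labels are invented placeholders.

```python
def dot_plot_points(order_a, order_b):
    """Pair each shared gene's index in genome A with its index in genome B."""
    index_b = {gene: i for i, gene in enumerate(order_b)}
    return [(i, index_b[g]) for i, g in enumerate(order_a) if g in index_b]


def inverted_segments(points):
    """Return runs of consecutive points that step down by one in genome B,
    i.e., the counter-diagonal stretches of the dot plot."""
    segments, current = [], []
    for prev, cur in zip(points, points[1:]):
        if cur[1] == prev[1] - 1:
            if not current:
                current = [prev]
            current.append(cur)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Placeholder gene orders: genome B carries an inversion of g3..g5.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g1", "g2", "g5", "g4", "g3", "g6"]

if __name__ == "__main__":
    pts = dot_plot_points(genome_a, genome_b)
    print(pts)
    print(inverted_segments(pts))
```

Collinear genes fall on the main diagonal, while an inverted block appears as a short counter-diagonal, as in the comparison with Z. mays.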

Fig. 3

(a) Dot plot of gene order between the P. dactylifera and T. latifolia chloroplast genomes. The straight diagonal line represents the synteny between the compared genomes. The small counter-diagonal line in the upper region arises from genes that are inverted within the inverted repeat regions. (b) Dot plot of the P. dactylifera and Z. mays chloroplast genomes, showing a small counter-diagonal in the lower region of the plot that indicates the genes inverted in the LSC region of the Z. mays chloroplast genome. The numbers “200” and “100”, marked with a red box and a blue cross, respectively, indicate the gene similarity score

Fig. 4

Diagrammatic representation of the large inversion in the LSC region of Z. mays. As a result of this inversion, the orientation of 16 genes of Z. mays is inverted in comparison to P. dactylifera and other monocot plastomes

Although chloroplast genomes are highly conserved in gene content and order among land plants, the borders between the IRs (IRA and IRB) and the two single-copy regions (LSC and SSC) are known to vary among species (Kim and Lee 2004). Considerable expansion and contraction of the IR region is mostly responsible for size variation in the chloroplast genome (Chung et al. 2006, Ravi et al. 2006). We here compare the position of IR borders in date palm and other monocot species. Due to a characteristic expansion of IRB sequences into the LSC region, a specific rearrangement was acquired by monocot chloroplast genomes early in evolution. This expansion resulted in the inclusion of the trnH and rps19 genes in the IR region. Among monocots, Acorus calamus shows similarity to dicots and contains a single copy of rps19 in the LSC region, while in D. elephantipes only a 62-bp portion of rps19 has been found in IRB, which appears to represent an intermediate stage of this evolutionary expansion (Fig. 5). The L. minor plastome differs from other monocots in that the rpl2 gene is located at the IRB/LSC border, which resulted in a pseudo copy of the rpl2 gene, i.e., Ψrpl2, in the IRA region (Mardanov et al. 2008) (Fig. 5). Our analysis has shown that, like most other monocot species, the date palm chloroplast genome has followed the same pattern, and IRB sequences have expanded into the LSC region. This expansion was also observed by Yang et al. (2010) for date palm cv. ‘Khalas’ cpDNA. This expansion resulted in two copies of the trnH and rps19 genes in the IR regions. In the case of date palm, this IRB expansion was 15 bp longer than in the closely related T. latifolia genome (Fig. 5). An extreme expansion of IRB was found for P. aphrodite, where a 31-bp portion of the rpl22 gene was also included in the IR region. Furthermore, as in other monocot plastomes, the date palm IRA extends deep into the ycf1 gene, resulting in the 1,346-bp ycf1 pseudogene in IRB.
The IRB/SSC border of the date palm chloroplast genome, located within the coding region of the ndhF gene, was not found in other monocot plastomes. Careful sequence analysis revealed 57 bp overlap between ndhF gene and ycf1 pseudogene at the IRB/SSC border in both date palm cultivars. However, Yang et al. (2010) observed a 55-bp overlapping region between these two genes. These expansions at IR/SC borders increase the length of the IR region of the date palm chloroplast genome compared to other monocot plastomes, except for L. minor.

Fig. 5

Comparison of border positions of LSC, SSC, and IR among date palm and closely related monocot species. Boxes above the main line indicate the predicted genes, while the pseudogenes at the borders are shown by Ψ (letter). The figure is not to scale and shows only relative changes at or near the IR–SC borders. AC: Acorus calamus, LM: Lemna minor, DE: Dioscorea elephantipes, TE: Typha latifolia, DP: Phoenix dactylifera, PA: Phalaenopsis aphrodite

A higher number of repeats and larger repeat sequences are associated with extensive chloroplast genome rearrangement (Haberle et al. 2008). Small forward and inverted repeats in cpDNA sequences from date palm and six other monocots, i.e., Z. mays, T. latifolia, P. aphrodite, L. minor, D. elephantipes, and A. calamus, were computed using the REPuter program (Kurtz et al. 2001). Repeats of ≥30 bases were calculated with a Hamming distance of 3 (Kurtz et al. 2001). In date palm cpDNA, 64 repeats of ≥30 bases were found, of which 28 were inverted and 36 were direct repeats (Fig. 6). The numbers of forward and inverted repeats of ≥30 bases in plastomes of the other monocot species (i.e., A. calamus, D. elephantipes, L. minor, P. aphrodite, T. latifolia, and Z. mays) were 122, 17, 22, 37, 80, and 80, respectively. In date palm cpDNA, 12 repeats were located in the ycf2 gene and one repeat was in the psaB gene (Table 5), while the rest of the repeats belonged to non-coding regions. Table 5 shows the distribution of repeats in protein-coding regions of the cpDNA of date palm and six selected monocots. The gene order and repeat analyses supported the view of a conserved arrangement of genes within the date palm chloroplast genome (Haberle et al. 2008).
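The repeat criterion used above (a minimum length and a maximum Hamming distance) can be made concrete with a brute-force scan. The sketch below is not the REPuter algorithm, which uses suffix trees and extends matches to maximal repeats; it simply compares all fixed-length windows, and it assumes that "inverted" means a match to the reverse complement. All names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_repeats(seq: str, length: int = 30, max_mismatch: int = 3):
    """Quadratic-time scan for window pairs within max_mismatch of each other.

    Returns (kind, start_i, start_j) with 0-based starts; kind is 'forward'
    or 'inverted'. Suitable for toy inputs only, and overlapping windows are
    not merged into maximal repeats.
    """
    seq = seq.upper()
    windows = [(i, seq[i:i + length]) for i in range(len(seq) - length + 1)]
    hits = []
    for n, (i, w1) in enumerate(windows):
        for j, w2 in windows[n + 1:]:
            if hamming(w1, w2) <= max_mismatch:
                hits.append(("forward", i, j))
            if hamming(w1, reverse_complement(w2)) <= max_mismatch:
                hits.append(("inverted", i, j))
    return hits


if __name__ == "__main__":
    unit = "ACGGTCAT"
    toy_forward = unit + "TTTT" + unit
    toy_inverted = unit + "CCCC" + reverse_complement(unit)
    print(("forward", 0, 12) in find_repeats(toy_forward, length=8, max_mismatch=0))
    print(("inverted", 0, 12) in find_repeats(toy_inverted, length=8, max_mismatch=0))
```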

Fig. 6

A graphical output of repeats in date palm (P. dactylifera L.) chloroplast genome. The graph gives an overview of number, length, and location of repeats. The lines indicating repeats are colored according to respective repeat length. To keep the starting position information visible, each part of a repeat is displayed on a separate strand

Table 5 Distribution of repeats of sizes ≥30 bases (Hamming distance = 3) in genes encoded by chloroplast genomes of P. dactylifera and selected monocot species

As a next step, the polymorphisms we describe between the two fully sequenced cultivars, and those between ‘Aseel’ and the GenBank entries, should be analyzed on a wide panel of date palm accessions across its natural and cultivated ranges. We expect that some of the polymorphisms will turn out to be useful for a preliminary (chloroplast-based) phylogeography of the species, and some of them may even be useful for cultivar identification (most probably in combination with nuclear DNA markers, such as those described in previous studies).