The Chloroplast Genome Sequence of Date Palm (Phoenix dactylifera L. cv. ‘Aseel’)

Khan, Asifullah; Khan, Ishtiaq A.; Heinze, Berthold; Azim, M. Kamran

doi:10.1007/s11105-011-0373-7


Published: 23 November 2011

Volume 30, pages 666–678, (2012)

Plant Molecular Biology Reporter

Abstract

Date palm (Phoenix dactylifera L.) is an economically important and widely cultivated palm of the family Arecaceae. We sequenced the complete date palm chloroplast genome (cpDNA) from Pakistani cv. ‘Aseel’, using a combination of Sanger-based and next-generation sequencing technologies. The genome, which is very similar to the recently published sequence from the Saudi Arabian cultivar ‘Khalas’, was 158,458 bp in size, with a pair of inverted repeat (IR) regions of 27,276 bp separated by a large single-copy (LSC) region of 86,195 bp and a small single-copy (SSC) region of 17,711 bp. Genome annotation demonstrated a total of 138 genes, of which 89 were protein coding, 39 were tRNA, and eight were rRNA genes. Comparison of cpDNA sequences of cultivars ‘Aseel’ and ‘Khalas’ showed the following intervarietal variations in the LSC region: (a) two SNPs in intergenic spacers and one SNP in the rpoC1 gene, (b) polymorphism in two mono-nucleotide simple sequence repeats (SSRs), and (c) a 4-bp indel in the accD-psaI intergenic spacer. The SSC region has a polymorphic site in the mono-nucleotide SSR located at position 120,710. We also compared the cv. ‘Aseel’ cpDNA sequence with partial P. dactylifera cpDNA sequence entries deposited in GenBank and identified a number of potentially useful polymorphisms in this species. Analysis of date palm cpDNA sequences revealed a close relationship with Typha latifolia. The occurrence of small numbers of forward and inverted repeats in date palm cpDNA indicated a conserved genome arrangement.

Introduction

Date palm (Phoenix dactylifera L.) is an important fruit crop of the family Arecaceae, grown mostly in the arid regions of Africa, the Middle East, and South Asia (Al-Farsi and Lee 2008). It is one of the oldest known fruit trees, cultivated for at least 5,000 years, and is reported to have originated in southern Iraq or the western Indian subcontinent (Zohary and Hopf 2000). The economic importance of date palm is due to its nutritionally valuable fruit, which consists of 72–88% sugar together with minerals (i.e., iron, potassium, calcium, chlorine, copper, magnesium, sulfur, and phosphorus), amino acids, and vitamins (Al-Shahib and Marshall 2003). Moreover, antioxidant and antimutagenic activities of date fruit have also been reported (Vayalil 2002). The date palm tree can grow well in deserts with harsh climatic and soil conditions where the growth of other crops is relatively difficult. Hence, the date palm offers highly nutritive food in such areas (Al-Farsi et al. 2005). In 2006, world date production was about seven million tonnes (www.faostat.fao.org).

Recently, a number of studies have addressed issues of genetic diversity among fruit-bearing plants including date palm (Zhang et al., 2011; He et al., 2011; Tanya et al., 2011; Xie et al., 2011). Younis et al. (2008) identified sex-specific DNA markers for date palm using RAPD and ISSR techniques. Similarly, the RAPD-PCR approach has been proposed for date palm cultivar identification (Sedra et al. 1998; Al-Khalifah and Askari 2003; Abdulla and Gamal 2010). DNA polymorphism studies of selected cultivars revealed high genetic diversity in date palm (Elshibli and Korpelainen 2008; 2009a; 2009b). Marqués et al. (2008) identified a set of RNAs transcribed from the chloroplast genome that are reported to be involved in brittle leaf disease of date palm. The chloroplast is an essential organelle of photosynthetic cells. In angiosperms, cpDNA is a highly conserved, double-stranded, circular molecule with sizes ranging from 120 to 220 kb (Gao et al. 2010; Khan et al. 2010). Typical chloroplast DNA consists of large and small single-copy regions (denoted as LSC and SSC, respectively), which are separated by two inverted repeat regions (denoted as IRA and IRB; Ravi et al. 2008). The availability of complete plastid genome sequences from different clades of autotrophs has greatly clarified the organization and evolution of this cellular organelle. Moreover, comparative chloroplast genomics can provide new knowledge regarding the phylogenetics of green plants. Here we report the chloroplast genome sequence of the date palm cv. ‘Aseel’ grown in Pakistan, obtained using Sanger-based and next-generation sequencing technologies. Initially, in June 2009, we submitted the sequence of the inverted repeat region of cv. ‘Aseel’ cpDNA to GenBank (accession number FJ212316). The complete sequence was submitted to GenBank in April 2010.
While this paper was in preparation, the date palm chloroplast genome sequence from another cultivar, cv. ‘Khalas’ grown in the Saudi Arabian peninsula, was published (Yang et al. 2010). Hence, the detailed comparison of the cpDNA sequences from both cultivars of date palm, as well as a comparison with shorter date palm chloroplast DNA sequences available in GenBank, is presented here. A comparison of date palm cpDNA with available monocot species has also been carried out.

Materials and Methods

Plant Material

Fresh leaves of a young date palm cv. ‘Aseel’, cultivated in the botanical garden of the University of Karachi, Karachi, Pakistan, were collected from a single tree. Voucher specimens are kept at the Herbarium, Department of Botany, University of Karachi under voucher specimen number: 02 and General Herbarium number: 75539.

DNA Isolation and Sequencing

A combination of Sanger-based and next-generation sequencing strategies was used for DNA sequencing. The date palm cv. ‘Aseel’ leaves (3.0 g) were processed for isolation of total DNA (20 μg) using a modified CTAB method (Porebski et al. 1997) and the commercially available Biospin plant genomic DNA extraction kit (Bioer Technology, Hangzhou, PR China). Initially, a primer walking strategy termed “ASAP: Amplification, sequencing & annotation of plastomes” (Dhingra and Folta 2005) was used for amplification and Sanger-based sequencing of the inverted repeat region of cpDNA. Briefly, purified date palm DNA (20 μg) was used to generate 6.0-kb amplicons with a consensus set of primers (Dhingra and Folta 2005). The 6.0-kb amplicons were then used to generate 1.0-kb fragments using internal sets of primers corresponding to the 6.0-kb amplicons. Subsequently, gap-filling primers were designed to fill the gaps within the inverted repeat region (Table 1). Sanger-based sequencing of these fragments was carried out using a CEQ8000 Genetic Analyzer (Beckman Coulter Inc., USA) and an ABI3130 Analyzer (Applied Biosystems, USA). For cycle sequencing reactions, the DTCS kit (Beckman Coulter Inc., USA) and the Big Dye Terminator kit (Applied Biosystems Inc., USA) were used, with conditions as recommended by the suppliers.

Table 1 Primers used for gap filling while sequencing inverted repeat (IR) regions of cpDNA date palm cv. Aseel

Complete sequencing of the date palm cpDNA was carried out by next-generation sequencing technology. For this purpose, a chloroplast-rich fraction was prepared from 10 g of date palm cv. ‘Aseel’ leaves followed by DNA purification (Triboush et al. 1998). The date palm DNA (7.0 μg) was then used for construction of paired-end libraries with insert size of 250 bp according to the protocol provided by the supplier (Illumina Inc. San Diego, USA). The massively parallel sequencing was carried out by the “sequencing by synthesis” approach using the HiSeq2000 system (Illumina Inc., San Diego, USA) in BGI, Shenzhen, China.

Genome Assembling, Annotation, and Analysis

The contiguous sequences obtained from Sanger-based sequencing were assembled using the Lasergene package version 7.1 (DNASTAR Inc., Madison, WI, USA). The sequencing data from the HiSeq2000 system was assembled using CLC Genomics Workbench version 3.5.1 (CLC bio, Denmark). The assembled sequences were combined using CLC Genomics Workbench (CLC bio, Denmark). Genome annotation was performed through the DOGMA server (Dual Organellar Genome Annotator; Wyman et al. 2004), ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/), and BLAST (Altschul et al. 1990). In addition, annotation of some tRNAs was performed using tRNAscan-SE (Lowe and Eddy 1997) and after similarity searches with other annotated plastomes. The beginnings and ends of genes were manually adjusted. Repeat analysis was performed using the REPuter program (Kurtz et al. 2001). A circular genome map of date palm cpDNA was constructed using the GenomeVx online tool (Conant and Wolfe 2008). The GeneOrder server was used for gene-order analysis (Celamkoti et al. 2004). Construction of multiple alignments and phylogenetic trees of complete cpDNA sequences was carried out by the mVISTA comparative genomics tool (Frazer et al. 2004). The maximum parsimony (MP)-based phylogenetic analysis of 25 protein-coding genes, i.e., matK, petA, petB, petD, petG, petN, psaB, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbN, psbT, rpoB, rpoC1, rpoC2, rps8, rps11, rps14, and ycf3 was done by MEGA4 (Tamura et al. 2007).

Results and Discussion

Genome Assembling and Organization

We carried out complete chloroplast genome sequencing of date palm (P. dactylifera L.) cv. ‘Aseel’ grown in Pakistan using Sanger-based and next-generation sequencing methods. Initially, 22,918 bp of the inverted repeat (IR) region were sequenced using the ASAP protocol (see Methods; Dhingra and Folta 2005; GenBank accession number FJ212316). The primers reported by Dhingra and Folta (2005) resulted in 84.0% coverage of the IR region. Subsequently, ten primers were designed (Table 1) to fill the gaps within the IR region, resulting in up to 96.5% sequence coverage of this region, i.e., 26,316 bp. The HiSeq2000 system (Illumina Inc. San Diego, USA) gave 2,197,575 high-quality paired-end reads with an average length of 73.5 bp. The Illumina reads were filtered (10% as default) so that no ambiguities remained. From these data, 267,669 reads (12.18% of all reads) were assembled into the complete chloroplast genome of date palm cv. ‘Aseel’ with an average of 124X coverage, using the cpDNA of the Saudi Arabian date palm cv. ‘Khalas’ as reference (Yang et al. 2010). The unassembled reads (87.81%) were mostly from the nuclear genome, owing to nuclear DNA contamination during chloroplast DNA isolation. ‘Chimeric’ reads consisting of parts of nuclear and chloroplast DNA may be expected, because nuclear copies of chloroplast DNA are present in all plants sequenced to date (e.g., Matsuo et al. 2005; Tuskan et al. 2006), and likewise for mitochondrial DNA (e.g., Hirai and Nakazon 1993; Tuskan et al. 2006). However, given the short read length (72 bp) of the Illumina data, it would be difficult to identify such reads without ambiguity, as no complete nuclear or mitochondrial genomes of date palm are available to date (i.e., up to March 2011). The read-matching cut-off was set to 90–95% during reference-based assembly with the CLC Genomics Workbench assembler. Yang et al. (2010) described difficulties in assembling ‘454’ (also called GS FLX technology, Roche Applied Science, Germany) next-generation sequencing data across mono-nucleotide stretches. The ‘454’ technology has been reported to be error-prone when sequencing mono-nucleotide repeat stretches. In contrast, sequencing of cv. ‘Aseel’ was done with the Illumina technology, which has the advantage of sequencing such homopolymer stretches precisely (Mardis 2008).
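
The read statistics reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the common approximation that mean depth equals the number of assembled reads times the mean read length, divided by the genome length. The variable and function names are illustrative and are not taken from the original analysis.

```python
# Figures quoted in the text; names are illustrative assumptions.
TOTAL_READS = 2_197_575    # high-quality paired-end reads
MAPPED_READS = 267_669     # reads assembled into the chloroplast genome
MEAN_READ_LEN = 73.5       # average read length in bp
GENOME_LEN = 158_458       # cv. 'Aseel' cpDNA length in bp
IR_LEN = 27_276            # length of one inverted repeat in bp


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def mean_depth(n_reads, read_len, genome_len):
    """Approximate mean sequencing depth (assumed formula, see lead-in)."""
    return n_reads * read_len / genome_len


mapped_pct = percent(MAPPED_READS, TOTAL_READS)
depth = mean_depth(MAPPED_READS, MEAN_READ_LEN, GENOME_LEN)
ir_asap_pct = percent(22_918, IR_LEN)     # IR bases from the ASAP primers
ir_filled_pct = percent(26_316, IR_LEN)   # IR bases after gap filling

print(f"assembled reads: {mapped_pct:.2f}%")
print(f"mean depth: {depth:.0f}X")
print(f"IR coverage: {ir_asap_pct:.1f}% then {ir_filled_pct:.1f}%")
```

Under this approximation the quoted values (12.18% of reads, about 124X depth, and 84.0% and 96.5% IR coverage) are mutually consistent.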

The chloroplast genome sequence of date palm cv. ‘Aseel’ had a total length of 158,458 bp, with two IRs of 27,276 bp separated by a large single-copy region of 86,195 bp and a small single-copy region of 17,711 bp. The genome was 4 bp shorter than the chloroplast genome of the Saudi Arabian cv. ‘Khalas’ (total size 158,462 bp; GenBank accession GU811709). The genome contained 59% coding and 41% non-coding regions, the latter including pseudogenes, introns, and intergenic spacers. A total of 138 genes were present, including the pseudogenes Ψycf15, Ψycf68, and Ψycf1 (a short pseudo copy of the ycf1 gene; Table 2). Out of 89 protein-coding open-reading frames, 16 genes contained introns. Among these, the clpP, rps12, and ycf3 genes contained two introns each. Date palm chloroplast DNA contained 38 genes for tRNAs (30 distinct genes), and of these, eight tRNA genes contained introns. Four rRNA genes were confined to and duplicated in the IR regions. As a whole, 20 complete genes (including the ycf15 pseudogene) and one 3′-exon of the rps12 trans-splicing protein were duplicated in the IR regions. Of these 20 genes, eight were tRNA, four were rRNA, and seven were protein-coding genes.
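
The quadripartite structure described above implies that the total length equals LSC + SSC + 2 × IR. A minimal sketch of this check follows; the function name is hypothetical and the numbers are those quoted in the text.

```python
# Region lengths (bp) as reported for cv. 'Aseel'.
LSC, SSC, IR = 86_195, 17_711, 27_276
TOTAL_KHALAS = 158_462  # reported total for cv. 'Khalas'


def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total_aseel = quadripartite_length(LSC, SSC, IR)
difference = TOTAL_KHALAS - total_aseel

print(total_aseel)   # 158458
print(difference)    # 4
```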

Table 2 Genes in the chloroplast genome of Phoenix dactylifera L

Comparison with Saudi Arabian cv. ‘Khalas’

The chloroplasts in a plant cell are considered by some to be a genetically heterogeneous population (e.g., Bendich 1987; Johnson and Palmer 1989; Fitter et al. 1996; Wolfe and Randle 2004). Analyses of high-quality sequence reads may therefore reveal polymorphic sites in chloroplast genomes. The cpDNA sequence variations can be partitioned into intravarietal polymorphisms (intraSNPs), i.e., sequence variations within a variety (or cultivar and subspecies), and inter-subspecific polymorphisms, i.e., sequence variations between different varieties of a species. These types of variations can be further characterized when one of the alleles becomes unique to a certain variety or subspecies (Tang et al., 2004; Yang et al., 2010).

The present study provided an opportunity to shed light on inter-subspecific polymorphisms in two “ecotypes” of date palm, i.e., the cultivars ‘Khalas’ and ‘Aseel’ grown in Saudi Arabia and Pakistan, respectively. The date palm chloroplast genome of the Saudi Arabian cv. ‘Khalas’ was reported by Yang et al. (2010). They achieved a sequence draft with 1,081X coverage using GS FLX (‘454’) next-generation sequencing technology. They observed intravarietal single nucleotide polymorphisms (intraSNPs) in date palm cpDNA. We carried out detailed sequence comparison of cpDNA from the Saudi Arabian and Pakistani date palm cultivars to determine “inter-subspecific” variations.

The comparison of IR regions indicated no sequence variations. It is well established that the mutation rate in the IR region is lower than in the single-copy regions of chloroplast genomes (Wolfe et al. 1987; Maier et al. 1995). In the IR region, however, Yang et al. (2010) found one intravarietal SNP in ycf2, a T-G mutation at position 92,696 of Saudi Arabian date palm cv. ‘Khalas’. It has been suggested that intravarietal SNPs of this type be considered as intervarietal variation among date palm cultivars (Yang et al. 2010). However, we could not find this mutation in cv. ‘Aseel’.

The following polymorphic sites were detected in the non-coding sequences of the LSC region of the two cultivars. (1) At positions 9,218 and 9,221, the cv. ‘Aseel’ data showed G(kh) → A(as) and C(kh) → T(as) mutations, respectively, with >50X coverage (kh = Khalas and as = Aseel). (2) Close to these positions, a mono-nucleotide SSR (simple sequence repeat) with 17 poly-A repeat units was detected at position 9,263–9,279 in cv. ‘Aseel’, compared to 15 poly-A repeat units in cv. ‘Khalas’. (3) In cv. ‘Aseel’ cpDNA, the rbcL–accD intergenic spacer region contained consecutive mono-nucleotide SSRs of 14 poly-C and 11 poly-A repeat units, whereas cv. ‘Khalas’ cpDNA contains 13 poly-C and 12 poly-A repeat units. (4) Yang et al. (2010) noted a characteristic 4-bp insertion of ‘TAGA’ at position 61,482–61,485 in the accD-psaI intergenic region as a genotype in cv. ‘Khalas’ compared to other monocots with known cpDNA sequences. However, the cv. ‘Aseel’ cpDNA sequence did not show this tetra-nucleotide insertion (at >100X coverage; Fig. 1). Hence, this site can be considered as a DNA marker for characterization of date palm cultivars.
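
Length differences in mono-nucleotide SSRs such as those listed above can be located by scanning a sequence for homopolymer runs. The sketch below is illustrative only: the function name, the minimum run length of 8, and the two toy sequences are assumptions made for this example, and the toy sequences are not real date palm cpDNA.

```python
import re


def homopolymer_runs(seq, min_len=8):
    """Return (start, end, base, length) for each mono-nucleotide run.

    Coordinates are 1-based and inclusive. min_len is an arbitrary
    threshold chosen for this sketch.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start() + 1, m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequences mimicking the consecutive poly-C / poly-A pattern
# described for the rbcL-accD spacer (flanking bases are made up).
aseel_like = "GATTG" + "C" * 14 + "A" * 11 + "GGTAC"
khalas_like = "GATTG" + "C" * 13 + "A" * 12 + "GGTAC"

for name, seq in (("Aseel-like", aseel_like), ("Khalas-like", khalas_like)):
    print(name, homopolymer_runs(seq))
```

Comparing the run lengths at matching positions in two aligned genomes would then flag candidate SSR polymorphisms, which still need adequate read coverage to be trusted.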

In the intergenic sequences of the SSC region of cv. ‘Aseel’, a mono-nucleotide SSR with nine poly-T repeat units was detected at position 120,710 at 60X coverage, compared to ten poly-T repeat units in cv. ‘Khalas’. Comparative analysis of the chloroplast genomes of cultivars ‘Aseel’ and ‘Khalas’ identified a G–T SNP at position 21,747 in the coding region of the rpoC1 gene, which results in a degenerate codon. In cv. ‘Aseel’, the sequence data gave >100X coverage of ‘T’ at this locus. This SNP has been identified as an intravarietal SNP in cv. ‘Khalas’ (Yang et al. 2010). This comparison showed that the variation occurred in non-coding regions, except for one SNP in the rpoC1 gene.

Comparison with Further GenBank Entries

A search in GenBank revealed 27 more short chloroplast DNA sequences of date palm, many of them analyzed in the framework of phylogenetic studies and many as yet unpublished. Comparing these sequences to our genome sequence revealed a surprisingly high number of polymorphisms (Table 3). However, the sequences at all these positions are identical in cv. ‘Aseel’ and ‘Khalas’. Among the polymorphisms, there are quite large indels (e.g., 51 and 53 bp), some of them shared among voucher specimens (e.g., the 12-bp deletion in nos. 14, 15, and 18 in Table 3). A number of polymorphisms in the usually highly conserved 16S ribosomal RNA gene were also observed (nos. 3–8 in Table 3). A surprisingly high number of polymorphisms (given the fewer polymorphisms that distinguish ‘Khalas’ and ‘Aseel’) resulted in amino acid changes in proteins (nos. 21–28 in Table 3), including a 17-amino-acid deletion in the ndhI gene.

Table 3 A comparison of GenBank date palm cpDNA entries with cv. ‘Aseel’ cp genome sequence

It would be worth investigating and confirming these polymorphisms in a wider array of date palm accessions. We cannot exclude, however, the possibility that some polymorphisms may turn out as sequencing errors, or due to poor sequence quality (no. 28 in Table 3). Our detailed comparison of the two complete genomes, sequenced with high coverage, suggests this. On the other hand, studies with nuclear DNA markers have revealed high diversity among date palm cultivars (e.g., Elshibli and Korpelainen 2008, 2009a, 2009b), and it would not be uncommon then to find high levels of polymorphisms in its chloroplast DNA as well.

Comparison with Other Monocot Species

Currently, chloroplast genome sequences from six monocot families (i.e., Dioscoreaceae, Acoraceae, Orchidaceae, Araceae, Typhaceae, and Poaceae) are available in nucleotide databases. We compared date palm cpDNA sequences with ten species from these six families: one species each from Dioscoreaceae (Hansen et al. 2007), Orchidaceae (Chang et al. 2006), Araceae (Mardanov et al. 2008), and Typhaceae (Guisinger et al. 2010); two species from Acoraceae (Goremykin et al. 2005); and four species from Poaceae (Maier et al. 1995; Ogihara et al. 2000; Masood et al. 2004; Wu et al. 2009). Analysis showed that the total size of the date palm cpDNA was larger than that of the others, with the exception of Lemna minor (Araceae) and Typha latifolia (Typhaceae) (Table 4). The ‘AT’ and ‘GC’ percentages of the date palm genome are close to those of the other monocots (Table 4). Multiple alignment of full-length cpDNA sequences from 11 monocot species, followed by phylogenetic tree construction using the mVISTA server (Frazer et al. 2004; Brudno et al. 2003), revealed a grouping of date palm cpDNA sequences with T. latifolia, Dioscorea elephantipes, and Phalaenopsis aphrodite (Fig. 2a). However, the most closely related sequence was T. latifolia cpDNA (Guisinger et al. 2010). Furthermore, a maximum parsimony tree based on 25 chloroplast protein-coding genes found in seven monocots (the sequence alignment used for phylogenetic analysis comprised 21,255 characters) showed that date palm and T. latifolia form a single clade with high bootstrap values, i.e., ≥95% (Fig. 2b).
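
The ‘AT’ and ‘GC’ percentages compared in Table 4 are simple base-composition statistics. A minimal sketch of such a calculation is given below; the function name is hypothetical, and ignoring ambiguous bases (e.g., N) is an assumption of this example rather than a documented choice of the study.

```python
def gc_percent(seq):
    """Return the GC percentage of seq, counting only A, C, G and T."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def at_percent(seq):
    """Return the AT percentage as the complement of the GC percentage."""
    return 100.0 - gc_percent(seq)


toy = "ATGCATGCNN"  # made-up sequence, not real cpDNA
print(gc_percent(toy), at_percent(toy))  # 50.0 50.0
```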

Table 4 Comparison of the main features of date palm chloroplast genome with representative species of six monocots

The chloroplast genomes of date palm and T. latifolia have the same gene contents. Unlike P. aphrodite, the date palm cpDNA contained a full set of ndh genes. Moreover, like L. minor and P. aphrodite, the rps12 gene in date palm cpDNA was uniquely divided into a 5′-exon located in the LSC region and two 3′-exons located in duplications within IRs. The monocot family Poaceae has lost three genes (i.e., accD, ycf1, and ycf2) and several introns within the clpP and rpoC1 genes; however, these are present in date palm and T. latifolia (Guisinger et al. 2010).

Conservation of gene order (synteny) between the date palm and T. latifolia plastomes was shown by the plot generated by GeneOrder 3.0 (Celamkoti et al. 2004) (Fig. 3a). However, due to a specific inversion within the LSC of the family Poaceae, the order of some genes of Zea mays (Maier et al. 1995) was inverted in comparison to date palm (Figs. 3b and 4).

Although the chloroplast genomes of land plants are highly conserved in gene content and order, the borders between the IRs (IRA and IRB) and the two single-copy regions (LSC and SSC) are known to vary among species (Kim and Lee 2004). Considerable expansion and contraction of the IR region is mostly responsible for size variation in the chloroplast genome (Chung et al. 2006; Ravi et al. 2006). Here we compare the positions of the IR borders in date palm and other monocot species. Owing to a characteristic expansion of IRB sequences into the LSC region, a specific rearrangement was acquired by monocot chloroplast genomes early in evolution. This expansion resulted in the inclusion of the trnH and rps19 genes in the IR region. Among monocots, Acorus calamus resembles dicots in containing a single copy of rps19 in the LSC region, while in D. elephantipes only a 62-bp portion of rps19 has been found in IRB, which appears to represent an intermediate stage of this expansion (Fig. 5). The L. minor plastome differs from other monocots in that the rpl2 gene is located at the IRB/LSC border, which resulted in a pseudo copy of the rpl2 gene, i.e., Ψrpl2, in the IRA region (Mardanov et al. 2008) (Fig. 5). Our analysis has shown that the date palm chloroplast genome follows the same pattern as most other monocot species: IRB sequences have expanded into the LSC region. This expansion was also observed by Yang et al. (2010) for date palm cv. ‘Khalas’ cpDNA, and it resulted in two copies of the trnH and rps19 genes in the IR regions. In date palm, this IRB expansion was 15 bp longer than in the closely related T. latifolia genome (Fig. 5). An extreme expansion of IRB was found in P. aphrodite, where 31 bp of the rpl22 gene are also included in the IR region. Furthermore, as in other monocot plastomes, the date palm IRA extends deep into the ycf1 gene, resulting in the 1,346-bp ycf1 pseudogene in IRB.
The IRB/SSC border of the date palm chloroplast genome, located within the coding region of the ndhF gene, was not found in other monocot plastomes. Careful sequence analysis revealed 57 bp overlap between ndhF gene and ycf1 pseudogene at the IRB/SSC border in both date palm cultivars. However, Yang et al. (2010) observed a 55-bp overlapping region between these two genes. These expansions at IR/SC borders increase the length of the IR region of the date palm chloroplast genome compared to other monocot plastomes, except for L. minor.

A higher number of repeats and larger repeat sequences are associated with extensive chloroplast genome rearrangement (Haberle et al. 2008). Small forward and inverted repeats in cpDNA sequences from date palm and six other monocots, i.e., Z. mays, T. latifolia, P. aphrodite, L. minor, D. elephantipes, and A. calamus, were computed using the REPuter program (Kurtz et al. 2001). Repeats of ≥30 bases were calculated with a Hamming distance of 3.0 (Kurtz et al. 2001). In date palm cpDNA, 64 repeats of ≥30 bases were found, of which 28 were inverted and 36 were direct repeats (Fig. 6). The numbers of forward and inverted repeats of ≥30 bases in the plastomes of the other monocot species (i.e., A. calamus, D. elephantipes, L. minor, P. aphrodite, T. latifolia, and Z. mays) were 122, 17, 22, 37, 80, and 80, respectively. In date palm cpDNA, 12 repeats were located in the ycf2 gene and one repeat was in the psaB gene (Table 5), while the rest of the repeats belonged to non-coding regions. Table 5 gives the distribution of repeats in protein-coding regions of the cpDNA of date palm and six selected monocots. The gene order and repeat analyses supported the view of a conserved arrangement of genes within the date palm chloroplast genome (Haberle et al. 2008).
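
To illustrate what a forward or inverted repeat within a given Hamming distance means, the following naive sketch compares every pair of fixed-length windows in a sequence. It is not the REPuter algorithm, which uses a far more efficient index-based search; the function names and the small toy sequences are assumptions made for this example, and the quadratic running time makes it suitable for short sequences only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_repeats(seq, k=30, max_dist=3):
    """Naive all-pairs search for repeated windows of length k.

    Returns tuples (i, j, kind, dist) with 0-based window starts i < j,
    kind 'forward' or 'inverted', and dist <= max_dist mismatches.
    Overlapping windows are not filtered out in this sketch.
    """
    seq = seq.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    hits = []
    for i, window in enumerate(windows):
        rc = revcomp(window)
        for j in range(i + 1, len(windows)):
            d_fwd = hamming(window, windows[j])
            if d_fwd <= max_dist:
                hits.append((i, j, "forward", d_fwd))
            d_inv = hamming(rc, windows[j])
            if d_inv <= max_dist:
                hits.append((i, j, "inverted", d_inv))
    return hits


# Toy inputs (made up): a forward repeat with one mismatch and an
# exact inverted repeat, each separated by a 4-base spacer.
forward_toy = "AACCGGTA" + "TTTT" + "AACCGGTC"
inverted_toy = "AACCGGTA" + "GGGG" + "TACCGGTT"
forward_hits = find_repeats(forward_toy, k=8, max_dist=1)
inverted_hits = find_repeats(inverted_toy, k=8, max_dist=1)
```

With the parameters used in the text, the call would be find_repeats(sequence, k=30, max_dist=3), although for a genome of about 158 kb a dedicated tool remains the practical choice.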

Table 5 Distribution of repeats of sizes ≥30 bases (Hamming distance = 3.0) in genes encoded by chloroplast genomes of P. dactylifera and selected monocot species

As a next step, the polymorphisms we describe between the two fully sequenced cultivars, and those between ‘Aseel’ and the GenBank entries, should be analyzed on a wide panel of date palm accessions across its natural and cultivated ranges. We expect that some of the polymorphisms will turn out to be useful for a preliminary (chloroplast-based) phylogeography of the species, and some of them may even be useful for cultivar identification (most probably in combination with nuclear DNA markers, such as those described in previous studies).

References

Abdulla M, Gamal O (2010) Investigation on molecular phylogeny of some date palm (Phoenix dactylifra L.) cultivars by protein, RAPD and ISSR markers in Saudi Arabia. Aust J Crop Sci 4:23–28
Al-Farsi M, Alasalvar C, Morris A, Baron M, Shahidi F (2005) Compositional and sensory characteristics of three native sun-dried date (Phoenix dactylifera L.) varieties grown in Oman. J Agric Food Chem 53:7586–7591
Al-Farsi MA, Lee CY (2008) Nutritional and functional properties of dates: a review. Crit Rev Food Sci Nutr 48:877–887
Al-Khalifah NS, Askari E (2003) Molecular phylogeny of date palm (Phoenix dactylifera L.) cultivars from Saudi Arabia by DNA fingerprinting. Theor Appl Genet 107:1266–1270
Al-Shahib W, Marshall RJ (2003) The fruit of the date palm: its possible use as the best food for the future? Int J Food Sci Nutr 54:247–259
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410
Bendich AJ (1987) Why do chloroplasts and mitochondria contain so many copies of their genome? BioEssays 6:279–282
Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, Green ED, Sidow A, Batzoglou S (2003) LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res 13:721–731
Celamkoti S, Kundeti S, Purkayastha A, Mazumder R, Buck C, Seto D (2004) GeneOrder 3.0: software for comparing the order of genes in pairs of small bacterial genomes. BMC Bioinforma 5:52
Chang C, Lin H, Lin I et al (2006) The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol 23:279–291
Chung HJ, Jung JD, Park HW, Kim JH, Cha HW, Min SR, Jeong WJ, Liu JR (2006) The complete chloroplast genome sequences of Solanum tuberosum and comparative analysis with Solanaceae species identified the presence of a 241-bp deletion in cultivated potato chloroplast DNA sequence. Plant Cell Rep 25:1369–1379
Conant GC, Wolfe KH (2008) GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862
Dhingra A, Folta KM (2005) ASAP: amplification, sequencing & annotation of plastomes. BMC Genomics 6:176
Elshibli S, Korpelainen H (2008) Microsatellite markers reveal high genetic diversity in date palm (Phoenix dactylifera L.) germplasm from Sudan. Genetica 134:251–260
Elshibli S, Korpelainen H (2009a) Excess heterozygosity and scarce genetic differentiation in the populations of Phoenix dactylifera L.: human impact or ecological determinants. Plant Genet Res 7:95–104
Elshibli S, Korpelainen H (2009b) Biodiversity of date palms (Phoenix dactylifera L.) in Sudan: chemical, morphological and DNA polymorphisms of selected cultivars. Plant Genet Res 7:194–203
Fitter JT, Thomas MR, Rose RJ, Steelescott N (1996) Heteroplasmy of the chloroplast genome of Medicago sativa L cv 'Regen S' confirmed by sequence analysis. Theor Appl Genet 93:685–690
Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I (2004) VISTA: computational tools for comparative genomics. Nucleic Acids Res 32:W273–W279
Gao L, Su Y-J, Wang T (2010) Plastid genome sequencing, comparative genomics, and phylogenomics: current status and prospects. J Syst Evol 48:77–93
Goremykin VV, Holland B, Hirsch-Ernst KI, Hellwig FH (2005) Analysis of Acorus calamus chloroplast genome and its phylogenetic implications. Mol Biol Evol 22:1813–1822
Guisinger MM, Chumley TW, Kuehl JV, Boore JL, Jansen RK (2010) Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J Mol Evol 70:149–166
Haberle R, Fourcade H, Boore J, Jansen R (2008) Extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes. J Mol Evol 66:350–361
Hansen DR, Dastidar SG, Cai Z et al (2007) Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol Phylogenet Evol 45:547–563
He Q, Li XW, Liang GL, Ji K, Guo QG, Yuan WM, Zhou GZ, Chen KS, Van de Weg WE, Gao ZS (2011) Genetic diversity and identity of Chinese loquat cultivars/accessions (Eriobotrya japonica) using apple SSR markers. Plant Mol Biol Rep 29:197–208
Hirai A, Nakazon M (1993) Six percent of the mitochondrial genome in rice came from chloroplast DNA. Plant Mol Biol Rep 11(2):98–100
Johnson L, Palmer J (1989) Heteroplasmy of chloroplast DNA in Medicago. Plant Mol Biol 12:3–11
Khan A, Khan IA, Asif H, Azim MK (2010) Current trends in chloroplast genomics. Afric J Biotech 9:3494–3500
Kim KJ, Lee HL (2004) Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res 11:247–261
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R (2001) REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res 29:4633–4642
Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955–964
Maier RM, Neckermann K, Igloi GL, Kossel H (1995) Complete sequence of the maize chloroplast genome: gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J Mol Biol 251:614–628
Mardanov AV, Ravin NV, Kuznetsov BB et al (2008) Complete sequence of the duckweed (Lemna minor) chloroplast genome: structural organization and phylogenetic relationships to other angiosperms. J Mol Evol 66:555–564
Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet 9:387–402
Marqués J, Fadda ZG, Duran-Vila N, Flores R, Bové JM, Daròs JA (2008) A set of novel RNAs transcribed from the chloroplast genome accumulates in date palm leaflets affected by brittle leaf disease. Phytopathol 98:337–344
Masood MS, Nishikawa T, Fukuoka S, Njenga PK, Tsudzuki T, Kadowaki K (2004) The complete nucleotide sequence of wild rice (Oryza nivara) chloroplast genome: first genome wide comparative sequence analysis of wild and cultivated rice. Gene 340:133–139
Matsuo M, Ito Y, Yamauchi R, Obokata J (2005) The rice nuclear genome continuously integrates, shuffles, and eliminates the chloroplast genome to cause chloroplast-nuclear DNA flux. Plant Cell 17:665–675
Ogihara Y, Isono K, Kojima T et al (2000) Chinese spring wheat (Triticum aestivum L.) chloroplast genome: complete sequence and contig clones. Plant Mol Biol Rep 18:243–253
Porebski S, Bailey LG, Baum BR (1997) Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol Biol Rep 15:8–15
Ravi V, Khurana J, Tyagi A, Khurana P (2006) The chloroplast genome of mulberry: complete nucleotide sequence, gene organization and comparative analysis. Tree Genet Genomes 3:49–59
Ravi V, Khurana JP, Tyagi AK, Khurana P (2008) An update on chloroplast genomes. Plant Syst Evol 271:101–122
Sedra MyH, Lashermes P, Trouslot P, Combes M-C (1998) Identification and genetic diversity analysis of date palm (Phoenix dactylifera L.) varieties from Morocco using RAPD markers. Euphytica 103:75–82
Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol 24:1596–1599
Tang J, Xia H, Cao M, Zhang X, Zeng W, Hu S, Tong W, Wang J, Wang J, Yu J, Yang H, Zhu L (2004) A comparison of rice chloroplast genomes. Plant Physiol 135:412–420
Tanya P, Taeprayoon P, Hadkam Y, Srinives P (2011) Genetic diversity among Jatropha and Jatropha-related species based on ISSR markers. Plant Mol Biol Rep 29:252–264
Triboush SO, Danilenko NG, Davydenko OG (1998) A method for isolation of chloroplast DNA and mitochondrial DNA from sunflower. Plant Mol Biol Rep 16:183–189
Tuskan GA, DiFazio S, Jansson S et al (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313:1596–1604
Vayalil PK (2002) Antioxidant and antimutagenic properties of aqueous extract of date fruit (Phoenix dactylifera L. Arecaceae). J Agric Food Chem 50:610–617
Wolfe AD, Randle CP (2004) Recombination, heteroplasmy, haplotype polymorphism, and paralogy in plastid genes: implications for plant molecular systematics. Syst Bot 29:1011–1020
Wolfe KH, Li WH, Sharp PM (1987) Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc Natl Acad Sci 84:9054–9058
Wu F, Kan D, Lee SB et al (2009) Complete nucleotide sequence of Dendrocalamus latiflorus and Bambusa oldhamii chloroplast genomes. Tree Physiol 29:847–856
Wyman SK, Jansen RK, Boore JL (2004) Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255
Xie R-J, Zhou J, Wang G-Y, Zhang S-M, Chen L, Gao Z-S (2011) Cultivar identification and genetic diversity of Chinese bayberry (Myrica rubra) accessions based on fluorescent SSR markers. Plant Mol Biol Rep 29:554–562
Yang M, Zhang X, Liu G, Yin Y, Chen K, Yun Q, Zhao D, Al-Mssallem IS, Yu J (2010) The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.). PLoS One 5:e12762
Younis RAA, Ismail OM, Soliman SS (2008) Identification of sex-specific DNA markers for date palm (Phoenix dactylifera L.) using RAPD and ISSR techniques. Res J Agric Biol Sci 4:278–284
Zhang Q, Li J, Zhao Y, Schuyler SK, Han Y (2011) Evaluation of genetic diversity in Chinese wild apple species along with apple cultivars using SSR markers. Plant Mol Biol Rep Early online. doi:10.1007/s11105-011-0366-6
Zohary D, Hopf M (2000) Domestication of plants in the old world: the origin and spread of cultivated plants in West Asia, Europe, and the Nile Valley, 3rd edn. Oxford University Press, Oxon

Author information

Authors and Affiliations

International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
Asifullah Khan, Ishtiaq A. Khan & M. Kamran Azim
Department of Genetics, Federal Research Centre for Forests, Hauptstraße 7, 1140, Vienna, Austria
Berthold Heinze

Corresponding author

Correspondence to M. Kamran Azim.

About this article

Cite this article

Khan, A., Khan, I.A., Heinze, B. et al. The Chloroplast Genome Sequence of Date Palm (Phoenix dactylifera L. cv. ‘Aseel’). Plant Mol Biol Rep 30, 666–678 (2012). https://doi.org/10.1007/s11105-011-0373-7

Published: 23 November 2011
Issue Date: June 2012
DOI: https://doi.org/10.1007/s11105-011-0373-7

Introduction