Abstract
The complete nucleotide sequence of the chloroplast genome of potato, Solanum tuberosum L. cv. Desiree, was determined. The circular double-stranded DNA, which consists of 155,312 bp, contains a pair of inverted repeat regions (IRa, IRb) of 25,595 bp each. The inverted repeat regions are separated by small and large single-copy regions (SSC and LSC) of 18,373 and 85,749 bp, respectively. The genome contains genes for 79 proteins, 30 tRNAs, and 4 rRNAs, as well as unidentified genes. A comparison of the chloroplast genomes of seven Solanaceae species revealed that the gene content and relative gene positions of S. tuberosum are similar to those of the other six species. However, undefined open reading frames (ORFs) in the LSC region were highly diverged among the Solanaceae species, except in N. sylvestris. Detailed comparison identified numerous indels in the intergenic regions, most of which were located in the LSC region. Among them, a single large 241-bp deletion, which was not associated with direct repeats and was found only in S. tuberosum, clearly discriminates the cultivated potato from the wild potato species Solanum bulbocastanum. The extent of sequence divergence may provide a basis for evaluating genetic diversity within the Solanaceae and will be useful for examining the evolutionary processes in potato landraces.
Introduction
Chloroplasts are intracellular organelles with their own genome, which encodes a number of genes for chloroplast components and photosynthesis. Chloroplast genomes vary in size from 35 to 217 kb, but in land plants most are between 115 and 165 kb and show high similarity in structure and gene organization (Jansen et al. 2005). The chloroplast genome typically contains two copies of an inverted repeat (IR) ranging from 5 to 76 kb in length. These copies are separated by a large single-copy (LSC) region and a small single-copy (SSC) region (Sugiura et al. 1998).
Although the overall structure of the chloroplast genome is relatively uniform and conserved in land plants, a number of mutations have been observed. These include structural changes such as inversions (Kim et al. 2005; Kim and Lee 2005; Sugiura et al. 2003), rearrangements of gene order (Cosner et al. 2004; Saski et al. 2005), and insertions/deletions (indels) (Calsa Junior et al. 2004; Kato et al. 2000; Ogihara et al. 2002; Shahid Masood et al. 2004), as well as base substitutions (Schmitz-Linneweber et al. 2002). A comparison of the complete chloroplast sequences of closely related members of the grass family (maize, wheat, rice, and sugarcane) reveals several hotspots for length mutations (Asano et al. 2004; Calsa Junior et al. 2004; Guo and Terachi 2005; Maier et al. 1995; Ogihara et al. 2002; Ogihara et al. 1991). One divergent hotspot is the tRNA gene cluster trnS(UGA), trnG(GCC), trnfM(CAU), trnG(UCC), trnT(GGU), trnE(UUC), trnY(GUA), trnD(GUC), trnC(GCA) in the LSC region. The intergenic regions of this tRNA gene cluster commonly contain a large number of indels ranging from 1 to 811 bp in length. The largest deletions, over 500 bp in size, are found upstream of trnC(GCA) and trnT(GGU) and downstream of trnD(GUC) in the rice chloroplast (Calsa Junior et al. 2004; Maier et al. 1995; Ogihara et al. 2002). A second hotspot of divergence is the region between rbcL and cemA. In dicots, the accD, psaI, and ycf4 genes are present within this region. However, in three grass species (rice, wheat, and maize), the accD gene has been deleted and rpl23 has undergone a non-reciprocal translocation (Maier et al. 1995). Another hotspot of divergence characterized in grass species lies in the ycf2 gene. In dicots, the ycf2 gene is conserved and located along with ycf15, ORF115, and ORF92 between trnI and trnL. In maize, rice, wheat, and sugarcane, however, one or two deletions within the ycf2 gene generate several open reading frames (ORFs) (Calsa Junior et al. 2004; Maier et al. 1995; Ogihara et al. 2002).
At present, complete chloroplast DNA sequences have been determined for 48 species, ranging from the single-celled organism Euglena longa to the land plant Oryza sativa (NCBI: Organelle Genomes: http://www.ncbi.nlm.nih.gov:80/genomes/static.euk_o.html). Among land plants, the first complete chloroplast genome sequence was obtained from tobacco (Nicotiana tabacum), which belongs to the Solanaceae family (Shinozaki et al. 1986). The Solanaceae family comprises more than 3000 species, including potato (S. tuberosum), tomato (Solanum lycopersicum), eggplant (Solanum melongena), pepper (Capsicum annuum), and petunia (Petunia hybrida). Potato is an herbaceous perennial cultivated for its tubers, the commercially significant part of the plant. Over the years, potato has become an important crop for both farmers and consumers, and it is the fourth most important crop in the world in terms of production, after rice, wheat, and maize. Reflecting this importance, there is a growing body of research on the genetic engineering of potato, including the production of GM potatoes with traits such as insect resistance, virus resistance, and altered nutritional quality (for example, starch or protein content).
To date, the entire chloroplast DNA sequences of three further species from the Solanaceae family, Nicotiana sylvestris (AB237912), Nicotiana tomentosiformis (AB240139), and Atropa belladonna (AJ316582), have been determined in addition to that of N. tabacum (Z00044). Very recently, during the course of this study, the chloroplast DNA sequences of two more Solanaceae species, S. lycopersicum (DQ347959) and S. bulbocastanum (a wild potato species, DQ347958), also became available. Among the Solanaceae species, comparative analyses of chloroplast DNA sequences have been performed between N. tabacum and A. belladonna (Schmitz-Linneweber et al. 2002), between S. lycopersicum and S. bulbocastanum (Daniell et al. 2006), and among Nicotiana species (Yukawa et al. 2006). In this paper, we present the complete chloroplast DNA sequence of S. tuberosum L. cv. Desiree and compare it with the chloroplast genome sequences of the other six Solanaceae species: N. tabacum, N. sylvestris, N. tomentosiformis, S. lycopersicum, S. bulbocastanum, and A. belladonna. We focus in particular on the ORFs and the length mutations. Our study provides a rich source of nucleotide and amino acid sequence data that can be used to address phylogenetic and molecular evolutionary questions and to support the engineering and breeding of potatoes.
Materials and methods
Isolation of chloroplast DNA
Fresh leaves were harvested from 2- to 3-week-old S. tuberosum L. cv. Desiree plants. Chloroplasts were isolated by the sucrose-gradient method described by Oharamays and Capwell (1993). Chloroplast DNA (cpDNA) was isolated from the purified chloroplasts by lysis and ultracentrifugation. Template DNA was prepared by polymerase chain reaction (PCR) from 100 ng of genomic DNA. Amplification was performed using ExTaq (Takara Bio Inc., http://www.takara-bio.com) under the following conditions: 5 min of denaturation at 95°C, followed by 35 cycles of denaturation at 95°C for 50 s, annealing at 55–60°C for 45 s, and extension at 72°C for 5 min. PCR products were about 4000–5000 bp in size, and each product overlapped its adjacent fragments by 500–800 bp. Primers were designed according to the chloroplast sequence of N. tabacum.
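The overlapping-amplicon layout described above can be illustrated with a short sketch. The Python function below is hypothetical and is not part of the original protocol. It assumes a fixed amplicon length of 4500 bp and a fixed overlap of 650 bp (midpoints of the reported ranges) and lists the windows needed to tile a circular genome of 155,312 bp.

```python
import math


def tile_circular_genome(genome_len, amplicon_len=4500, overlap=650):
    """Return (start, end) windows that tile a circular genome.

    Coordinates are 0-based and half-open. An end larger than genome_len
    means the amplicon wraps around the origin of the circular map.
    amplicon_len and overlap are illustrative assumptions, not values
    taken from the primer list of the study.
    """
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be smaller than the amplicon length")
    step = amplicon_len - overlap  # new sequence contributed by each amplicon
    n_amplicons = math.ceil(genome_len / step)
    return [(i * step, i * step + amplicon_len) for i in range(n_amplicons)]


if __name__ == "__main__":
    windows = tile_circular_genome(155_312)
    print(len(windows), "amplicons")
    print("first:", windows[0], "last:", windows[-1])
```

Under these assumed values, consecutive windows share exactly 650 bp, and the last window runs past the end of the linear coordinate system so that it also overlaps the first one.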
Shotgun library construction
Twenty micrograms of five or six combined PCR products were sheared to approximately 1 kb using a HydroShear DNA shearing device at speed code 3 (GeneMachines, http://www.bst-asia.com). The sheared DNA was blunted and phosphorylated using the DNA End-Repair kit (Epicentre, http://www.epibio.com) according to the manufacturer's instructions. The end-repaired DNA was fractionated on MicroSpin S-400 HR columns (Amersham, http://www.amershambiosciences.com), and fragments of approximately 0.5–1.5 kb were isolated. The fragments were ligated into a pUC118 plasmid vector that had previously been digested with HincII and treated with bacterial alkaline phosphatase. The ligated DNA samples were introduced into Escherichia coli DH5α by electroporation and plated on LB ampicillin plates. The titers of the five libraries were 0.7 × 10³ to 2.0 × 10³ cfu.
Sequencing, assembly and annotation
Individual clones were picked into deep-well blocks containing 760 μl of TB with 8% glycerol and 50 μg ml⁻¹ ampicillin. The clones were grown overnight at 37°C with shaking at 600 rpm, and plasmid DNA was isolated using an HT prep machine (Bioneer, http://www.bioneer.co.kr). The 5′ and 3′ DNA sequences were determined on a capillary DNA sequencer (RISA 384, Shimadzu, http://www.shimadzu.com) using the DYEnamic ET Terminator cycle sequencing kit (Amersham, http://www.amershambiosciences.com). The full sequences of three large gaps (1, 8, and 17), which could not be amplified by PCR, were determined by primer walking. The remaining 31 small gaps were amplified by PCR and sequenced directly from the PCR products. The sequence data were processed with the base-calling program Phred and the assembler Phrap (version 0.990319, http://www.genome.washington.edu/UWGC). The resulting contigs were analyzed in Consed, a software package for sequence finishing (http://www.phrap.org/consed/consed.html) (Gordon et al. 1998). Genes in the S. tuberosum chloroplast genome were identified and annotated using DOGMA (Dual Organellar GenoMe Annotator) (Wyman et al. 2004). This program takes a FASTA-formatted input file of the complete genomic sequence and identifies putative protein-coding genes by performing BLASTX searches against a custom database of 16 published chloroplast genomes of green plants, including Arabidopsis, Chlorella, Lotus, Oenothera, Oryza, Pinus, Spinacia, Zea, and Nicotiana. In addition, a stand-alone BLAST search of all known chloroplast genes against a database of the S. tuberosum chloroplast sequences was performed for comparative analysis (Altschul et al. 1997). The chloroplast genome sequences of the seven Solanaceae species were aligned using the BioEdit sequence alignment editor (North Carolina State University).
Phylogenetic analysis of ORFs
Seventeen ORFs from N. tabacum were used as queries in BLAST searches, at normal stringency, against a database of chloroplast DNA sequences from the seven Solanaceae species, Cucumis sativus, and Spinacia oleracea. Sequences retrieved from the 17 ORF regions were readily and unambiguously aligned by hand using Sequencher 4.1 (Gene Codes, http://www.genecodes.com). We also used Clustal X (Thompson et al. 1997) to align the sequences with varying gap opening and extension penalties, followed by some manual editing in Sequencher. The aligned sequences were analyzed in combination using the parsimony algorithm of PAUP* for Macintosh (version 4.0b10; Swofford 1998). All characters were weighted equally (Fitch 1971) and treated as unordered; gaps were treated as a fifth base. Heuristic searches were conducted with ‘MULPARS’, TBR branch swapping, and ‘ACCTRAN’ optimization. Internal support was assessed by bootstrap analysis (Felsenstein 1985) with 1000 heuristic replicates using simple addition and TBR branch swapping. For the parsimony analysis of the 13 combined chloroplast DNA ORF regions, C. sativus and S. oleracea were used as outgroups to assess the phylogenetic position of S. tuberosum among the other ingroup taxa of Solanaceae.
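The scoring criterion named above (equally weighted, unordered characters with gaps as a fifth state) can be illustrated with a minimal sketch of Fitch parsimony on a fixed binary tree. This is not the PAUP* heuristic search used in the study; it only counts the changes a given topology requires. The function names, the toy taxa A to D, and the toy alignment are hypothetical.

```python
def fitch(tree, states):
    """Return (state_set, n_changes) for one alignment column.

    tree   : a taxon name (leaf) or a 2-tuple of subtrees (binary tree)
    states : dict mapping taxon name -> character state at this column;
             a gap ("-") is simply treated as a fifth state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no extra change
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1


def tree_length(tree, alignment):
    """Sum the Fitch change counts over all columns (equal weights)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


if __name__ == "__main__":
    toy = {"A": "ACG-", "B": "ACGT", "C": "ATGT", "D": "ATG-"}
    print(tree_length((("A", "B"), ("C", "D")), toy))
    print(tree_length((("A", "C"), ("B", "D")), toy))
```

For this toy alignment the first topology requires 3 changes and the second 4, so parsimony prefers the first. A heuristic search such as the one described above repeats this scoring over many rearranged topologies.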
Results and discussion
Overall structure and gene content of S. tuberosum chloroplast DNA
The entire chloroplast DNA sequence of S. tuberosum was determined (GenBank accession no. DQ231562). It is a circular double-stranded DNA molecule of 155,312 bp with the quadripartite structure typical of most plastid genomes; its gene map is shown in Fig. 1. The S. tuberosum chloroplast DNA consists of a pair of inverted repeat regions of 25,595 bp each, separated by a small single-copy (SSC) region of 18,373 bp and a large single-copy (LSC) region of 85,749 bp. The GC content of the chloroplast DNA is 37%, the same as in other Solanaceae species. The S. tuberosum chloroplast genome contains a total of 130 genes, 17 of which are duplicated in the IR. Thirty tRNA genes and four rRNA genes were identified. Seven tRNA genes are duplicated in the IR, and the four rRNA genes are clustered and inversely oriented in the IR, as reported for other land plants. Eighteen genes contain one or two introns, and six of these are tRNA genes. Four introns are located in the IR and one in the SSC.
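The reported region sizes can be checked against the total genome length with simple arithmetic: in a quadripartite plastome, the total equals LSC + SSC + 2 × IR. The following minimal sketch uses the lengths given in the text; the function names are illustrative.

```python
# Region lengths (bp) as reported in the text for S. tuberosum cv. Desiree.
LSC = 85_749
SSC = 18_373
IR = 25_595      # length of each inverted repeat copy (IRa and IRb)
TOTAL = 155_312


def quadripartite_total(lsc, ssc, ir):
    """Length of a circular plastome composed of LSC + IRa + SSC + IRb."""
    return lsc + ssc + 2 * ir


def region_fractions(lsc, ssc, ir):
    """Fraction of the genome occupied by each region (both IR copies pooled)."""
    total = quadripartite_total(lsc, ssc, ir)
    return {"LSC": lsc / total, "SSC": ssc / total, "IR": 2 * ir / total}


if __name__ == "__main__":
    print(quadripartite_total(LSC, SSC, IR) == TOTAL)
    for name, fraction in region_fractions(LSC, SSC, IR).items():
        print(f"{name}: {fraction:.1%}")
```

The check confirms that 85,749 + 18,373 + 2 × 25,595 = 155,312 bp.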
Contraction and expansion of IRs
The borders between the inverted repeats (IRa and IRb) and the two single-copy regions (LSC and SSC) usually differ among plant species. Large expansions or contractions of the inverted repeat regions often cause variation in the length of chloroplast genomes. The IR of S. tuberosum is 25,595 bp long, nearly the same length as that of S. lycopersicum (25,611 bp). It is 253 bp longer than the IR of N. tabacum and 311 bp shorter than that of A. belladonna. Fig. 2 compares the exact IR border positions of five Solanaceae species; N. sylvestris is omitted because its IR border positions are the same as those of N. tabacum. In all species, the border between IRa and SSC was located within the coding region of the ycf1 gene, creating a ycf1 pseudogene at the IRb/SSC border whose length reflects how far the IR extends into ycf1. For example, the S. tuberosum IR extends far into the ycf1 gene, resulting in a 1122-bp ycf1 pseudogene at the IRb/SSC border. Likewise, N. tabacum has 996 bp of ycf1 duplicated, whereas A. belladonna and N. tomentosiformis have ycf1 pseudogenes of 1438 and 1010 bp, respectively, at the IRb/SSC border. The ndhF genes were located entirely within the SSC region, with intergenic spaces of various lengths. In S. tuberosum, the ndhF gene was located one base from the border, whereas an intergenic space of 32 bp or more preceded the ndhF gene in the other species.
In addition to the expansion of IRa into the ycf1 gene, extension of the IRb region into the rps19 gene was found in S. tuberosum, S. lycopersicum, A. belladonna, and N. tomentosiformis, creating a duplication of the 5′ end of the rps19 gene, of varying length, at the IRa/LSC border. Both A. belladonna and N. tomentosiformis have a 59-bp rps19 pseudogene, and S. tuberosum and S. lycopersicum have rps19 pseudogenes of 69 and 91 bp, respectively. Interestingly, the IRb region did not extend into the rps19 gene in N. tabacum and N. sylvestris. The location of the trnH gene was well conserved among Solanaceae. In S. tuberosum, the IRa/LSC border was located 30 bp downstream of the trnH(GUG) gene, in the non-coding region, while in the other four Solanaceae species the border was located 2–6 bp downstream of trnH(GUG). Similar IR contraction and expansion have been analyzed in Glycine, Arabidopsis, Lotus, Panax, cucumber, wheat, rice, and maize (Kim et al. 2005; Kim and Lee 2004; Maier et al. 1995; Ogihara et al. 2002; Saski et al. 2005). This structural feature of the borders between the two IRs and the two single-copy regions may be due to intramolecular recombination between two short direct repeat sequences within the genes located at the borders (Maier et al. 1995).
Length mutations in the coding genes
Although the overall chloroplast genome structures of the Solanaceae species were quite similar, a complete alignment of the chloroplast genome sequences revealed a considerable number of indels in the protein-coding genes and intergenic spacer regions. Among the protein-coding genes, indels were found in five coding regions of S. tuberosum when compared with the other six Solanaceae species. A comparison of the accD genes of the seven Solanaceae species revealed two indel events. A 24-bp deletion occurred in the three Solanum species and A. belladonna, whereas a 9-bp insertion was found only in the three Solanum species (Fig. 3A). Direct repeats of TAGT and ACATGT are associated with these indels. The accD gene encodes the beta-carboxyl transferase subunit of acetyl-CoA carboxylase (ACCase) and is present in the plastids of most land plants. Tobacco ACCase is essential for leaf development, leaf longevity, and seed yield (Kode et al. 2005; Madoka et al. 2002). The accD gene, located between rbcL and ycf4, has been progressively deleted in the grass family (Calsa Junior et al. 2004; Maier et al. 1995). ORF106 is present only in rice as a remnant of the accD gene, and accD is completely deleted in maize, wheat, and sugarcane.
Additional indels occurred in the ycf1 and ycf2 genes. Most higher plants contain these two genes, which appear to be essential for cell survival, as homoplastomic ycf1 deletion mutants of Chlamydomonas and tobacco could not be obtained (Boudreau et al. 1997; Drescher et al. 2000; Maier et al. 1995), although ycf1 in the chloroplast genomes of maize and rice is reduced to a series of shorter reading frames by various deletions (Maier et al. 1995; Ogihara et al. 2002). Because a sequence comparison of ycf2 has already been reported (Daniell et al. 2006), this study considered the comparison of ycf1 among Solanaceae species. A total of 13 indels occurred within ycf1; three of them, found in the three Solanum species, are depicted in Fig. 3A. The small indels of 6 and 12 bp involved the short repeats ATTTTT and GTTTTT, whereas a 36-bp deletion was not associated with any repeat sequence.
As described above, 18 genes (six tRNA genes and 12 protein-coding genes) contain introns. Indels were identified in the introns of the ycf3 and trnK genes. ycf3 contains two introns of 727 and 750 bp, whereas trnK has a single 2512-bp intron. Deletions of poly(T) tracts were found in both intron 1 and intron 2 of ycf3 in the three Solanum species. In the trnK intron, A. belladonna and S. bulbocastanum each had an 18-bp deletion at different positions, whereas S. tuberosum and S. lycopersicum contained an 8-bp deletion.
Length mutations in the intergenic regions
Overall, the gene content and relative gene positions of S. tuberosum are similar to those of the other six Solanaceae species compared. The S. tuberosum chloroplast DNA is the shortest among the Solanaceae plastid genomes reported so far, whereas A. belladonna has the longest. The LSC of S. tuberosum is 85,749 bp long, which is 937 and 1119 bp shorter than those of A. belladonna and N. tabacum, respectively. This size difference is mainly due to a large number of deletions in the LSC region. Multiple alignment of the seven Solanaceae chloroplast genome sequences revealed 64–69 deletions of 5 bp or more, relative to N. tabacum, in the intergenic regions of the LSC. Three remarkably variable intergenic regions were identified in the Solanaceae species: within the tRNA cluster (trnC–trnfM), between trnL3′ and trnM, and between rbcL and psbJ. A striking feature of S. tuberosum was a single large deletion of 452 bp in the intergenic region between the trnT and trnE genes compared with the other species (data not shown). Divergence of these indel regions is similarly observed in the grass family members wheat, maize, rice, and sugarcane (Asano et al. 2004; Calsa Junior et al. 2004; Maier et al. 1995; Ogihara et al. 2002). One or two large deletions of 240–800 bp are commonly found in the intergenic regions between the trnT and trnG5 genes and between the psbM and trnD genes in the wheat and rice chloroplasts. Apart from these large deletions, the rice chloroplast also contains a large deletion upstream of trnC (Ogihara et al. 2002). Thus, a comparison of our results with those of the grass family shows that the length mutations and their locations within the tRNA cluster are quite divergent between dicots and monocots, and that the size divergence is even greater within the grass family.
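Counting deletions of 5 bp or more relative to a reference, as described above, amounts to scanning a pairwise projection of the alignment for runs of gap characters in the query. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the sequences are toy data, and the gap symbol is assumed to be "-". It does not reproduce the BioEdit workflow of the study.

```python
def deletions_relative_to_ref(ref_aln, qry_aln, min_len=5):
    """List deletions in the query relative to the reference.

    ref_aln, qry_aln : two rows of the same alignment (equal length)
    Returns (ref_start, length) tuples, with ref_start as a 0-based
    coordinate on the ungapped reference sequence.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    ref_pos = 0      # reference bases consumed so far
    run_start = 0
    run_len = 0
    for r, q in zip(ref_aln, qry_aln):
        if r == "-" and q == "-":
            continue  # column is a gap in both rows: ignore it
        if r != "-" and q == "-":
            if run_len == 0:
                run_start = ref_pos
            run_len += 1
        else:
            if run_len >= min_len:
                runs.append((run_start, run_len))
            run_len = 0
        if r != "-":
            ref_pos += 1
    if run_len >= min_len:  # deletion reaching the end of the alignment
        runs.append((run_start, run_len))
    return runs


if __name__ == "__main__":
    ref = "ACGTACGTACGTAAAA"
    qry = "AC------ACGTA--A"
    print(deletions_relative_to_ref(ref, qry))
```

In the toy example only the 6-bp gap run passes the 5-bp threshold; the 2-bp run is ignored. Restricting the scan to intergenic coordinates would require gene annotations, which are not modeled here.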
Indels of more than 6 bp that were identified only in S. tuberosum were selected and classified into two groups according to the presence of direct repeats (Figs. 4 and 5). The largest single insertion, 543 bp in length, occurred in the intergenic region between ycf4 and cemA in S. tuberosum (Fig. 4). Direct repeats of TTGAGA were associated with this large insertion. A corresponding 508-bp insertion was also found in S. lycopersicum and S. bulbocastanum. Length variation between ycf4 and cemA has similarly been observed in wheat and its close relatives, Aegilops, by sequence comparison as well as restriction fragment pattern analysis (Guo and Terachi 2005; Ogihara et al. 2002; Ogihara et al. 1991). Ae. caudata has a large 289-bp deletion in the intergenic region between ycf4 and cemA compared with Ae. mutica. In this case, a pair of direct repeats, “A4GAAGAA”, is present in Ae. mutica and other related species.
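One way to decide whether an indel is associated with direct repeats is to test whether the sequence at one edge of the inserted (or undeleted) segment is repeated immediately beyond its other edge. The sketch below implements that working definition. The definition itself, the function name, the minimum repeat length of 4 bp, and the toy sequence are assumptions for illustration and are not taken from the study.

```python
def flanking_direct_repeat(seq, start, length, min_repeat=4):
    """Return the direct repeat bracketing an indel, or None.

    seq    : sequence that still contains the inserted/undeleted segment
    start  : 0-based start of the segment within seq
    length : length of the segment
    """
    end = start + length
    # Bases shared by the start of the segment and the sequence just after it.
    right = 0
    while (right < length and end + right < len(seq)
           and seq[start + right] == seq[end + right]):
        right += 1
    # Bases shared by the sequence just before the segment and the segment's end.
    left = 0
    while (left < length and start - 1 - left >= 0
           and seq[start - 1 - left] == seq[end - 1 - left]):
        left += 1
    # Keep the two repeat copies from overlapping each other.
    right = min(right, length - left)
    repeat = seq[start - left:start + right]
    return repeat if len(repeat) >= min_repeat else None


if __name__ == "__main__":
    # Toy sequence: a 16-bp segment that starts with TTGAGA and is
    # followed by a second TTGAGA copy.
    toy = "GGCC" + "TTGAGA" + "ACGTACGTAG" + "TTGAGA" + "CCGG"
    print(flanking_direct_repeat(toy, start=4, length=16))
    print(flanking_direct_repeat("AAAACCCCGGGGTTTT", start=4, length=4))
```

With such a test, indels can be sorted into the two groups used above: those for which a repeat is returned and those for which the result is None.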
Three indels not flanked by direct repeats that distinguish S. tuberosum from S. bulbocastanum were found (Fig. 5). The largest of them was a 241-bp deletion in S. tuberosum, located in the intergenic region between ndhC and trnV. Interestingly, the 241-bp sequence deleted in S. tuberosum was also absent from other land plants, including Arabidopsis, spinach, rice, corn, and ginseng, but was present in A. belladonna and the Nicotiana species. In N. tomentosiformis, a 41-bp deletion associated with the direct repeat GGATAT occurred within the region corresponding to the 241-bp deletion. The cultivated potato, S. tuberosum, is a tetraploid and is classified by origin into the Andigenum Group (S. tuberosum L. subsp. andigenum) and the Chilotanum Group (S. tuberosum L. subsp. tuberosum). The 241-bp deletion has been shown to be typical of the Chilotanum Group (Kawagoe and Kikuta 1991; Spooner et al. 2005). Our study indicates that S. bulbocastanum, a diploid wild potato, shares the same gene pool with the Andigenum Group, and that the 241-bp deletion clearly represents a genetic difference between the cultivated potato and the wild potato. Other large deletions were found in the intergenic regions between matK and rps16 and between rpl18 and rpl20. The position and extent of the deletion between matK and rps16 were quite diverse even among Solanum species, whereas a 45-bp deletion between rpl18 and rpl20 occurred only in S. tuberosum. Relative to S. bulbocastanum, the S. tuberosum chloroplast genome contained 591 base substitutions, 76 insertions, and 33 deletions, including the indels described above, and most of them occurred in the LSC region. Taken together, our data from the cultivated potato S. tuberosum could provide potential molecular markers for evaluating genetic diversity as well as evolutionary processes in potato landraces.
High divergence in annotated open reading frames between Solanaceae
In addition to defined genes, the S. tuberosum chloroplast genome contains various ORFs, like those of other higher plants. The N. tabacum chloroplast genome harbors 17 ORFs (Yukawa et al. 2006). A comparison of 16 of these ORFs (all except ORF350) among the seven Solanaceae species is given in Table 2. The ORFs located in the IR were highly conserved in the three Nicotiana species, but three of them showed base substitutions and indels in the three Solanum species and A. belladonna. In the three Solanum species, ORF115 and ORF92 were reduced to either ORF89 or ORF48, and to ORF66 or ORF54, respectively, mainly because of a 78-bp deletion. ORF75 in A. belladonna contained a premature stop codon caused by a base substitution. Unlike the ORFs in the IR, the ORFs in the LSC showed remarkable divergence among N. tomentosiformis, A. belladonna, and the three Solanum species. They are reduced in size, fragmented, or both, owing to indels and base substitutions. The degree of change in the ORFs varied considerably between species. For example, N. tabacum ORF90 was reduced to ORF64 in N. sylvestris by a 2-bp deletion and to ORF25 in S. tuberosum, S. lycopersicum, and S. bulbocastanum by an early stop codon, whereas N. tomentosiformis retained ORF90 with three amino acid changes.
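The way a single base substitution shortens an ORF can be illustrated with a minimal sketch that counts sense codons up to the first in-frame stop codon. The function name and the toy sequences are hypothetical. The stop codons TAA, TAG, and TGA are those of the standard genetic code, which is assumed here to apply to plastid ORFs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(dna):
    """Number of sense codons before the first in-frame stop codon.

    Reading starts at the first base of dna (frame 0). Returns None
    when no in-frame stop codon is present.
    """
    dna = dna.upper()
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


if __name__ == "__main__":
    wild_type = "ATGGCTCAAGCTGCTTAA"           # ATG GCT CAA GCT GCT TAA
    # C -> T substitution at position 6 turns CAA into the stop codon TAA.
    mutant = wild_type[:6] + "T" + wild_type[7:]
    print(orf_codons(wild_type), orf_codons(mutant))
```

In this toy case the ORF shrinks from five codons to two. The same function applied after removing two bases would show the frameshift effect of a 2-bp deletion.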
Given the high degree of gene conservation in chloroplast genomes, the ORFs appear to have diverged rapidly, so they may be useful for differentiating closely related species. Therefore, 13 ORFs retrieved from the chloroplast genome sequences of the seven Solanaceae species and two outgroups were analyzed phylogenetically. The pairwise matrix of sequence divergence and patristic distances was generated from 4499 combined sites (Table 3). Of these characters, 2347 were constant, 1490 were variable but parsimony-uninformative, and 662 were parsimony-informative. Sequence divergence among the ingroup taxa (Solanaceae) ranged from 0.00 (N. tabacum and N. sylvestris) to 0.265 (S. lycopersicum and A. belladonna), while the values between ingroup and outgroup ranged from 0.1345 to 0.2048. Two most parsimonious (MP) trees with a length of 2582 steps were produced. The trees differed from each other only in the terminal resolution within Solanaceae. The consistency index and retention index of the MP tree were 0.939 and 0.830, respectively (Fig. 6). The seven ingroup taxa resolved as a monophyletic group, separate from the outgroups, with 100% bootstrap support. The Solanaceae taxa formed two sister clades. The first united a Nicotiana subclade with its sister group Atropa, and the three Nicotiana species showed strong monophyly with 96% bootstrap support. The second clade was a very strongly supported monophyletic Solanum group with 100% bootstrap support. This phylogenetic relationship among the Solanaceae is likely reflected in the indel patterns among the ORF sequences as well as in morphological features such as leaf, flower, and fruit shape.
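The character counts reported above partition the alignment columns into three classes, and 2347 + 1490 + 662 equals the 4499 combined sites. A minimal sketch of how such a partition and an uncorrected pairwise distance can be computed is given below. The function names and the toy alignment are hypothetical, gaps are treated as a fifth state as in the Methods, and the distance shown is a plain proportion of differing sites, which may differ from the measure PAUP* reported.

```python
from collections import Counter


def classify_columns(seqs):
    """Count constant, variable-uninformative and informative columns.

    A column is parsimony-informative when at least two states each
    occur in at least two sequences.
    """
    counts = {"constant": 0, "uninformative": 0, "informative": 0}
    for column in zip(*seqs):
        freq = Counter(column)
        if len(freq) == 1:
            counts["constant"] += 1
        elif sum(1 for n in freq.values() if n >= 2) >= 2:
            counts["informative"] += 1
        else:
            counts["uninformative"] += 1
    return counts


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    toy = ["ACGTA", "ACGTC", "ATGAC", "ATG-A"]
    print(classify_columns(toy))
    print(p_distance(toy[0], toy[1]))
    # Consistency check on the counts reported in the text.
    print(2347 + 1490 + 662 == 4499)
```

Only the parsimony-informative columns can favor one topology over another, which is why that count is reported separately from the total number of variable sites.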
Conclusion
The complete sequence of the S. tuberosum chloroplast genome revealed extensive similarity to those of six other Solanaceae species in gene content and structure, suggesting a common chloroplast evolutionary lineage within Solanaceae. However, many features specific to S. tuberosum chloroplast DNA were found in the intergenic regions and ORFs, and a few in protein-coding genes; these can be used as molecular markers to study genetic diversity and population-genetic processes in potato landraces. In particular, this study inferred plastid phylogeny and evolution on the basis of ORFs from seven Solanaceae species. The phylogenetic analysis provided robust support for the phylogenetic positions of S. tuberosum and S. bulbocastanum within Solanaceae.
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Asano T, Tsudzuki T, Takahashi S, Shimada H, Kadowaki K (2004) Complete nucleotide sequence of the sugarcane (Saccharum officinarum) chloroplast genome: a comparative analysis of four monocot chloroplast genomes. DNA Res 11:93–99
Boudreau E, Turmel M, Goldschmidt-Clermont M, Rochaix JD, Sivan S, Michaels A, Leu S (1997) A large open reading frame (orf1995) in the chloroplast DNA of Chlamydomonas reinhardtii encodes an essential protein. Mol Gen Genet 253:649–653
Calsa Junior T, Carraro DM, Benatti MR, Barbosa AC, Kitajima JP, Carrer H (2004) Structural features and transcript-editing analysis of sugarcane (Saccharum officinarum L.) chloroplast genome. Curr Genet 46:366–373
Cosner ME, Raubeson LA, Jansen RK (2004) Chloroplast DNA rearrangements in Campanulaceae: phylogenetic utility of highly rearranged genomes. BMC Evol Biol 4:1471–2148
Daniell H, Lee SB, Grevich J, Saski C, Quesada-Vargas T, Guda C, Tomkins J, Jansen RK (2006) Complete chloroplast genome sequences of Solanum bulbocastanum, Solanum lycopersicum and comparative analyses with other Solanaceae genomes. Theor Appl Genet 1432–2242 (on line)
Drescher A, Ruf S, Calsa T Jr., Carrer H, Bock R (2000) The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes. Plant J 22:97–104
Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791
Fitch WM (1971) Toward defining the course of evolution: minimum change for a specific tree topology. Syst Zool 20:406–416
Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8:195–202
Guo CH, Terachi T (2005) Variations in a hotspot region of chloroplast DNAs among common wheat and Aegilops revealed by nucleotide sequence analysis. Genes Genet Syst 80:277–285
Jansen RK, Raubeson LA, Boore JL, dePamphilis CW, Chumley TW, Haberle RC, Wyman SK, Alverson AJ, Peery R, Herman SJ, Fourcade HM, Kuehl JV, McNeal JR, Leebens-Mack J, Cui L (2005) Methods for obtaining and analyzing whole chloroplast genome sequences. Methods Enzymol 395:348–384
Kato T, Kaneko T, Sato S, Nakamura Y, Tabata S (2000) Complete structure of the chloroplast genome of a legume, Lotus japonicus. DNA Res 7:323–330
Kawagoe Y, Kikuta Y (1991) Chloroplast DNA evolution in potato (Solanum tuberosum L.). Theor Appl Genet 81:13–20
Kim KJ, Choi KS, Jansen RK (2005) Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). Mol Biol Evol 22:1783–1792
Kim KJ, Lee HL (2004) Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res 11:247–261
Kim KJ, Lee HL (2005) Widespread occurrence of small inversions in the chloroplast genomes of land plants. Mol Cells 19:104–113
Kode V, Mudd EA, Iamtham S, Day A (2005) The tobacco plastid accD gene is essential and is required for leaf development. Plant J 44:237–244
Madoka Y, Tomizawa K, Mizoi J, Nishida I, Nagano Y, Sasaki Y (2002) Chloroplast transformation with modified accD operon increases acetyl-CoA carboxylase and causes extension of leaf longevity and increase in seed yield in tobacco. Plant Cell Physiol 43:1518–1525
Maier RM, Neckermann K, Igloi GL, Kossel H (1995) Complete sequence of the maize chloroplast genome: gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J Mol Biol 251:614–628
Ogihara Y, Isono K, Kojima T, Endo A, Hanaoka M, Shiina T, Terachi T, Utsugi S, Murata M, Mori N et al. (2002) Structural features of a wheat plastome as revealed by complete sequencing of chloroplast DNA. Mol Genet Genomics 266:740–746
Ogihara Y, Terachi T, Sasakuma T (1991) Molecular analysis of the hot spot region related to length mutations in wheat chloroplast DNAs. I. Nucleotide divergence of genes and intergenic spacer regions located in the hot spot region. Genetics 129:873–884
Oharamays EP, Capwell JC (1993) Miniprep for chloroplast DNA isolation. Microchem J 47:245–250
Saski C, Lee SB, Daniell H, Wood TC, Tomkins J, Kim HG, Jansen RK (2005) Complete chloroplast genome sequence of Glycine max and comparative analyses with other legume genomes. Plant Mol Biol 59:309–322
Schmitz-Linneweber C, Regel R, Du TG, Hupfer H, Herrmann RG, Maier RM (2002) The plastid chromosome of Atropa belladonna and its comparison with that of Nicotiana tabacum: the role of RNA editing in generating divergence in the process of plant speciation. Mol Biol Evol 19:1602–1612
Shahid Masood M, Nishikawa T, Fukuoka S, Njenga PK, Tsudzuki T, Kadowaki K (2004) The complete nucleotide sequence of wild rice (Oryza nivara) chloroplast genome: first genome wide comparative sequence analysis of wild and cultivated rice. Gene 340:133–139
Shinozaki K, Ohme M, Tanaka K, Wakasugi T, Hayashida N, Matsubayashi T, Zaita N, Cuhunwongse J, Obokata J, Yamaguchi-Shinozaki K et al. (1986) The complete nucleotide sequence of the tobacco chloroplast genome: Its gene organization and expression. EMBO J 5:2043–2049
Spooner DM, Nunez J, Rodriguez F, Naik PS, Ghislain M (2005) Nuclear and chloroplast DNA reassessment of the origin of Indian potato varieties and its implications for the origin of the early European potato. Theor Appl Genet 110:1020–1026
Sugiura C, Kobayashi Y, Aoki S, Sugita C, Sugita M (2003) Complete chloroplast DNA sequence of the moss Physcomitrella patens: evidence for the loss and relocation of rpoA from the chloroplast to the nucleus. Nucleic Acids Res 31:5324–5331
Sugiura M, Hirose T, Sugita M (1998) Evolution and mechanism of translation in chloroplasts. Annu Rev Genet 32:437–459
Swofford DL (1998) PAUP*—Phylogenetic analysis using parsimony, ver. 4.0 beta 10. Sunderland: Sinauer Associates
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Wyman SK, Jansen RK, Boore JL (2004) Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255
Yukawa M, Tsudzuki T, Sugiura M (2006) The chloroplast genome of Nicotiana sylvestris and Nicotiana tomentosiformis: complete sequencing confirms that the Nicotiana sylvestris progenitor is the maternal genome donor of Nicotiana tabacum. Mol Genet Genomics:1–7
Acknowledgements
This work was supported by a grant to JRL from the Crop Functional Genomics Center of the 21st Century Frontier Research Program, a grant to JRL from the Korea Science and Engineering Foundation through the Plant Metabolism Research Center of the Kyung Hee University funded by the Korea Ministry of Science and Technology, and a grant to JRL from the Marine and Extreme Genome Research Center Program funded by the Korean Ministry of Marine Affairs and Fisheries.
Additional information
Communicated by I. S. Chung
Cite this article
Chung, HJ., Jung, J.D., Park, HW. et al. The complete chloroplast genome sequences of Solanum tuberosum and comparative analysis with Solanaceae species identified the presence of a 241-bp deletion in cultivated potato chloroplast DNA sequence. Plant Cell Rep 25, 1369–1379 (2006). https://doi.org/10.1007/s00299-006-0196-4