Abstract
Comparative chloroplast genome analysis presents new opportunities for molecular phylogenetic studies and for revealing significant evolutionary features in higher plants, and it has been widely documented in groups ranging from conifers to the grass family. However, a systematic analysis of chloroplast genomes in the Asteraceae family has not been conducted until now. In this study, we compared and analyzed the gene content, genomic organization, and RNA editing sites of eight representative Asteraceae chloroplast genomes. The results showed that Asteraceae chloroplast genomes had relatively conserved gene content. No gain or loss events occurred in the protein-coding genes, while some differences were found in gene structure and transfer RNA (tRNA) abundance. Genome structure analysis revealed some Asteraceae-specific and species-specific structural variations, and sequence rearrangement events were present in these genomes, suggesting that specific evolutionary processes have occurred in this family. Some DNA regions in which more than 5 % of characters were parsimony-informative were also identified; these could be used as new molecular markers for phylogenetic analysis and plant identification in Asteraceae species. Furthermore, RNA editing in these genomes was investigated through computational analysis, and some species-specific sites were identified. Finally, phylogenetic analysis of 81 genes from 70 species supported the monophyly of the Asteraceae. Our study compared, for the first time, the organization, structure, and sequence divergence of eight Asteraceae chloroplast genomes, which will provide a valuable resource for the molecular phylogeny of Asteraceae species and facilitate genetic and evolutionary studies in this family.
Introduction
The Asteraceae is one of the largest families of flowering plants, consisting of more than 23,000 species across 1620 genera (Bremer 1994). Species in this family are distributed globally, from the polar regions to the tropics, and can adapt to a wide variety of natural environments and habitats (Funk et al. 2005). Although a remarkable number are shrubs, trees, and vines, members of the Asteraceae family are mostly herbaceous. It has been documented that Asteraceae species display high variation in morphological, physiological, and biochemical traits, such as secondary chemistry (Carlquist 1976) and chromosome numbers (Barker et al. 2008). The phenotypic and species diversity in this family provides an ideal resource for studying species diversification and phylogenetic evolution. Furthermore, this family also has notable economic and ecological significance, as it includes important oil crops, herbal plants, ornamentals, and horticultural species as well as some hazardous invasive weeds worldwide (Lundberg and Bremer 2003).
Chloroplasts (cp) are organelles in the cells of eukaryotic autotrophic organisms, whose main role is to carry out photosynthesis and supply essential energy to living organisms. They have their own genetic system to replicate and transcribe their own genome and usually exhibit maternal inheritance. In land plants, the cp genome is highly conserved (Chumley et al. 2006), with a typically circular organization containing a pair of inverted repeat regions (IRs) separated by a large single-copy (LSC) region and a small single-copy (SSC) region (Palmer 1991). The highly conserved structure of the cp genome makes it suitable for comparative analysis of both distantly and closely related species (Raubeson and Jansen 2005). It has been demonstrated that comparative analysis of cp genomes can not only provide valuable information for understanding structural and organizational variation in cp genomes but also help reveal the molecular evolution and diversification of plants (Liu et al. 2013; Rivas et al. 2002).
With the emergence and development of high-throughput sequencing technology, more and more chloroplast genomes of Asteraceae species have been sequenced recently (Nie et al. 2012; Liu et al. 2013; Bock et al. 2014). The available complete genomes provide an opportunity to perform comparative analysis among members of the Asteraceae family at the whole chloroplast genome level. Although comparative chloroplast genome analyses have been reported extensively in many families (Wu et al. 2011; Ghimiray and Sharma 2014), a systematic analysis of the structure and organization of chloroplast genomes in the Asteraceae family has not been performed until now. Here, eight chloroplast genomes belonging to different tribes of the Asteraceae family were selected as representatives for comparative analysis: Ageratina adenophora (NC_015621), belonging to the tribe Eupatorieae (Nie et al. 2012); Artemisia frigida (NC_020607) (Liu et al. 2013), Chrysanthemum indicum (NC_020320), and Chrysanthemum × morifolium (NC_020092), all belonging to the tribe Anthemideae; Guizotia abyssinica (NC_010601) (Dempewolf et al. 2010), belonging to the tribe Millerieae; Helianthus annuus (NC_007977) (Timme et al. 2007), belonging to the tribe Heliantheae; Jacobaea vulgaris (NC_015543) (Doorduin et al. 2011), belonging to the Senecioneae clade; and Lactuca sativa (DQ383816) (Timme et al. 2007), belonging to the subtribe Lactucinae of the tribe Cichorieae. The purpose was to compare and analyze the gene content, genome structure, and RNA editing of Asteraceae cp genomes, which will provide helpful information for better understanding cp genome evolution within this family.
Materials and Methods
Comparison of Genome Content and Organization of Asteraceae cp Genomes
Complete cp genome sequences of these eight Asteraceae species were downloaded from the Chloroplast Genome Database (http://chloroplast.ocean.washington.edu/). Gene content and genome organization information was obtained from the available annotation files. The sequences of the remaining seven genomes were each aligned with that of H. annuus using the program ClustalW2 (https://www.ebi.ac.uk/Tools/msa/clustalw2/) with default settings to identify the positions of the LSC, SSC, and IR regions. The size, gene content, gene order, and organization of all eight Asteraceae cp genomes were then compared with each other manually.
Structural Comparison of Asteraceae cp Genomes
Gene order conservation among the eight Asteraceae cp genomes was visualized by dot-plot analysis using the program Mulan (http://mulan.dcode.org/) (Ovcharenko et al. 2005). The alignments were performed with the A. adenophora cp genome and its annotation as the reference. Evolutionarily conserved sequences were detected using a threshold of at least 70 % identity over 100 bp. In addition, the structural variations of the Asteraceae cp genomes were further compared using the Mauve software (Darling et al. 2004), with the cp genome of Nicotiana tabacum (NC_001879) (Shinozaki et al. 1986) as the reference sequence.
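The conservation threshold above (at least 70 % identity over 100 bp) can be illustrated with a small sliding-window scan. The following Python sketch is not Mulan's algorithm; it assumes two rows of an existing pairwise alignment (with "-" for gaps), and the function name and parameters are hypothetical, chosen only for illustration.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) column ranges covered by windows whose
    identity is at least min_identity.

    Assumption: aln_a and aln_b are rows of one pairwise alignment, so they
    have equal length. Coordinates are 0-based, end-exclusive alignment columns.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = aln_a.upper(), aln_b.upper()
    n = len(a)
    if n < window:
        return []
    # 1 where both rows carry the same base; a gap never counts as a match.
    match = [int(x == y and x != "-") for x, y in zip(a, b)]
    regions = []
    count = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Overlapping qualifying windows are merged so that each conserved stretch is reported once, which is one reasonable design choice among several.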
Sequence Variation Analysis and Marker Identification
The program mVISTA (http://genome.lbl.gov/vista/mvista/) (Frazer et al. 2004) was used in Shuffle-LAGAN mode to compare the A. adenophora cp genome with the remaining seven Asteraceae cp genomes, using the sequence annotation of A. adenophora.
Then, the intergenic regions with high sequence diversity were selected for phylogenetic analyses among the eight Asteraceae species. All sequences were downloaded from GenBank. The ClustalW tool integrated into the BioEdit software (Hall 1999) was used to align the concatenated sequences of each region, and the alignments were subsequently edited manually. PAUP* (Swofford 2002) was used to construct maximum parsimony (MP) trees, with 1000 bootstrap replicates to estimate MP branch support values.
Prediction of the RNA Editing Sites in the Asteraceae cp Genomes
All protein-coding genes of the eight Asteraceae cp genomes were downloaded from the NCBI database (www.ncbi.nlm.nih.gov) according to their annotation information. The Predictive RNA Editor for Plants (PREP)-Cp tool (http://prep.unl.edu/cgi-bin/cp-input.pl) (Mower 2009) was used with default parameters to predict the RNA editing sites in these chloroplast genes. To validate the predictions, expressed sequence tag (EST) sequences of C. × morifolium, G. abyssinica, H. annuus, and L. sativa were obtained from the Compositae Genome Project Database (http://compgenomics.ucdavis.edu/), and the protein-coding genes of these four species were then searched by BLAST against their respective EST databases. Significant hits were examined manually, and only C-to-T differences between the genomic and EST sequences were considered RNA editing sites.
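The validation rule above, keeping only C-to-T differences between a genomic coding sequence and its matching EST, can be sketched in a few lines of Python. This is a simplified illustration, not the authors' procedure: it assumes the two sequences are already aligned without gaps, on the same strand, and in frame, and the function name is hypothetical.

```python
def candidate_edit_sites(genomic_cds, est_seq):
    """List C-to-T differences between a genomic CDS and an aligned EST.

    Assumption: both strings cover the same gap-free region in the same
    orientation, starting at codon position 1. Returns a list of
    (nucleotide_position, codon_number) tuples, both 1-based.
    """
    if len(genomic_cds) != len(est_seq):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    pairs = zip(genomic_cds.upper(), est_seq.upper())
    for pos, (genomic_base, est_base) in enumerate(pairs, start=1):
        # Only a genomic C read as T in the transcript is kept;
        # every other mismatch is ignored.
        if genomic_base == "C" and est_base == "T":
            sites.append((pos, (pos - 1) // 3 + 1))
    return sites
```

In practice the coordinates would come from the BLAST alignment of each hit, and candidates would still need the manual inspection described above.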
Phylogenetic Analysis
A set of 81 cp genes from 70 taxa, including these eight Asteraceae species, 59 other angiosperm lineages, and three gymnosperms, was used to infer the phylogenetic relationships among them (Online Resource 1_ESM_1 and Online Resource 1_ESM_2). The database MSWAT (http://mswat.ccbb.utexas.edu/) (Jansen et al. 2007) was used to generate an alignment of these sequences. Maximum likelihood (ML) searches were performed to find the best trees with RAxML-HPC BlackBox, accessed through the CIPRES Science Gateway (Miller et al. 2010). We estimated the proportion of invariable sites (GTRGAMMA + I) and let RAxML halt bootstrapping automatically (as recommended by the online program). PAUP* (Swofford 2002) was used to conduct the maximum parsimony (MP) analysis with the above parameters. Nuphar advena and Nymphaea alba served as outgroups in both the ML and MP searches.
Results and Discussion
Comparison of Genome Content and Organization of Asteraceae cp Genomes
GC (guanine-cytosine, G+C) content was first evaluated across the eight Asteraceae cp genomes. There was little variation in the GC content of the complete genomes, with values ranging from 37.3 to 37.6 %. The GC content of all protein-coding genes was also calculated, and no significant variation was observed among these genomes. In addition, the cp genome of J. vulgaris had the lowest GC content in both the complete genome and the coding regions, while H. annuus and G. abyssinica had the highest GC content in the complete genome, and A. adenophora had the highest GC content in the coding regions.
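For reference, GC content as used here is the percentage of G and C among the nucleotides of a sequence. A minimal Python sketch follows; ignoring ambiguous symbols such as N is an assumption of this example, not something stated in the study, and the function name is hypothetical.

```python
def gc_content(seq):
    """Return the GC percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    any other symbol, such as N, is skipped.
    """
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)
```

Applied to a whole cp genome and to its concatenated coding sequences, this yields the two kinds of values compared in this section.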
Then, the sizes of the eight Asteraceae cp genomes were compared. With an average size of 151 kb, the cp genomes varied from 150,686 to 152,772 bp in length (Table 1). All eight genomes contained two copies of the IR region, ranging from 23,755 to 25,034 bp. The IRs were separated by an LSC region (82,718–84,829 bp) and an SSC region of about 18 kb. The complete genome, SSC region, and IRs of L. sativa were the largest among the eight Asteraceae species, while A. adenophora had the largest LSC region and the smallest IR regions. It has been reported that size variation of cp genomes is due to length differences in the LSC and IR regions (Chung et al. 2006), whereas in the Asteraceae family, the size of the intergenic regions was the main contributor to variation in cp genome size.
The gene content of the eight Asteraceae cp genomes was further investigated and compared. The results showed that gene content was conserved, although some minor variation was present. The Asteraceae cp genomes contained approximately 114 unique genes, including about 80 protein-coding genes, about 29 transfer RNA (tRNA) genes, and 4 ribosomal RNA (rRNA) genes (rrn4.5, rrn5, rrn16, and rrn23, all duplicated in the IRs). H. annuus, A. adenophora, J. vulgaris, and L. sativa shared the same 81 protein-coding genes. This number was larger than that of A. frigida, which lacks the psbJ gene, and of C. indicum, C. × morifolium, and G. abyssinica, all of which lack the ycf15 gene (not duplicated in the IRs). Protein-coding genes made up 48.8 % (A. adenophora and C. indicum) to 52.1 % (A. frigida) of the genome, and the remainder consisted of tRNA genes, rRNA genes, introns, intergenic spacers, and pseudogenes.
In the cp genomes of higher plants, the rps12 gene is trans-spliced, with one of its exons located in the LSC region and the other duplicated in the IRs (Howe et al. 2003). In this study, all eight Asteraceae cp genomes were consistent with this feature. A total of 17 intron-containing genes were found in the Asteraceae cp genomes, including 11 protein-coding genes and six tRNA genes. Almost all of these were single-intron genes, with the exception of ycf3 and clpP, both of which had two introns in A. adenophora, G. abyssinica, H. annuus, J. vulgaris, and L. sativa. In addition, rpoC1 also had two introns in A. adenophora. In contrast to H. annuus, the C. × morifolium plastome contained 12 intron-containing protein-coding genes and five intron-containing tRNA genes, again giving a total of 17 intron-containing genes: it had an additional two-exon gene, ycf2, but lacked the intron-containing trnG-UCC. In addition, A. frigida and C. indicum each had an additional two-exon gene, ycf2 and rpl16, respectively.
Furthermore, apart from the 24 shared tRNA genes (seven of which were duplicated in the IRs), there were differences in the numbers, types, and relative positions of tRNA genes in the Asteraceae cp genomes (Online Resource 1_ESM_3). Apart from A. frigida and J. vulgaris, all species contained two trnS-GCU genes with minor differences in length. A. adenophora, A. frigida, C. indicum, C. × morifolium, G. abyssinica, and J. vulgaris contained two trnG-UCC genes, one of which had an intron, while there was no trnE-UUC in A. adenophora, C. indicum, C. × morifolium, and G. abyssinica. Furthermore, trnG-UUC did not exist between trnR-UCU and trnT-GGU in C. × morifolium.
Comparison of the Genome Structure in Asteraceae cp Genomes
Previous studies have demonstrated that Asteraceae cp genomes share a large 22.8 kb inversion and a smaller 3.3 kb inversion nested within the large one (Kim et al. 2005). In our study, both inversions were found in the plastid genomes of all eight Asteraceae species relative to N. tabacum (NC_001879) (Shinozaki et al. 1986) (Fig. 1). The two inversions may therefore be present in all Asteraceae species as a specific feature of Asteraceae cp genomes. The two inversions always appear together, suggesting that they arose during the same evolutionary period.
In addition, the cp genome of A. adenophora was compared with those of the remaining seven species by dot-plot analysis using the program Mulan (http://mulan.dcode.org/) (Ovcharenko et al. 2005). Except for A. frigida, there were no substantial rearrangements among the eight Asteraceae species, whose genomes were entirely collinear apart from numerous small deletions and insertions (Fig. 2 and Online Resource 1_ESM_4). The SSC region of A. frigida is inverted compared with the other Asteraceae cp genomes. The gene order in the SSC region was therefore analyzed further for these Asteraceae cp genomes (Fig. 3). The results indicated that, apart from A. frigida, the Asteraceae species had the same gene order in the SSC region. It began with the ycf1 pseudogene, followed by rps15, ndhH, ndhA, ndhI, ndhG, ndhE, psaC, ndhD, ccsA, trnL-UAG, and rpl32, and ended with ndhF, which was completely reversed compared with N. tabacum. In contrast, the gene order of A. frigida was the same as that of N. tabacum and ended with the ycf1 pseudogene, which extended into the IRa region. These results suggest that an inversion of the SSC region may have occurred in the Asteraceae family before the divergence of its species, and that the SSC region of the A. frigida lineage subsequently underwent a re-inversion, restoring the same gene order as in N. tabacum. It has been demonstrated that sequence rearrangements are generally the result of recombination events (Ogihara et al. 1988). The structural variation and sequence rearrangements of the Asteraceae cp genomes will provide a vital resource for molecular evolution and phylogenetic studies.
Furthermore, the exact borders between the IR regions and the two single-copy regions (LSC and SSC) in the eight Asteraceae cp genomes were compared to investigate contraction or expansion of the IR regions (Fig. 3). In all eight species, the border between IRb and LSC was located within the rps19 gene, which produced an rps19 pseudogene at the end of IRa whose length equaled the extent of the IRb expansion into rps19. The IRb of A. adenophora, C. indicum, C. × morifolium, G. abyssinica, and H. annuus extended approximately 100 bp into the rps19 gene, whereas A. frigida, J. vulgaris, and L. sativa had 60, 41, and 60 bp of the rps19 pseudogene at the end of IRa, respectively. In addition to the expansion into rps19, the IRb region also extended into the ycf1 gene in all species except A. frigida, in which the IRa region extended into the ycf1 gene because of the re-inversion event. This created a duplication of variable length (404–576 bp) of the ycf1 gene at the beginning of the IRa region (at the end of the IRb region in A. frigida). These incomplete duplications of the normal copies of rps19 and ycf1 resulted in loss of protein-coding ability. The trnH gene was located entirely within the LSC region, at various distances from the IRa/LSC border: C. indicum had the longest intergenic spacer among these species (26 bp), while H. annuus had only 2 bp. Apart from A. frigida, the ndhF gene of the Asteraceae cp genomes was located 0–209 bp upstream of the IRa/SSC border, while in A. frigida the ndhF gene was located 75 bp downstream of the IRb/SSC border. The contraction or expansion of the IR regions may result from intramolecular recombination between two short direct repeat sequences within the genes located at the borders between the IR regions and the two single-copy regions (Maier et al. 1995).
Sequence Divergence in Asteraceae cp Genomes and Marker Identification
The completed Asteraceae cp genomes offered the opportunity to analyze sequence variation within the family at the whole cp genome level. Regions with high sequence variation among the eight species were calculated and visualized using the mVISTA program (http://genome.lbl.gov/vista/mvista/) (Frazer et al. 2004). The results demonstrated that the non-coding regions were more divergent than the coding regions. The coding regions with the highest nucleotide divergence in the eight genomes were scattered across the whole genome and included ycf1, ndhK, rps16, rps3, rpl22, ccsA, matK, rpoC1, and accD (Fig. 4). Some intergenic regions with high sequence variation were also found.
To identify new regions that could be applied to Asteraceae phylogenetic analysis, eight intergenic regions with high sequence diversity, together with their combined sequence, were extracted from these genomes for phylogenetic analysis using the maximum parsimony (MP) method (Table 2 and Fig. 5). In each of the eight markers, more than 5 % of the characters were parsimony-informative. MP analysis of each region resulted in a single tree, with consistency indices ranging from 0.9020 to 1 and retention indices ranging from 0.8148 to 1. Analysis of the combined sequences of all eight regions generated a congruent topology with high support for four internal nodes. Seven regions (ccsA-trnL, psbI-trnS, rpl33-rps18, trnF-ndhJ, trnG-trnT, trnH-psbA, and trnT-trnL) yielded trees fully congruent with the relationships among the species. Among them, trnH-psbA has been frequently applied as a phylogenetic marker for the Asteraceae family (Doorduin et al. 2011), and psbI-trnS has also been identified by Nie et al. (2012). The remaining regions are newly identified and can be used to develop molecular markers for phylogenetic analysis in the Asteraceae family.
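The 5 % criterion rests on the standard definition of a parsimony-informative character: an alignment column in which at least two different states each occur in at least two sequences. The Python sketch below computes the fraction of such columns; it is an illustration only (the values in this study came from PAUP*), and treating gaps and ambiguous symbols as missing data is an assumption of this example.

```python
from collections import Counter


def parsimony_informative_fraction(alignment):
    """Fraction of alignment columns that are parsimony-informative.

    A column counts as informative when at least two different bases
    each occur in at least two sequences. Assumption: gaps and
    ambiguous symbols are treated as missing data and skipped.
    """
    rows = [row.upper() for row in alignment]
    if not rows or len({len(row) for row in rows}) != 1 or not rows[0]:
        raise ValueError("alignment must contain equal-length, non-empty rows")
    informative = 0
    for column in zip(*rows):
        counts = Counter(base for base in column if base in "ACGT")
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative += 1
    return informative / len(rows[0])
```

A region would pass the threshold used here when the returned fraction exceeds 0.05.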
Prediction of the RNA Editing Sites in the Asteraceae cp Genomes
RNA editing is one of the most important post-transcriptional processes in eukaryotic organisms; it can alter transcripts through nucleotide insertion, deletion, or substitution and thereby enrich the genetic information. Identification of RNA editing sites in chloroplasts will not only provide vital information on the proper function of plastid-encoded proteins but also reveal the evolutionary features of RNA editing (Tillich et al. 2006; Nie et al. 2014). To investigate RNA editing in Asteraceae plastids, we systematically analyzed and compared the RNA editing sites in the eight Asteraceae cp genomes using a computational approach (Table 3). A total of 373 editing sites were found in these cp genomes, with an average of 47 sites per species. Further analysis found that all editing sites were C-to-U conversions, which was consistent with previous observations in seed plant plastids (Tillich et al. 2006; Chen et al. 2011). Among them, 42 editing sites in 19 genes were identified in A. adenophora, 50 sites in 21 genes in A. frigida, 49 sites in 21 genes in each of C. indicum and C. × morifolium, 45 sites in 19 genes in G. abyssinica, 46 sites in 20 genes in H. annuus, 44 sites in 20 genes in J. vulgaris, and 48 sites in 19 genes in L. sativa. Of these, 11 sites in G. abyssinica and 6 sites in H. annuus were validated by EST alignment analysis. Furthermore, we compared the RNA editing patterns of these plastids, and 26 sites in 12 genes were found to be shared by the eight Asteraceae cp genomes. It has been documented that the number of shared editing sites increases in closely related taxa (Chen et al. 2011). In this study, A. frigida was found to share more editing sites with C. indicum and C. × morifolium than with other species, and A. adenophora shared more editing sites with G. abyssinica and H. annuus, suggesting that RNA editing is evolutionarily conserved.
Although the Asteraceae plastids appeared to have similar patterns of RNA editing, some species-specific editing sites were also found, such as the rpoC2-2 site (identified only in L. sativa), which suggested that some specific evolutionary features of RNA editing were present at the genus and subfamily levels in the Asteraceae family.
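The comparison of editing patterns amounts to set operations on the predicted sites of each species: sites present in every species are shared, and sites present in only one are species-specific. The Python sketch below shows this under the assumption that each site is keyed by a (gene, codon position) pair; the function name and any input data are hypothetical and do not reproduce Table 3.

```python
def shared_and_specific(sites_by_species):
    """Split predicted editing sites into shared and species-specific sets.

    sites_by_species maps a species name to an iterable of site keys,
    for example (gene, codon_position) tuples. Returns a pair:
    (sites found in every species, {species: sites found only there}).
    """
    site_sets = {name: set(sites) for name, sites in sites_by_species.items()}
    if not site_sets:
        raise ValueError("at least one species is required")
    shared = set.intersection(*site_sets.values())
    specific = {}
    for name, sites in site_sets.items():
        # Union of the sites of all other species.
        others = set().union(
            *(s for other, s in site_sets.items() if other != name)
        )
        specific[name] = sites - others
    return shared, specific
```

Counting the pairwise intersections of the same sets would give the numbers of sites shared between any two species, as discussed for the closely related taxa above.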
Phylogenetic Analysis
First, 81 genes were extracted from the cp genomes of A. adenophora, A. frigida, C. indicum, C. × morifolium, G. abyssinica, and J. vulgaris and then uploaded to the MSWAT database for sequence alignment. After gaps were removed, a total of 62,531 characters remained in the final dataset. MP analysis generated a single tree with a length of 169,196, a consistency index of 0.4081, and a retention index of 0.6023 (Fig. 6). Bootstrap analysis indicated that 54 of 67 nodes were supported by values ≥95 %, and 45 of these had bootstrap values of 100 %. ML analysis of the dataset produced a phylogenetic topology similar to that of the MP tree (Online Resource 1_ESM_5). A. frigida, C. indicum, and C. × morifolium fell within the tribe Anthemideae in the subfamily Asteroideae. A. adenophora, G. abyssinica, and H. annuus clustered within the Heliantheae alliance of Asteroideae. Of the remaining two species, J. vulgaris was grouped into the tribe Senecioneae of Asteroideae, and L. sativa was placed in the tribe Cichorieae in the subfamily Cichorioideae. The phylogeny obtained from the molecular data is comparable to the taxonomy based on phenotypic characteristics. The eight Asteraceae species clustered within the Asterales and were placed within the euasterids II, which supports the monophyly of the Asteraceae.
References
Barker MS, Kane NC, Matvienko M, Kozik A, Michelmore RW, Knapp SJ, Rieseberg LH (2008) Multiple paleopolyploidizations during the evolution of the Compositae reveal parallel patterns of duplicate gene retention after millions of years. Mol Biol Evol 25:2445–2455. doi:10.1093/molbev/msn187
Bock DG, Kane NC, Ebert DP, Rieseberg LH (2014) Genome skimming reveals the origin of the Jerusalem artichoke tuber crop species: neither from Jerusalem nor an artichoke. New Phytol 201:1021–1030
Bremer K (1994) Asteraceae: cladistics and classification. Timber Press, Portland
Carlquist S (1976) Tribal interrelationships and phylogeny of the Asteraceae. Aliso 8:465–492
Chen H, Deng L, Jiang Y, Lu P, Yu J (2011) RNA editing sites exist in protein-coding genes in the chloroplast genome of Cycas taitungensis. J Integr Plant Biol 53:961–970. doi:10.1111/j.1744-7909.2011.01082.x
Chumley TW, Palmer JD, Mower JP, Fourcade HM, Calie PJ, Boore JL, Jansen RK (2006) The complete chloroplast genome sequence of Pelargonium × hortorum: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol Biol Evol 23:2175–2190. doi:10.1093/molbev/msl089
Chung HJ, Jung JD, Park HW, Kim JH, Cha HW, Min SR, Jeong WJ, Liu JR (2006) The complete chloroplast genome sequences of Solanum tuberosum and comparative analysis with Solanaceae species identified the presence of a 241-bp deletion in cultivated potato chloroplast DNA sequence. Plant Cell Rep 25:1369–1379. doi:10.1007/s00299-006-0196-4
Darling ACE, Mau B, Blattner FR, Perna NT (2004) Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res 14:1394–1403. doi:10.1101/gr.2289704
Dempewolf H, Kane NC, Ostevik KL et al (2010) Establishing genomic tools and resources for Guizotia abyssinica (L.f.) Cass.—the development of a library of expressed sequence tags, microsatellite loci, and the sequencing of its chloroplast genome. Mol Ecol Resour 10:1048–1058. doi:10.1111/j.1755-0998.2010.02859.x
Doorduin L, Gravendeel B, Lammers Y, Ariyurek Y, Chin-A-Woeng T, Vrieling K (2011) The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies. DNA Res 18:93–105. doi:10.1093/dnares/dsr002
Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I (2004) VISTA: computational tools for comparative genomics. Nucleic Acids Res 32:W273–W279. doi:10.1093/nar/gkh458
Funk VA, Bayer RJ, Keeley S et al (2005) Everywhere but Antarctica: using a supertree to understand the diversity and distribution of the Compositae. In: Friis I, Balslev H (eds) Plant diversity and complexity patterns: local, regional and global dimensions, the Royal Danish Academy of Sciences and Letters in Copenhagen. Denmark, Kgl. Danske Videnskabernes Selskab, pp 343–373
Ghimiray D, Sharma BC (2014) Comparative and bioinformatics analyses of the Solanaceae chloroplast genomes: plastome organization is more or less conserved at family level. J App Biol Biotech 3:021–026. doi:10.7324/JABB.2014.2305
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp 41:95–98
Howe CJ, Barbrook AC, Koumandou VL, Nisbet RER, Symington HA, Wightman TF (2003) Evolution of the chloroplast genome. Philos Trans R Soc Lond B Biol Sci 358:99–107. doi:10.1098/rstb.2002.1176
Jansen RK, Cai ZQ, Raubeson LA et al (2007) Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci 104:19369–19374. doi:10.1073/pnas.0709121104
Kim K-J, Choi K-S, Jansen RK (2005) Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). Mol Biol Evol 22:1783–1792. doi:10.1093/molbev/msi174
Liu Y, Huo N, Dong L et al (2013) Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants. PLoS ONE 8:e57533. doi:10.1371/journal.pone.0057533
Lundberg J, Bremer K (2003) A phylogenetic study of the order Asterales using one morphological and three molecular data sets. Int J Plant Sci 164:553–578. doi:10.1086/374829
Maier RM, Neckermann K, Igloi GL, Kössel H (1995) Complete sequence of the maize chloroplast genome: gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J Mol Biol 251:614–628. doi:10.1006/jmbi.1995.0460
Miller MA, Pfeiffer W, Schwartz T (2010) Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, LA, pp 1-8. doi:10.1109/GCE.2010.5676129
Mower JP (2009) The PREP suite: predictive RNA editors for plant mitochondrial genes, chloroplast genes and user-defined alignments. Nucleic Acids Res 37:W253–W259. doi:10.1093/nar/gkp337
Nie X, Lv SZ, Zhang YX et al (2012) Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora). PLoS ONE 7:e36869. doi:10.1371/journal.pone.0036869
Nie XJ, Deng PC, Feng KW et al (2014) Comparative analysis of codon usage patterns in chloroplast genomes of the Asteraceae family. Plant Mol Biol Report 32:828–840
Ogihara Y, Terachi T, Sasakuma T (1988) Intramolecular recombination of chloroplast genome mediated by short direct-repeat sequences in wheat species. Proc Natl Acad Sci 85:8573–8577
Ovcharenko I, Loots GG, Giardine BM, Hou M, Ma J, Hardison RC, Stubbs L, Miller W (2005) Mulan: multiple-sequence local alignment and visualization for studying function and evolution. Genome Res 15:184–194. doi:10.1101/gr.3007205
Palmer JD (1991) Plastid chromosomes: structure and evolution. In: Hermann RG (ed) The molecular biology of plastids, vol 7A, Cell culture and somatic cell genetics of plants. Springer, Vienna, pp 5–53
Raubeson LA, Jansen RK (2005) Chloroplast genomes of plants. In: Henry RJ (ed) Plant diversity and evolution: genotypic and phenotypic variation in higher plants. CABI Publishing, Wallingford, pp 45–68
Rivas JDL, Lozano JJ, Ortiz AR (2002) Comparative analysis of chloroplast genomes: functional annotation, genome-based phylogeny, and deduced evolutionary patterns. Genome Res 12:567–583. doi:10.1101/gr.209402
Shinozaki K, Ohme M, Tanaka M et al (1986) The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J 5:2043–2049
Swofford DL (2002) PAUP* Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4. Sinauer Associates, Sunderland, Massachusetts
Tillich M, Lehwark P, Morton BR, Maier UG (2006) The evolution of chloroplast RNA editing. Mol Biol Evol 23:1912–1921
Timme RE, Kuehl JV, Boore JL, Jansen RK (2007) A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: identification of divergent regions and categorization of shared repeats. Am J Bot 94:302–312. doi:10.3732/ajb.94.3.302
Wu CS, Lin CP, Hsu CY, Wang RJ, Chaw SM (2011) Comparative chloroplast genomes of Pinaceae: insights into the mechanism of diversified genomic organizations. Genome Biol Evol 3:309–319
Acknowledgments
This research was mainly funded by the National Basic Research Program of China (973 Program) (Grant No. 2009CB119200) and the National Natural Science Foundation of China (Grant No. 31471825), and was partially supported by the 948 Program (Grant No. 2010-S1) of the Ministry of Agriculture of China and by the Open Project Program (Grant No. SKLOF201314) of the State Key Laboratory for Biology of Plant Diseases and Insect Pests.
Additional information
Mengxing Wang and Licao Cui contributed equally to this work.
Electronic Supplementary Material
ESM 1
(DOC 192 kb)
ESM_4
Dot-plot comparison showing conserved and inverted regions found in two Chrysanthemum species, Ageratina, Guizotia, Helianthus, Jacobaea and Lactuca cp genomes (JPEG 941 kb)
ESM_5
Phylogenetic tree reconstruction of 70 taxa using maximum likelihood (ML) based on concatenated sequence from 81 cp genes. The position of the Asteraceae family is indicated by a red box (JPEG 1697 kb)
Wang, M., Cui, L., Feng, K. et al. Comparative Analysis of Asteraceae Chloroplast Genomes: Structural Organization, RNA Editing and Evolution. Plant Mol Biol Rep 33, 1526–1538 (2015). https://doi.org/10.1007/s11105-015-0853-2
DOI: https://doi.org/10.1007/s11105-015-0853-2