Abstract
Background
Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide, but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals, has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereal biology.
Findings
The chloroplast genome of Brachypodium distachyon was sequenced by a combined approach using BAC end sequences and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and that this deletion must have occurred independently in both species.
Conclusion
We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.
Findings
Plastids are key organelles of green plants, carrying out functions like photosynthesis, starch storage, nitrogen and sulfate metabolism, and synthesis of chlorophyll, carotenoids, fatty acids and nucleic acids [1]. Plastids have multiple copies of a circular, double-stranded DNA chromosome, each with a set of approximately 110 genes highly conserved in sequence and organization [2].
In addition to their important biological roles, plastids have the potential to make a substantial impact on biotechnology. Plastid transformation, achieved via homologous recombination, has two main advantages over nuclear genome transformation: it can generate high levels of gene expression, and the recombinant DNA is more easily contained because chloroplasts are maternally inherited in most species of angiosperms [3].
The family Poaceae, with approximately 10,000 species, contains the world's most important crops. The tribe Triticeae, of subfamily Pooideae, includes species grown in temperate regions, some of which are of great economic importance, such as wheat, rye, triticale, and barley. Despite their contribution to human food supply, members of the Triticeae are not easily amenable to functional genomics aimed at crop improvement because of their large genome size and difficulty in transformation.
Brachypodium distachyon, a small grass in the Pooideae, has recently emerged as a new model species for functional genomics of temperate grasses. Brachypodium offers many advantages as a model grass; among them, its reduced stature, short life cycle, and small genome [4].
In the last few years a considerable effort has been made to develop genetic and molecular tools for Brachypodium, including ESTs [5], Bacterial Artificial Chromosome (BAC) libraries [6], cytological characterization of accessions [7–9], and techniques to perform rapid and efficient transformation [10, 11]. Finally, sequencing of the Brachypodium distachyon genotype Bd21 has been initiated by the DOE Joint Genome Institute and will soon be available to the public.
Here we report the sequencing of the chloroplast genome of the Bd21 genotype of Brachypodium, and perform a sequence analysis and phylogeny reconstruction with the completely sequenced chloroplast genomes from seven grass species. We compare the evolutionary dynamics of Brachypodium chloroplast genes with those of wheat, rice and maize, and discuss the significance of some indels in the framework of grass evolution.
Sequencing of the Brachypodium chloroplast genome
Sequencing of plastid genomes is usually done by isolation of chloroplasts followed by purification and amplification of plastid DNA for library construction. To sequence the chloroplast genome of Brachypodium distachyon, we took advantage of existing BAC libraries [12] and identified several chloroplast BACs from a database of BAC end sequences (BES). In our analysis, 1,725 BES matched wheat chloroplast queries. A clone generated by a single restriction cut of the circular chloroplast genome should contain the entire genome, and its two BES should assemble in the same region in opposite orientations. The two BES from BAC DH037I03 matched the sequence of the wheat psbC gene back-to-back (Fig. 1C). Overall, we identified over 30 BACs harboring the complete chloroplast genome, suggesting that this strategy is efficient in identifying full-length chloroplast genomes from genomic BAC libraries.
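The back-to-back criterion described above can be illustrated with a minimal Python sketch. It assumes each BES has already been aligned to a reference plastome and reduced to a start, an end, and a strand; the `Hit` record, the `is_back_to_back` helper, the tolerance, and the coordinates are hypothetical and are not taken from the authors' search algorithm.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    """Alignment of one BAC end sequence to the reference plastome (hypothetical record)."""
    start: int   # leftmost reference coordinate, 1-based
    end: int     # rightmost reference coordinate
    strand: str  # '+' or '-'

def is_back_to_back(a: Hit, b: Hit, tol: int = 10) -> bool:
    """True if two end reads abut at one site and point away from each other.

    A '+' read extends rightwards from its start and a '-' read extends
    leftwards from its end, so reads flanking a single cut site satisfy
    plus.start ~= minus.end + 1 (within an assumed tolerance `tol`).
    """
    if {a.strand, b.strand} != {"+", "-"}:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    return abs(plus.start - (minus.end + 1)) <= tol

# Hypothetical clones: the first pair flanks a single site, the second
# pair marks the two ends of an ordinary sub-genomic insert.
full_length = is_back_to_back(Hit(5001, 5600, "+"), Hit(4400, 5000, "-"))
partial = is_back_to_back(Hit(1000, 1600, "+"), Hit(40000, 40600, "-"))
```

Under these assumptions, only the first pair is flagged as a candidate full-length chloroplast clone; a real screen would also have to handle reads that span the origin of the linearized reference.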
As expected, the chloroplast sequence assembled using the BES contained many gaps due to the distance between restriction sites (Fig. 1). To complete the Brachypodium chloroplast genome, a shotgun sequencing library of DH037I03 was constructed. The complete genome sequence was assembled using 1,725 BES, 410 sequences from the shotgun library, and 264 gap-filling sequences generated by primer walking. The sequence coverage of the entire chloroplast genome is 8.9×.
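As a rough consistency check on the reported coverage, average depth can be taken as total sequenced bases divided by genome length. Read lengths are not given in the text, so the sketch below back-solves the mean read length implied by the reported figures; it uses the genome length reported in the next section and treats all reads as contributing equally, which is an assumption.

```python
# Read counts reported in the text
n_bes, n_shotgun, n_gap_fill = 1725, 410, 264
n_reads = n_bes + n_shotgun + n_gap_fill

genome_bp = 135_197   # chloroplast genome length reported in the text
coverage = 8.9        # reported average sequence coverage

# depth = total_bases / genome_length, so the implied mean read length is:
implied_mean_read_bp = coverage * genome_bp / n_reads
print(f"{n_reads} reads, implied mean read length ~{implied_mean_read_bp:.0f} bp")
```

The implied mean of roughly 500 bp per read is plausible for Sanger-era reads, which suggests the read counts and the coverage figure are mutually consistent.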
Genome organization of Brachypodium chloroplast
The chloroplast genome of Brachypodium distachyon is 135,197 bp in length. The Inverted Repeats (IR) are 21,540 bp in length each, and the Large Single Copy (LSC) and Small Single Copy (SSC) regions are 79,446 bp and 12,668 bp long respectively. The Brachypodium chloroplast genome contains 118 unique genes, 18 of which are duplicated in the IRs, making a total of 136 genes of known function. In addition, there are 9 predicted open reading frames (ORFs) and 3 tRNA pseudogenes. With a few exceptions discussed below, the gene number and order are identical to other grass chloroplast genomes (Fig. 2).
Grass chloroplast phylogeny based on complete chloroplast genomes
In a landmark article that included data from multiple sources, the Grass Phylogeny Working Group [13] examined relationships among grasses using a large and diverse assemblage of species. That study highlighted the existence of two major lineages, the BEP clade and the PACCAD clade, that together encompass the majority of grasses. The BEP clade includes the subfamilies Bambusoideae, Ehrhartoideae, and Pooideae. Rice belongs to subfamily Ehrhartoideae while wheat, barley, bentgrass, and Brachypodium are in the Pooideae. The PACCAD clade includes several subfamilies, among them the Panicoideae, a large group of mainly tropical and subtropical species, some of which are important crops worldwide, like maize, sugarcane, and sorghum.
So far, all phylogeny reconstructions of the Poaceae have used selected genes or partial regions as data. However, with sequenced chloroplast genomes of several species in this family and the computing power to align them, it is possible for the first time to perform whole chloroplast genome phylogenetic analyses. To examine whether the genome-wide phylogenetic analysis is consistent with those based on selected genes, we employed Bayesian [14] and Maximum Parsimony [15] methods to reconstruct a grass phylogeny using whole chloroplast sequences. Both Bayesian and Maximum Parsimony estimates produced the same topology with maximum node support (Fig. 3). The topology shown in Fig. 3 contained 99% of the Bayesian credible trees, and the tree is in agreement with the results obtained with a larger group of species [13]. The phylogram also shows that branches in the BEP clade are much longer than those in the PACCAD clade. A similar result was found by Saski et al. [16] in a phylogenetic study using 61 protein-coding genes, indicating that the rates of evolution are higher in the BEP clade compared to the PACCAD species sampled here. However, it is possible that these slower rates do not extend to other species of the PACCAD clade, since maize, sorghum, and sugarcane are closely related, with all three belonging to subfamily Panicoideae.
Evolution of Brachypodium chloroplast genes
For a given protein-coding gene, the ratio of substitutions that cause a change in the amino acid sequence (nonsynonymous) to those that do not (synonymous) is a commonly used estimator of the evolutionary dynamics operating on that gene [15]. To find out whether Brachypodium plastid genes show the same evolutionary dynamics as those of other grasses, we calculated the ratio of nonsynonymous to synonymous substitution rates for Brachypodium chloroplast genes using tobacco as an outgroup.
We found that the nonsynonymous/synonymous ratios for Brachypodium chloroplast genes are similar to those of rice, maize and wheat, with photosynthetic genes having the lowest ratio (Table 1), in agreement with previous findings [17]. Within the NADH class, ndhB and rps12 have very low rates of both kinds of substitutions compared to other genes in the same class, a result explained by their position in the IRs and most likely due to the dynamics of IR evolution rather than to evolutionary constraints on ndhB and rps12.
The rate of evolution of a particular gene, i.e., the estimated number of substitutions per site, can vary among different organisms for reasons such as rapid gene duplication that creates opportunity for sequence divergence, different generation times, and various DNA repair mechanisms [15]. We conducted a relative rate test [18] for all Brachypodium chloroplast genes with known function against their orthologs in maize, wheat, and rice and found that most Brachypodium genes evolve at rates similar to those of wheat, rice, and maize. However, there are unequal rates of evolution (at P = 0.05) in 15 genes and 17 cases of species comparisons, and Brachypodium genes evolved at a faster rate in 14 out of those 17 comparisons (Table 2).
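To make the two kinds of substitution concrete, the following Python sketch estimates nonsynonymous (dN) and synonymous (dS) distances between two aligned coding sequences by simple site and difference counting with a Jukes-Cantor correction, in the spirit of the counting method of Nei and Gojobori listed in the references. It is a simplified illustration and not the authors' procedure: codons differing at more than one position are skipped, the standard genetic code is assumed, and the function names and toy sequences are hypothetical.

```python
from itertools import product
from math import log

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def syn_sites(codon):
    """Synonymous sites in a codon: per position, the fraction of the
    three possible base changes that leave the amino acid unchanged."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += (CODE[mutant] == CODE[codon]) / 3
    return total

def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits at a site."""
    return -0.75 * log(1 - 4 * p / 3)

def dn_ds(seq1, seq2):
    """Return (dN, dS) for two aligned, gap-free coding sequences."""
    S = N = syn_diff = nonsyn_diff = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2  # average over both codons
        S += s
        N += 3 - s
        n_diff = sum(a != b for a, b in zip(c1, c2))
        if n_diff == 1:  # simplification: multi-hit codons are skipped
            if CODE[c1] == CODE[c2]:
                syn_diff += 1
            else:
                nonsyn_diff += 1
    return jukes_cantor(nonsyn_diff / N), jukes_cantor(syn_diff / S)

# Toy sequences (hypothetical): one synonymous change, TTT -> TTC (both Phe)
d_n, d_s = dn_ds("ATG" + "GGG" * 5 + "TTT", "ATG" + "GGG" * 5 + "TTC")
```

In this toy case dN is zero and dS is positive, so the dN/dS ratio is zero, the pattern expected for a gene under strong purifying selection. The ratio is undefined when dS is zero, which is one reason very slowly evolving genes need careful handling.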
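The relative rate test cited above [18] compares two ingroup sequences against an outgroup: under equal rates, the number of sites unique to one ingroup sequence should equal the number unique to the other. A minimal Python sketch of that comparison follows, assuming three pre-aligned sequences and a chi-square statistic with one degree of freedom; the function name and the toy data are hypothetical, and the outgroup used by the authors for this test is not stated in this section.

```python
from math import erfc, sqrt

def tajima_relative_rate(seq1, seq2, outgroup):
    """Return (m1, m2, chi2, p) for a three-sequence relative rate test.

    m1 counts sites where seq1 alone differs (seq2 == outgroup),
    m2 counts sites where seq2 alone differs (seq1 == outgroup).
    Columns with gaps, and sites where all three differ, are ignored.
    """
    m1 = m2 = 0
    for a, b, o in zip(seq1, seq2, outgroup):
        if "-" in (a, b, o):
            continue
        if a != b and b == o:
            m1 += 1
        elif a != b and a == o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square variable with 1 degree of freedom
    p = erfc(sqrt(chi2 / 2))
    return m1, m2, chi2, p

# Toy data (hypothetical): the first sequence carries 8 unique changes
out = "A" * 20
fast = "C" * 8 + "A" * 12
slow = "A" * 20
m1, m2, chi2, p = tajima_relative_rate(fast, slow, out)
```

With these toy sequences the statistic is 8.0, which exceeds the 3.84 threshold for P = 0.05, so equal rates would be rejected in favor of faster evolution in the first sequence.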
Sequence comparison among grass chloroplast genomes
The structure and gene number of the chloroplast genome are very similar among land plants, although the Poaceae have three large inversions compared to the canonical plastid genome usually represented by the tobacco chloroplast genome [19]. This conservation of overall structure in the chloroplast genomes of grasses allowed us to align the chloroplast genome sequences of eight grass species at the genome-wide level.
Comparison of the sequences of eight chloroplast genomes (only rice, Brachypodium, wheat, and maize are represented in Fig. 2) reveals several regions of high sequence length polymorphism, as well as shared deletions and insertions. The IRs show lower sequence divergence among grasses than the single-copy regions (Fig. 1), a result previously reported by other authors [20]. The region between rbcL and psaI (at position ~54 kb, Fig. 2) is one of the most polymorphic chloroplast loci in grasses. In rice, this region is 1,532 bp long and contains ORF133 and the accD gene, but it is much shorter in other grasses. In Brachypodium, both ORF133 and accD are missing, and the entire rbcL-psaI spacer region, containing only the rbcL 3'UTR and psaI promoter sequences, is reduced to 296 bp.
As expected from its phylogenetic placement, Brachypodium shares several indels with barley, wheat, and bentgrass, all of which are in subfamily Pooideae, including a 410 bp deletion in ORF70 (~14.5 kb, Fig. 2) and the duplication of a 5' portion of ndhH IRb (~102 kb in Fig. 2) that is also shared with rice [16, 21]. The size of this duplication is variable, ranging from 238 bp in rice to 311 bp in Brachypodium. Insertions in rpoC2 (~25 kb, Fig. 2) have been described and used previously in phylogenetic analyses [13, and references therein] and will not be discussed here.
Rice and wheat have identical and independently derived deletions
Despite the overall sequence conservation of IRs, the region between ndhB and trnI (~84 kb and ~131 kb in Fig. 2) appears to be a hot spot for large indels. Previously, Ogihara et al. [21] described a 2,131 bp deletion in wheat and rice with respect to maize. This deletion is located between ORF249 and ORF28 (~84 kb and ~131 kb, Fig. 2). Because rice is more closely related to wheat than to maize, the authors concluded that the deletion was present in the common ancestor of rice and wheat. However, this deletion is present only in rice and wheat, which are not sister species (Fig. 3), whereas in Brachypodium, barley, and bentgrass there is a smaller deletion of about 1,141 bp (Fig. 4).
To confirm that the 2,131-bp deletion in rice and wheat was not an artifact of the alignment or of missing sequence, we took the Brachypodium sequence that is missing in wheat and rice and used it as a BLAST query against grass sequence databases. We recovered sequences from many grasses except wheat and rice, confirming the presence of the deletion in their genomes. In addition, we searched the GenBank angiosperm databases with the maize sequence corresponding to the deleted wheat and rice region and found that the region is present in species representing diverse lineages of flowering plants, including the monocot Dioscorea, the early-diverging angiosperms Amborella and Nymphaea, and several core eudicots (data not shown). Therefore, we concluded that the 2,131-bp deletions in the wheat and rice chloroplast genomes are derived characters that arose independently in those species.
The 2,131-bp deletions in rice and wheat are identical in both IRs, and the sequences bordering them align unambiguously with those of other grasses (Fig. 4). In addition, the lack of direct short repeats in the bordering sequences indicates that the deletions did not arise by recombination between short repeats. Thus, although deletions of varying lengths in the ndhB-trnI region seem to be common in the BEP clade, the mechanism underlying these specific deletions remains unclear. In tobacco, nucleotide mutations in plastid coding sequences are quickly eliminated by gene conversion, a process facilitated by the polyploid nature of the plastid genome [22]. Whatever the mechanism that generates deletions in the ndhB-trnI region in species of the BEP clade, their multiple occurrences suggest that they may provide a selective advantage that allows them to overcome gene conversion and become fixed in the population.
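Shared indels of the kind described above can be located by scanning each row of a multiple alignment for gap runs and grouping taxa that have identical gap boundaries. The Python sketch below illustrates this on a toy alignment; the helper names, the minimum length, and the alignment itself (including the relative positions of the two deletions) are hypothetical and only mimic the pattern reported in the text.

```python
import re
from collections import defaultdict

def gap_runs(aligned_seq, min_len=1):
    """Return (start, end) column ranges of gap runs at least min_len long."""
    return [(m.start(), m.end())
            for m in re.finditer(r"-+", aligned_seq)
            if m.end() - m.start() >= min_len]

def taxa_by_gap(alignment, min_len=1):
    """Group taxa by gap runs with identical alignment coordinates."""
    shared = defaultdict(list)
    for taxon, seq in alignment.items():
        for run in gap_runs(seq, min_len):
            shared[run].append(taxon)
    return dict(shared)

# Toy alignment (hypothetical): rice and wheat share one long gap,
# Brachypodium has a shorter one, maize has none.
toy = {
    "maize":        "ACGTACGTACGTACGT",
    "rice":         "ACGT--------ACGT",
    "wheat":        "ACGT--------ACGT",
    "Brachypodium": "ACGT----ACGTACGT",
}
groups = taxa_by_gap(toy, min_len=4)
```

Identical boundaries show only that two gaps coincide in the alignment. Whether they reflect a single ancestral event or independent events has to be judged against the phylogeny.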
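The inference of independent origin can be illustrated by scoring the deletion as a binary character on a tree with Fitch parsimony, which counts the minimum number of state changes a topology requires. The sketch below is a minimal Python illustration; the nested-tuple topology is an assumption chosen to be consistent with the subfamily memberships given in the text, not a transcription of Fig. 3, and the function name is hypothetical.

```python
def fitch(tree, states):
    """Return (possible_states, min_changes) for one character.

    `tree` is a binary tree written as nested 2-tuples of taxon names;
    `states` maps each taxon name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # No shared state: one extra change is needed at this node
    return left_set | right_set, left_changes + right_changes + 1

# Assumed topology for illustration only
pooideae = ("Brachypodium", ("bentgrass", ("wheat", "barley")))
bep = ("rice", pooideae)
panicoideae = ("maize", ("sorghum", "sugarcane"))
tree = (bep, panicoideae)

# 1 = 2,131-bp deletion present, 0 = absent, as reported in the text
deletion = {"rice": 1, "wheat": 1, "Brachypodium": 0, "bentgrass": 0,
            "barley": 0, "maize": 0, "sorghum": 0, "sugarcane": 0}
_, changes = fitch(tree, deletion)
```

On this assumed topology the character needs at least two changes, which matches the reading that the deletion is homoplastic rather than inherited from a common ancestor of rice and wheat.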
References
Staehelin LA, Newcomb EH: Membrane structures and membranous organelles. Biochemistry and Molecular Biology of Plants. Edited by: Buchanan BB, Gruissem W, Jones RL. 2000, Rockville, MD: American Society of Plant Biologists, 37-45.
Palmer JD: Plastid chromosomes: structure and evolution. The molecular biology of plastids Cell culture and somatic cell genetics of plants. Edited by: Hermann RG. 1991, Vienna: Springer, 7A: 5-53.
Bock R: Plastid biotechnology: prospects for herbicide and insect resistance, metabolic engineering and molecular farming. Current Opinion in Biotechnology. 2007, 18: 100-106. 10.1016/j.copbio.2006.12.001.
Garvin DF, Gu YQ, Hasterok R, Hazen SP, Jenkins G, Mockler TC, Mur LAJ, Vogel J: Development of genetic and genomic research resources for Brachypodium distachyon, a new model system for grass crop research. The Plant Genome [A Supplement to Crop Science]. 2008, 1: S-69-84.
Vogel J, Gu YQ, Twigg P, Lazo G, Laudencia-Chingcuanco D, Hayden DM, Donze TJ, Vivian LA, Stamova B, Coleman-Derr D: EST sequencing and phylogenetic analysis of the model grass Brachypodium distachyon. Theoretical and Applied Genetics. 2006, 113 (2): 186-195. 10.1007/s00122-006-0285-3.
Huo N, Gu YQ, Lazo G, Vogel J, Coleman-Derr D, Luo M-C, Thilmony R, Garvin DF, Anderson OD: Construction and characterization of two BAC libraries from Brachypodium distachyon, a new model for grass genomics. Genome. 2006, 49: 1099-1108. 10.1139/G06-087.
Hasterok R, Draper J, Jenkins G: Laying the cytotaxonomic foundations of a new model grass, Brachypodium distachyon (L.) Beauv. Chromosome Research. 2004, 12: 397-403. 10.1023/B:CHRO.0000034130.35983.99.
Jenkins G, Hasterok R: BAC 'landing' on chromosomes of Brachypodium distachyon for comparative genome alignment. Nature Protocols. 2007, 2: 88-98. 10.1038/nprot.2006.490.
Hasterok R, Marasek A, Donnison IS, Armstead I, Thomas A, King IP, Wolny E, Idziak D, Draper J, Jenkins G: Alignment of the genomes of Brachypodium distachyon and temperate cereals and grasses using bacterial artificial chromosome landing with fluorescence in situ hybridization. Genetics. 2006, 173: 349-362. 10.1534/genetics.105.049726.
Vogel J, Garvin DF, Leong O, Hayden DM: Agrobacterium-mediated transformation and inbred line development in the model grass Brachypodium distachyon. Plant Cell, Tissue and Organ Culture. 2005, 84: 199-211.
Christiansen P, Andersen CH, Didion T, Folling M, Nielsen KK: A rapid and efficient transformation protocol for the grass Brachypodium distachyon. Plant Cell Reports. 2005, 23: 751-758. 10.1007/s00299-004-0889-5.
Huo N, Lazo G, Vogel J, You FM, Ma Y, Hayden DM, Coleman-Derr D, Hill TA, Dvorak J, Anderson OD, et al: The nuclear genome of Brachypodium distachyon: analysis of BAC end sequences. Functional and Integrative Genomics. 2007, electronic version.
GPWG: Phylogeny and subfamilial classification of the grasses (Poaceae). Annals of the Missouri Botanical Garden. 2001, 88 (3): 373-457. 10.2307/3298585.
Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogeny. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
Nei M, Kumar S: Molecular Evolution and Phylogenetics. 2000, Oxford: Oxford University Press
Saski C, Lee S-B, Fjellheim S, Guda C, Jansen R, Luo H, Tomkins J, Rognli OA, Daniell H, Clarke JL: Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor, and Agrostis stolonifera, and comparative analyses with other grass genomes. Theoretical and Applied Genetics. 2007, 112 (8): 1503-1518.
Matsuoka Y, Yamazaki Y, Ogihara Y, Tsunewaki K: Whole chloroplast genome comparison of rice, maize, and wheat: implications for chloroplast gene diversification and phylogeny of cereals. Molecular Biology and Evolution. 2002, 19 (12): 2084-2091.
Tajima F: Simple methods for testing the evolutionary clock hypothesis. Genetics. 1993, 135: 599-607.
Doyle JJ, Davis JI, Soreng RJ, Garvin DF, Anderson MJ: Chloroplast DNA inversions and the origin of the grass family (Poaceae). Proceedings of the National Academy of Sciences. 1992, 89: 7722-7726. 10.1073/pnas.89.16.7722.
Yamane K, Yano K, Kawahara T: Pattern and rate of indel evolution inferred from whole chloroplast intergenic regions in sugarcane, maize, and rice. DNA Research. 2006, 13: 197-204. 10.1093/dnares/dsl012.
Ogihara Y, Isono K, Kojima T, Endo A, Hanaoka M, Shiina T, Terachi T, Utsugi S, Murata M, Mori N, et al: Structural features of a wheat plastome as revealed by complete sequencing of chloroplast DNA. Molecular Genetics and Genomics. 2002, 266: 740-746. 10.1007/s00438-001-0606-9.
Khakhlova O, Bock R: Elimination of deleterious mutations in plastid genomes by gene conversion. The Plant Journal. 2006, 46: 85-94. 10.1111/j.1365-313X.2006.02673.x.
Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, Pachter LS, Dubchak I: VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics. 2000, 16: 1046. 10.1093/bioinformatics/16.11.1046.
Swofford DL: PAUP*. Phylogenetic analysis using parsimony (*and other methods), version 4. 2003, Sunderland, Massachusetts, USA: Sinauer
Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution. 1986, 3 (5): 418-426.
Acknowledgements
We thank Naxin Huo for her help with BAC end sequencing. This work was supported in part by the United States Department of Agriculture, Agricultural Research Service CRIS projects 532502100-010 and 532502100-011.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
EB did the sequence alignment and comparison, the phylogenetic analyses, the relative rate tests, and drafted the manuscript. DC-D performed the BAC end sequence searches, BAC shotgun library construction and sequencing, sequence assembly, and substitution rate analyses. GRL wrote the algorithm to search BES with wheat queries and assembled BES on the genome. YQG designed and coordinated the study. Both OA and YQG supervised the work and collaborated in the manuscript preparation. All authors have read and approved the final version of the manuscript.
Rights and permissions
Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Bortiri, E., Coleman-Derr, D., Lazo, G.R. et al. The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes. BMC Res Notes 1, 61 (2008). https://doi.org/10.1186/1756-0500-1-61
DOI: https://doi.org/10.1186/1756-0500-1-61