12.1 Introduction

The genome is an organism's complete set of genetic material. Hans Winkler coined the term genome in 1920; the term genomics, the study of the genome, was coined by Tom Roderick in 1986 and is a far more recent idea than genetics and genes. Genetics entails the study of genetic information, its transmission from generation to generation in the form of DNA, and the analysis of a restricted number of genes. In the pre-genomic era, the mechanism of information transfer to new cells was not understood precisely at the functional level (Goldman and Landweber 2016; Hjort et al. 2010); it has been explored over the last few decades thanks to advances in molecular techniques. The human nuclear genome comprises 3.2 billion base pairs and 35,000 genes (Schneider and Grosschedl 2007). DNA is packaged into chromosomes in a highly systematic manner that is regulated at various levels. Inside the nucleus, the genome is organized into highly compact nucleosome structures made up of nuclear DNA, histones, and other proteins. The numerous highly repetitive and noncoding sequences are involved in regulatory mechanisms. Apart from the nuclear genome, the extra-nuclear genome, which includes the mitochondrial and chloroplast genomes, plays essential roles: in mitochondria, the initiation of apoptosis, aging, large-scale ATP production, amino acid biosynthesis, steroid biosynthesis, and β-oxidation of lipids; in the chloroplast, photosynthesis and other important biochemical reactions.

According to the endosymbiont theory, these organelles evolved separately through endosymbiosis, in which free-living bacteria and cyanobacteria were taken up by and merged with a eukaryotic host cell around 1.4 billion years ago (Smith and Keeling 2015). This process led to the structural as well as functional diversity of both organelles. Mitochondria formed when an α-proteobacterial endosymbiont integrated into the eukaryotic host of the time, and they have continued to evolve since. Because all eukaryotic cells include mitochondria, we may propose that α-proteobacteria are the ancestors of the mitochondria of all extant eukaryotic organisms. Chloroplasts formed when photosynthetic cyanobacteria merged with the eukaryotic progenitor of the Archaeplastida through primary endosymbiosis.

12.2 Organelle Genomes

The first mitochondrial genomes were sequenced completely in 1981 (Bibb et al. 1981; Anderson et al. 1981), whereas the chloroplast genomes of Marchantia polymorpha and tobacco were completely sequenced five years later, in 1986 (Shinozaki et al. 1986; Ohyama et al. 1986). Because of their evolutionary histories, mitochondrial and chloroplast genomes share high sequence similarities. Over billions of years, the organelle genomes (Mt genome, Pt genome) gradually shrank relative to those of free-living α-proteobacteria and cyanobacteria as several genes, including those involved in DNA repair and replication, were transferred from the organelle to the nucleus (Timmis et al. 2004). The organelle genome nevertheless remains important for functional and structural stability, as well as for a variety of metabolic processes. The regulatory elements of the mitochondrial genome are not confined to noncoding regions but are also found within genes (Lee and Han 2017).

12.3 Plant Mitochondrial Genome

Mitochondria are important cellular organelles in plants. Through their semi-autonomous genetic system, they contribute to growth, fitness, reproduction, energy generation, metabolism, and cell homeostasis. The plant mitochondrial genome encodes several critical polypeptides involved in forming the complexes of the oxidative phosphorylation chain. Plant mitochondrial DNA (mtDNA) differs from that of animals and fungi in several ways. Higher plants have larger mitochondrial genomes than lower ones. In addition, in many plants the mitochondria contain nucleoids that replicate independently of the nuclear chromosomes, although many components of the replication machinery are nuclear-encoded proteins. Plant mtDNA sequences evolve slowly compared with animal mtDNA, and point mutations are infrequent.

Mitochondrial DNA (mtDNA) originated from the ancestral genome of a symbiotic α-proteobacterium through endosymbiosis. In higher plants, mitochondrial genomes have an array of distinct characteristics. Plant mitochondrial genomes are significantly bigger than those of animals and vary greatly in size, even among closely related species (Allen et al. 2007; Kubo and Newton 2008). While the mitochondria of most mammals carry circular DNA of around 15–17 kb, plant mitochondrial genomes are much larger. In angiosperms, they typically range from 200 to 750 kb (Anderson et al. 1981), although some lineages have larger genomes. In cucumber (Cucumis sativus), for example, the mitochondrial genome is organized into three independent chromosomes of 1556, 84, and 45 kb, a size attributed to the expansion of dispersed repeats and existing introns and to the accretion of nuclear, plastid, viral, and bacterial sequences (Alverson et al. 2011). Surprisingly, although plant mitochondrial genomes are enormous, their copy number appears to be low: plants have significantly lower mtDNA levels than human cells, which can contain thousands of copies of mtDNA (Preuten et al. 2010). The Arabidopsis thaliana mtDNA is 367 kb in size and encodes 32 proteins, 22 transfer RNAs, and three ribosomal RNAs (5S, 18S, and 26S). In contrast, the 16.5 kb human mtDNA codes for 13 proteins, two rRNAs (12S and 16S), and 22 tRNAs (Anderson et al. 1981; Unseld et al. 1997; Stupar et al. 2001). Simpler organisms can have more mitochondrial genes; e.g., the 69 kb mtDNA of Reclinomonas americana encodes around 100 genes (Lang et al. 1997). The structure of the Arabidopsis thaliana mitochondrial genome is shown in Fig. 12.1a.

Mitochondrial genome sequences of plants have become more widely available, but the origin and role of their noncoding DNA remain unknown, which makes it difficult to compare species. The analysis of whole mitochondrial genomes of two Arabidopsis thaliana ecotypes, C24 and Columbia-0 (Col-0) (Davila et al. 2011), allows noncoding sequences to be compared and the molecular evolution of the organelle genome in plants to be studied. Plant mtDNA contains numerous introns and repetitive sequences (accounting for 90 percent of the total sequence). It is susceptible to gene gain, gene loss, gene transfer, and duplication, as well as genomic rearrangements (Li et al. 2020). Although the structure and gene composition of mitochondrial genomes vary from species to species in plants, the coding sequences of most seed plant species evolve slowly, with synonymous substitution rates about 100 times lower than those of mammalian mitochondria (Kitazaki and Kubo 2010).
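
The size and gene-content figures quoted above for Arabidopsis thaliana and human mtDNA can be turned into a rough gene-density comparison. The following Python sketch uses only the numbers given in the text; the dictionary layout and the function names are illustrative choices and do not come from any published tool.

```python
# Genome sizes (kb) and gene counts exactly as quoted in the text.
GENOMES = {
    "Arabidopsis thaliana": {"size_kb": 367.0, "protein": 32, "tRNA": 22, "rRNA": 3},
    "Homo sapiens": {"size_kb": 16.5, "protein": 13, "tRNA": 22, "rRNA": 2},
}


def total_genes(entry):
    """Sum protein-coding, tRNA, and rRNA genes for one genome."""
    return entry["protein"] + entry["tRNA"] + entry["rRNA"]


def genes_per_kb(entry):
    """Gene density: number of genes divided by genome size in kb."""
    return total_genes(entry) / entry["size_kb"]


for name, entry in GENOMES.items():
    print(f"{name}: {total_genes(entry)} genes, "
          f"{genes_per_kb(entry):.2f} genes per kb")
```

With these inputs, the Arabidopsis genome carries 57 genes (about 0.16 per kb) and the human genome 37 genes (about 2.24 per kb), which illustrates how much of the plant mitochondrial genome is noncoding.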

Fig. 12.1

The basic structure and types of plant mitochondrial genome. An Arabidopsis mitochondrion with its circular DNA and mitoribosomes inside the matrix, and the circular 367 kb Arabidopsis thaliana mitochondrial DNA with its inverted and direct repeats (a). Different conformations of mitochondrial DNA found in angiosperms, e.g., circular, linear, and σ-like (sigma) forms (b)

Although the nucleoid of plant mitochondria is less well characterized, it most likely contains the PolI-like DNA polymerases 1A and 1B, the replicative DNA primase-helicase TWINKLE, the type II topoisomerase gyrase, the RecA-like recombinases (RECA2, RECA3), the SSB-like ssDNA-binding proteins (SSB1, SSB2), the phage-type RNA polymerases RPOTm and RPOTmp, the MutS-like homolog MSH1, and some additional proteins (Palmer and Herbon 1988; Xu et al. 2011). Several of these proteins also facilitate DNA repair and homologous recombination.

12.4 Structure of the Chloroplast Genome

The chloroplast genome was first discovered by a plant biologist in 1950. Because advanced techniques were lacking at that time, comparative restriction site mapping and cloning were used to study genome organization, structure, and gene order. Chloroplast genome size is less variable than mitochondrial genome size and is highly conserved, ranging from 35 to 217 kb in most plant species (Chumley et al. 2006). In tobacco, the chloroplast genome contains 113 genes in total, of which 79 are protein-coding, 30 code for transfer RNAs, and 4 encode rRNAs. It is a circular molecule with a quadripartite structure comprising a small single-copy (SSC) region, a large single-copy (LSC) region, and a pair of inverted repeats (IRs) (Yang et al. 2010). The structure of the Nicotiana tabacum chloroplast genome with its different regions is shown in Fig. 12.2a. The inverted repeat (IR) contains 5 protein-coding, 4 rRNA, and 7 tRNA genes, which arose through a duplication event. The LSC region contains 62 protein-coding and 22 tRNA genes, whereas the SSC region contains 12 protein-coding genes and a single tRNA gene. Only 12 genes in the chloroplast genome possess introns; of these, ycf3 and clpP contain two introns, and the others contain a single intron. The rps12 gene is trans-spliced, with its 5′ end located in the LSC region. Duplication events have also been reported in the IR regions of the chloroplast genome (Liu et al. 2016). Some protein-coding genes of the chloroplast genome use alternative initiator codons: psbC and rps19 use the GUG initiator codon, while rpl2 and ndhD use the ACG initiator codon. The start codon diversity is represented schematically in Fig. 12.2b. The remaining 75 protein-coding genes use the universal AUG initiator codon. Translational efficiency has been reported to be higher with the non-canonical GUG initiator codon than with the canonical AUG initiator codon.

In Oryza minuta, the initiator codons of the rpl2 and rps19 genes are ACG and GUG, respectively. In plants belonging to the Brassicaceae family, AUG is used as the initiator codon of the rps19 and psbC genes (Kuroda et al. 2007; Hu et al. 2015). The chloroplast genome provides a large set of molecular markers for studying large-scale evolutionary questions in different taxonomic groups. One advanced technique is DNA barcoding, in which a small segment of DNA from one or more genes is used for species identification. Chloroplast genes such as matK, rbcL, and ycf1 serve as barcodes for studying evolutionary conservation in the plant kingdom (Asaf et al. 2017; Wu et al. 2010).
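
The gene counts quoted in this section for the tobacco chloroplast genome can be cross-checked with simple arithmetic. The Python sketch below tallies the per-region counts given in the text and compares them with the stated totals; the data structure and names are illustrative, and the rRNA counts of zero for the LSC and SSC regions are an assumption inferred from the statement that all 4 rRNA genes lie in the IR.

```python
# Per-region gene counts for the tobacco chloroplast genome, as quoted
# in the text. IR genes are counted once (unique genes), not per copy.
REGIONS = {
    "LSC": {"protein": 62, "tRNA": 22, "rRNA": 0},  # rRNA = 0 is inferred
    "SSC": {"protein": 12, "tRNA": 1, "rRNA": 0},   # rRNA = 0 is inferred
    "IR": {"protein": 5, "tRNA": 7, "rRNA": 4},
}

# Genes reported to use a non-AUG initiator codon in tobacco.
NONCANONICAL_START = {"psbC": "GUG", "rps19": "GUG", "rpl2": "ACG", "ndhD": "ACG"}


def totals(regions):
    """Sum gene counts of each class over all regions."""
    out = {"protein": 0, "tRNA": 0, "rRNA": 0}
    for counts in regions.values():
        for gene_class, n in counts.items():
            out[gene_class] += n
    return out


summary = totals(REGIONS)
unique_genes = sum(summary.values())
aug_initiated = summary["protein"] - len(NONCANONICAL_START)

print(summary)        # per-class totals
print(unique_genes)   # all unique genes
print(aug_initiated)  # protein-coding genes left with an AUG start
```

The regional counts add up to 79 protein-coding, 30 tRNA, and 4 rRNA genes (113 in total), and subtracting the four genes with GUG or ACG start codons leaves 75 protein-coding genes.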

Fig. 12.2

Structure and start codon diversity in the tobacco chloroplast genome. A circular 156 kb tobacco chloroplast genome with its different regions, i.e., LSC (large single copy), IR-A (inverted repeat A), IR-B (inverted repeat B), and SSC (small single copy) regions (a). Variation in the start codon in various genes of the chloroplast genome, e.g., GUG as start codon for the psbC (photosystem II 44 kDa protein) and rps19 (ribosomal protein small subunit 19) genes; ACG as start codon for the rpl2 (ribosomal protein large subunit 2) and ndhD (NADH dehydrogenase subunit 4) genes (b)

12.5 Mitochondrial Genome Diversity in Angiosperm Plants

Rearrangements among large repetitive sequences occur frequently in the mitochondria of most angiosperms, and the genome exists in different conformations in different species (Mower et al. 2012). Consequently, mitochondrial genomes in angiosperms have multipartite genome maps that can be depicted as a single “master circle” or as a collection of subgenomes (Palmer and Shields 1984; Sugiyama et al. 2005; Arrieta-Montiel and Mackenzie 2011). In flowering plants, mitochondrial genomes vary greatly in structure and size (Mower et al. 2012). While the genomes are typically depicted as single circles, plant mitochondrial chromosomes have also been found in a variety of shapes and sizes, including linear or circular forms, highly branched or σ-like structures, and multi-chromosomal structures that can co-exist in a single plant (Kitazaki and Kubo 2010). The different conformations of the mitochondrial genome are depicted in Fig. 12.1b.

Some cytoplasmic male sterility (CMS) lines in maize and rice, for example, have linear mitochondrial genomes (Allen et al. 2007; Notsu et al. 2002). The mitochondrial genome of Eruca sativa is multipartite, with six larger circular and four smaller subgenomic circular DNAs, indicating that repeat-induced genomic rearrangement is possible (Wang et al. 2014). Brassica oleracea has a similar arrangement: a tripartite structure in which a 220 kb circular genome is divided into two smaller circles of 170 and 50 kb by homologous recombination between repetitive sequences (Grewe et al. 2014).
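
The Brassica oleracea example can be illustrated with a small calculation. When two copies of a direct repeat on a master circle recombine, the circle resolves into two subcircles whose sizes add up to the size of the master circle. The Python sketch below models only this length bookkeeping; the repeat positions (30 and 80 kb) are hypothetical values chosen so that the result matches the 170 and 50 kb subcircles quoted in the text, and they are not taken from the cited study.

```python
def split_master_circle(master_kb, repeat_a_kb, repeat_b_kb):
    """Sizes of the two subcircles produced by recombination between
    two direct-repeat copies located at the given positions (in kb)
    on a circular genome of length master_kb.

    Returns (larger, smaller). Each subcircle keeps one repeat copy.
    """
    if not 0 <= repeat_a_kb < repeat_b_kb < master_kb:
        raise ValueError("expected 0 <= repeat_a_kb < repeat_b_kb < master_kb")
    first = repeat_b_kb - repeat_a_kb   # segment between the two copies
    second = master_kb - first          # remainder of the circle
    return max(first, second), min(first, second)


# Hypothetical repeat positions on the 220 kb master circle.
large, small = split_master_circle(220, 30, 80)
print(large, small)
```

Because the process is reciprocal, the same repeat pair can also fuse the two subcircles back into the master circle, which is why such genomes are described as sets of interconvertible molecules.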

12.6 Mitochondrial Genome Stability

The mitochondrial genome is at high risk of DNA damage because of the large amounts of reactive oxygen species (ROS) produced during electron transport in respiration (Møller 2001). Plants use different mitigating strategies to keep the genome stable and functioning properly. Repeated sequences play a significant part in the stability of mtDNA in higher plants (Kmiec et al. 2006). They may also function in homologous recombination and, as a result, have a significant influence on the structure of mtDNA. The mtDNA repeats fall into three types: large repeats (>1 kb), intermediate-size repeats (50–500 bp), and micro-homologies (<50 bp). Large repeats are frequently engaged in reversible reciprocal recombination, which governs the fluidity of plant mtDNA; the mtDNA is therefore generally made up of a mixture of interconvertible subgenomes (Negruk et al. 1986; Oda et al. 1992). Homologous recombination (HR) appears to be the main DNA repair mechanism in plant mitochondria, with end-joining being uncommon. HR is also required for the replication and segregation of mtDNA and is responsible for the genome’s rapid structural evolution (Cappadocia et al. 2010). Because mtDNA is rich in repetitive sequences, HR must be controlled precisely to avoid intragenomic rearrangements that might be harmful to the mitochondria (Wallet et al. 2015).
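
The three repeat classes described above can be expressed as a simple length-based rule. The Python sketch below is a minimal illustration under two stated assumptions: micro-homologies are taken to be shorter than 50 bp, and lengths that fall between the quoted classes (more than 500 bp but not more than 1 kb) are reported as unclassified because the text does not assign them to a class. The function name is hypothetical.

```python
def classify_mtdna_repeat(length_bp):
    """Assign an mtDNA repeat to a size class by its length in bp."""
    if length_bp <= 0:
        raise ValueError("repeat length must be positive")
    if length_bp > 1000:
        return "large"            # > 1 kb
    if 50 <= length_bp <= 500:
        return "intermediate"     # 50-500 bp
    if length_bp < 50:
        return "microhomology"    # assumed threshold: < 50 bp
    return "unclassified"         # 501-1000 bp: not covered by the text


for length in (30, 84, 800, 4200):
    print(length, classify_mtdna_repeat(length))
```

Such a rule is only a bookkeeping aid; whether a given repeat actually recombines depends on the recombination surveillance machinery discussed in this section.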

Nuclear genes that play a crucial role in mtDNA stability have been discovered through studies of mutants with variegated leaves or by altering genes previously thought to be involved in organelle DNA metabolism (Maréchal and Brisson 2010). RECA1, a homolog of the bacterial RecA protein, was found in the mitochondria of the bryophyte Physcomitrella patens (Terasawa et al. 2007; Inouye et al. 2008). A RECA1 knock-out strain shows abnormalities in development and mitochondrial morphology and a decreased rate of mtDNA repair (Terasawa et al. 2007; Odahara et al. 2009). Furthermore, the RECA1 knock-out mutant exhibits large rearrangements caused by abnormal recombination between short repeats of 62 to 84 bp dispersed across the mtDNA, indicating that RECA1 keeps the mitochondrial genome stable by suppressing gross rearrangements (Odahara et al. 2009). Mutations in plant-specific single-stranded DNA-binding proteins, such as WHY2 from the Whirly protein family (Cappadocia et al. 2010) and organellar single-stranded DNA-binding protein 1 (OSB1), also cause abnormal recombination between repeats (Dong et al. 2013). In the OSB1 mutant, recombination occurs between short repeats (30 bp); in the WHY2 mutant, it depends on treatment with a gyrase inhibitor (Cappadocia et al. 2010).

12.7 Chloroplast Genetic Engineering

New technologies are now available for sophisticated chloroplast genome engineering. By enabling the engineering of chloroplast genomes in major crops and increasing the expression of foreign genes with modular vectors, RNA interference (RNAi), and crop-specific vectors, multigene engineering can be used to create high-value bioproducts. Introducing a single gene into the chloroplast genome can confer biotic or abiotic stress tolerance or increase biomass (Jin and Daniell 2015). Retrograde signaling allows genes expressed in chloroplasts to regulate the nuclear genome. Photosynthesis, which takes place in the chloroplast, supplies vital resources for the whole world in the form of ATP, oxygen, fuel, food, etc. The chloroplast genome can be modified with genetic engineering and biotechnology tools to obtain enhanced agronomic traits, medicines, and industrial enzymes. Chloroplast genome engineering is used for the stable integration and expression of foreign genes from heterologous systems such as fungi, viruses, and animals to produce biopharmaceutical proteins, industrial enzymes, and antibiotics and to improve agronomic traits (Chen et al. 2014). Plastid transformation has various advantages, such as minimal unwanted effects of the transgene owing to compartmentalization and transgene containment through maternal (uniparental) inheritance. Chloroplast transformation requires double homologous recombination (Verma and Daniell 2007; Verma et al. 2008). Chloroplast genome engineering and its applications for plant trait enhancement are represented diagrammatically in Fig. 12.3.

Fig. 12.3

Application of chloroplast genome engineering for various purposes, i.e., for increased CO2 fixation efficiency via CCM mechanism, production of essential volatile compounds necessary for unique traits, and modulation of insect resistance in target host plants

Hundreds of genes from other organisms have been expressed in chloroplasts, in many cases at substantially higher levels than in nuclear expression systems. However, the requisite expression level has still not been obtained in a few cases. N-terminal degradation of proteins is a well-known phenomenon in heterologous systems. Insulin, the best-known therapeutic blood protein, has not been expressed without an N-terminal fusion protein in any expression system (Lee et al. 2011). As a result, numerous human therapeutic proteins have been expressed as fusion proteins. Green fluorescent protein (GFP), for example, has been expressed in chloroplasts as a fusion with cholera toxin B (CTB) to aid stability and oral delivery (Lee et al. 2011; Kohli et al. 2014; Shenoy et al. 2014).

12.7.1 Agronomic Trait Enhancement Via Chloroplast Modulation

A few studies have also established the production of synthetic antimicrobial peptides through chloroplast genome engineering for protection against fungal, bacterial, and viral pathogens and against abiotic stress (Lee et al. 2011; Kwon et al. 2013). The expression of β-glucosidase in the chloroplast releases active hormones (indolyl-3-acetic acid, zeatin, gibberellic acid) from their ester conjugates. It has been used to increase biomass, height, internode length, and leaf area in the targeted plants. It also provides protection against whitefly and aphids through sugar esters, which are produced in high amounts in the dense globular trichomes on the leaf surface (Jin et al. 2011).

12.7.2 Engineering of the Metabolic Pathway in Chloroplast for the Beneficial Product

Heterologous expression in the host chloroplast of enzymes involved in the isoprenoid pathway (mevalonate pathway), without any regulatory sequences (promoter, UTR), increases the amounts of mevalonate, carotenoids, sterols, squalene, and triacylglycerols in the transformed plants (Kumar et al. 2012). Carotenoids are photoprotective compounds synthesized by the terpenoid pathway. Astaxanthin, a carotenoid, acts as an antioxidant and is responsible for the pigmentation of salmon and a few other organisms. Transfer of the genes for isopentenyl diphosphate isomerase, β-carotene hydroxylase, and β-carotene ketolase from marine bacteria to the chloroplast genome of lettuce produced key ketocarotenoids, including astaxanthin fatty acid esters (Harada et al. 2014). Lipid-soluble tocopherols act as antioxidants in plants and are known for scavenging reactive oxygen species (ROS). α-Tocopherol (α-TOC), one form of tocopherol, was synthesized by expressing tocopherol cyclase (TC) and γ-tocopherol methyltransferase (γ-TMT) in the chloroplast. Expression of rate-limiting enzymes such as homogentisate phytyltransferase (HPT), TC, and γ-TMT elevates the total tocopherol content up to ten-fold. Enhanced synthesis of α-TOC in transplastomic plants improves nutritional value and biotic and abiotic stress tolerance by reducing the ROS level, lipid peroxidation, and ion leakage (Lu et al. 2013; Jin and Daniell 2014).

12.7.3 Enhancement in Photosynthesis Efficiency Via Plastid Engineering

Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) is a major enzyme of the Calvin cycle and has attracted interest as a target for improving catalytic activity and carbon fixation efficiency and for decreasing oxygenase activity. Functional RuBisCO forms when the nuclear-encoded small subunits are imported into the chloroplast and assemble with the chloroplast-encoded large subunits. Early efforts entailed relocating the small subunit gene to the chloroplast genome (Dhingra et al. 2004). More recent investigations have focused on the heterologous expression of RuBisCO subunits in chloroplasts. The CO2-concentrating mechanism (CCM) from cyanobacteria has recently been introduced into transplastomic plants, a breakthrough for crop improvement (Lin et al. 2014). Within chloroplasts, hybrid assembly of RuBisCO from Synechococcus elongatus PCC 7942 (Se7942) with CcmM35, a β-carboxysomal protein, led to enhanced carbon fixation efficiency but decreased growth. This is a significant step toward using chloroplast genetic engineering to increase photosynthesis. Other work increased recombinant RuBisCO biogenesis through simultaneous expression of RuBisCO and its accessory chaperone RAF1 (Whitney et al. 2015).

12.7.4 Chloroplast Genome Engineering for Insect Resistance

Although significant progress has been made in expressing biopesticide genes from the gram-positive bacterium Bacillus thuringiensis in the chloroplast to produce Bt toxin crystals, plastid expression of these genes has yet to reach commercial development in many crops because the market for Bt crop products is saturated. However, recent reports of insect resistance to Bt have prompted US Environmental Protection Agency (EPA) regulations on planting Bt maize (De Cosa et al. 2001; Dufourmantel et al. 2005). As a result, attention has recently shifted to identifying novel features or methodologies to aid commercial development.

The chloroplast genome has recently been engineered using RNA interference (RNAi) technology. In one study, chitin synthase (CHS), V-ATPase, and cytochrome P450 monooxygenase (CYP450) of lepidopterans were chosen as RNAi targets. The cleaved and processed dsRNA was more abundant than the native psbA transcript, which is itself robustly expressed (Redick et al. 2015). In insects feeding on leaves that expressed dsRNA against the CYP450, CHS, and V-ATPase genes, the expression of the targeted genes was lowered significantly in the midgut, most likely owing to additional processing of the ingested siRNA in the insect gut. The larvae’s net weight and their development and pupation rates decreased drastically. In separate research, Bock and colleagues stably expressed dsRNA against the insect β-actin gene from the chloroplast genome and obtained resistance to the potato beetle; this groundbreaking work was confirmed as effective in field trials.

12.8 Evolution of Organelle Genome

Both organelle genomes (Mt genome and Pt genome) originated from prokaryotes (α-proteobacteria and cyanobacteria, respectively) through endosymbiosis. These were independent events in eukaryotic evolution that occurred over a billion years ago. RNA editing refers to mechanisms that alter specific nucleotides in RNA molecules and to mechanisms that insert or delete nucleotides, so that the information in the mature mRNA differs from that of the gene encoding it (Aphasizhev and Aphasizheva 2011). The pentatricopeptide repeat (PPR) proteins are nuclear-encoded factors required for editing at different sites in mitochondria and plastids. RNA editing occurs in various forms in viruses, early eukaryotes, mammals, fungi, and plants. RNA editing mechanisms are employed as checkpoints; they can maintain the function of the encoded protein and produce new proteins (Koito and Ikeda 2012). The mechanistic and functional features and the origins of numerous editing processes in diverse species have recently been reviewed comprehensively (Nishikura 2010).

12.8.1 Mitochondria Genome for Phylogeny Analysis

Phylogenetic and phylogenomic reconstructions suggest that mitochondria originated from a single ancestor, i.e., that they are monophyletic. Further genetic evidence supports the theory that all mitochondrial genomes originated from a single ancestor. Studies on mitochondrial DNA (mtDNA) have firmly established the eubacterial origin of this genome. mtDNA sequences have made it possible to trace the evolution of mitochondria from a single ancestor related to the Proteobacteria (Timmis et al. 2004). The closest eubacterial relatives of mitochondria belong to the rickettsial subgroup of the α-Proteobacteria, a group of obligate intracellular parasites that includes the genera Anaplasma, Rickettsia, and Ehrlichia (Gillham 1994).

Next-generation sequencing (NGS) has become a valuable tool in several areas of biology. One of its uses in evolutionary and phylogenetic investigations is the quick and inexpensive assembly of organelle genomes. Metazoan mitochondrial genomes, which possess 13 protein-coding genes, 22 tRNAs, and 2 rRNAs, have proved to be particularly helpful markers in evolutionary investigations (Yang et al. 1985). Comparing the rates of base substitution among species, in mitochondrial as well as other gene sequences, can also be helpful in evolutionary studies. Moreover, mitochondrial gene order can support phylogenetic inference (Gissi et al. 2008).
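
Comparing base substitution among species usually starts from a pairwise distance between aligned sequences. The Python sketch below computes the uncorrected p-distance (the proportion of compared sites that differ), which is one simple choice among several distance measures and is not a method attributed to the studies cited here. The function name is illustrative, the sequences in the usage example are made up, and sites containing gaps or ambiguous bases are skipped by assumption.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned DNA sequences.

    Only positions where both sequences carry A, C, G, or T are
    compared; gaps and ambiguity codes are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Made-up aligned fragments: one difference in eight comparable sites.
print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125
```

The p-distance underestimates divergence when multiple substitutions have hit the same site, so corrected models are generally preferred for deep comparisons.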

RNA editing was discovered in the mitochondria of flowering plants (angiosperms) in 1989 through sequence differences between DNA and RNA (Covello and Gray 1989). At the affected positions, a C nucleotide in the DNA corresponds to a U nucleotide in the RNA, and this difference was found to be caused by C-to-U substitution in the RNA. After editing, the codons specify amino acids that more closely match those found at the same positions in orthologous proteins from other species (Gualberto et al. 1989; Covello and Gray 1989).
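
The effect of C-to-U editing on the encoded protein can be illustrated with a toy example. The Python sketch below applies C-to-U changes at given positions of an RNA string and translates the result with a small excerpt of the standard genetic code. The transcript and the editing sites are invented for illustration, the function names are hypothetical, and the codon table is deliberately limited to the codons used here.

```python
# Excerpt of the standard genetic code (only the codons used below).
CODON_TABLE = {
    "UCA": "Ser", "UUA": "Leu",
    "CGG": "Arg", "UGG": "Trp",
    "CCA": "Pro", "CUA": "Leu",
}


def edit_c_to_u(rna, sites):
    """Return the RNA with C changed to U at the given 0-based sites."""
    bases = list(rna)
    for i in sites:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)


def translate(rna):
    """Translate complete codons using the excerpted table."""
    usable = len(rna) - len(rna) % 3
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3)]


unedited = "UCACGG"                      # invented transcript
edited = edit_c_to_u(unedited, [1, 3])   # invented editing sites

print(unedited, translate(unedited))  # UCACGG ['Ser', 'Arg']
print(edited, translate(edited))      # UUAUGG ['Leu', 'Trp']
```

In this toy case, two edits change both encoded amino acids, which shows why the genomic sequence alone can be a misleading predictor of the protein sequence in plant organelles.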

Mitochondrial DNA data can be extremely useful for understanding species-level phylogenies. The arrangement of genes in the mitochondrion varies, and extensive stretches of noncoding DNA separate them. Mitochondrial genome rearrangements occur often, resulting in several altered forms in a single cell. Owing to advances in mtDNA isolation methods, the use of restriction endonucleases to recognize specific nucleotide differences, PCR methodologies, and the availability of universal primers for DNA amplification, mitochondrial genes are increasingly in demand in phylogenetic and population genetic studies (Borsch and Quandt 2009).

Cytochrome oxidase I/II (subunits of the electron transport chain enzyme cytochrome c oxidase) is present in both bacteria and mitochondria. The corresponding gene, commonly employed to estimate molecular phylogenies, evolves slowly compared with other protein-coding mitochondrial genes (Lavrov and Lang 2005). Sequence analysis of the mitochondrial 12S rRNA is widely used in phylogenetic studies and molecular taxonomy, and its sequences have been proposed as a tool for dating intermediate to deep divergences (Lavrov and Meyer 1996). The cytochrome b gene is the most helpful marker for resolving phylogenetic relationships between closely related species, although it lacks resolution at deeper nodes. Although it has been reported to retrieve phylogenetically significant information at several taxonomic levels, its usefulness is lineage-dependent and diminishes as evolutionary depth increases.

12.8.2 Evolution of Chloroplast Genome and Its Use in Phylogeny Analysis

The gene content, structure, and organization of the chloroplast genome are largely conserved in comparison to the mitochondrial and nuclear genome. The rate of substitution in nucleotide sequences is higher in the chloroplast genome than mitochondrial genes but lower than the nuclear genome (Burger et al. 2003). However, a number of studies have shown evolutionary processes like gene duplications, mutations, deletions, and rearrangements. This organellar genome has long been regarded as a suitable model for comparative and evolutionary genomic investigations owing to its small size and preserved gene content. Comparative studies of chloroplast genomes have been conducted on several focused species, genera, or plant groups in recent years (Drouin et al. 2008; Dong et al., 2013).

RNA editing, including C-to-U alterations, has also been observed in the chloroplast, apart from mitochondria (Hoch et al. 1991). Plant lineages from the simpler bryophytes to the advanced angiosperms show this type of editing in plastids (Sugita et al. 2006). In several species of the order Marchantiales, however, the mRNAs remain as encoded by the chloroplast and mitochondrial genomes, i.e., they are not edited (Rüdinger et al. 2008). RNA editing has not yet been discovered in plant cytoplasmic RNAs, so the process appears to be limited to these two organelles.

Comparative analyses of chloroplast genomes at higher taxonomic levels are valuable for phylogenetic research and for understanding genome evolution in terms of genome size changes, gene deletions, and nucleotide changes. However, choosing a gene with the right length and substitution rate is critical. The cpDNA genes currently used for this purpose include atpB, matK, ndhF, rbcL, rpl16, and many more.

12.8.2.1 rbcL Gene

RuBisCO, the first enzyme in the C3 cycle, is the world’s most abundant protein and a key component of the global carbon cycle (Raven 2013). The rbcL gene is a single-copy gene of the chloroplast genome with great phylogenetic utility. It is 1428 bp long and is found in all plants except some parasites (Dong et al. 2018). It is straightforward to analyze and align. Its secondary structure is well studied, and the gene, although single-copy within the chloroplast genome, is present in many copies per cell and has few insertions and deletions. The rbcL gene codes for the large subunit of RuBisCO, whereas the nuclear rbcS gene encodes the small subunit. The rbcL gene was the first, and remains one of the most commonly sequenced, regions of plant genomes. It has frequently been employed in systematic research on terrestrial plants, particularly angiosperms. Phylogenetic relationships among angiosperms and extant seed plants were investigated using around 500 rbcL sequences. Even though rbcL is conserved and easily alignable among taxa, it has a greater substitution rate than 18S rDNA (Chase et al. 1993).

12.8.2.2 matK Gene

MatK (maturase K) is involved in splicing group II introns from RNA transcripts. The matK gene is located within the intron of the chloroplast gene trnK, which encodes the lysine tRNA. Its usefulness for resolving intergeneric and interspecific relationships among angiosperms has recently been demonstrated. The gene has a high rate of substitution compared with other genes used in grass systematics and a high proportion of transversion mutations. The three sections of its coding region help construct phylogenies in the Poaceae at the subfamily level (Patwardhan et al. 2014).
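
The proportion of transversions mentioned above can be measured by classifying each difference between two aligned sequences. A transition exchanges one purine for the other (A and G) or one pyrimidine for the other (C and T); any purine-pyrimidine exchange is a transversion. The Python sketch below counts both kinds; the function name is illustrative, the example sequences are made up rather than real matK data, and sites with gaps or ambiguous bases are skipped by assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_transitions_transversions(seq_a, seq_b):
    """Return (transitions, transversions) between aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguity code
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


# Made-up fragments: A>G and C>T are transitions; A>T and C>A are transversions.
ts, tv = count_transitions_transversions("AACC", "GTTA")
print(ts, tv)  # 2 2
```

A transversion proportion can then be obtained as tv / (ts + tv) whenever at least one difference is present.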

12.8.2.3 ndhF Gene

The ndhF gene, which encodes NADH dehydrogenase subunit F, is approximately 1100 bp long and is found in a single-copy region. Sequence variation in this gene was used to reconstruct a phylogenetic tree of 282 species representing 78 monocot groups. The relationships recovered within orders were consistent with those based on rbcL alone or in combination with atpB and 18S rDNA. However, ndhF provides more informative characters than rbcL and other genes (Givnish et al. 2006).

12.9 Conclusion

Besides serving as a tool for phylogenetic and evolutionary studies, the organellar genome can be used as a target for improving essential agronomic traits. The plant chloroplast genome contributes mainly to the synthesis of carbohydrates, in addition to a few other molecules. Our review gives a comprehensive and updated account of the basic structural organization of the chloroplast and mitochondrial genomes as studied in several plant species. Regulatory mechanisms such as RNA editing and DNA damage repair inside the two organelles have also been discussed, as studied in a few plant species. Many studies have focused on improving photosynthetic efficiency and growth performance through genome engineering with transgene technology. However, the various regulatory mechanisms of gene expression need to be studied in more detail in major crop species.