Exploring the Genomes of Symbiotic Diazotrophs with Relevance to Biological Nitrogen Fixation

Thakur, Subarna; Bothra, Asim K.; Sen, Arnab

doi:10.1007/978-81-322-1880-7_14


Abstract

Nitrogen fixation is an important process that converts atmospheric nitrogen to its biologically available form across various ecosystems. Biological nitrogen fixation (BNF) is accomplished by a group of microorganisms known as diazotrophs through the catalytic action of the nitrogenase enzyme system. N-fixing microbes exhibit a wide diversity in terms of their habitat. Some are free-living, whereas others form symbiotic associations with higher plants. Symbiotic nitrogen fixation is particularly relevant in the context of sustainable agriculture. Recent advances in next-generation sequencing technology have ensured the availability of thousands of complete genomes, including those of symbiotic diazotrophs. Genomic approaches together with bioinformatics tools are now being used to define and understand the complex molecular relationships that underpin symbiotic nitrogen fixation. The accessibility of new computational tools for genomic and proteomic analysis has accelerated nitrogen fixation research, predominantly in the areas of comparative genomics, protein chemistry, and phylogenetic analysis of nitrogen fixation genes. Alternative phylogenetic approaches and protein structure-based studies may prove productive in revealing unfamiliar aspects of symbiotic nitrogen fixation.



1 Introduction

Nitrogen is an essential nutrient for all living organisms. It is an important component of many organic molecules such as DNA, RNA, and proteins, the building blocks of life. Molecular nitrogen or dinitrogen (N₂) makes up nearly four-fifths of the atmosphere but is metabolically unavailable directly to higher plants or animals. It is available to some species of microorganism through biological nitrogen fixation (BNF) in which atmospheric nitrogen is converted to ammonia by the enzyme nitrogenase (Postgate 1998). The ammonia is then transferred to the higher plant to meet its nutritional needs for the synthesis of proteins, enzymes, nucleic acids, chlorophyll, etc., and subsequently it enters the food chain. Thus, all eukaryotes (including higher plants and animals) naturally depend on the BNF activity of the N-fixing microbes for their N supply. Microorganisms that fix nitrogen are called diazotrophs. According to current knowledge, only prokaryotes (members of the domains Archaea and Bacteria) are capable of performing BNF (Klipp 2004).

The ability to fix nitrogen is widely, though paraphyletically, distributed across both the bacterial and archaeal domains (Raymond et al. 2004). There are two types of diazotrophic prokaryotes: those that are free-living (e.g., Azotobacter, Clostridium, Klebsiella) and those that form symbiotic relationships (e.g., Rhizobium, Bradyrhizobium, Frankia). Free-living diazotrophs require a chemical energy source if they are nonphotosynthetic, whereas photosynthetic diazotrophs, such as the cyanobacteria, utilize light energy (Leigh 2002). Some diazotrophs, called rhizobia, enter into symbiotic relationships with legumes such as clover and soybean. The symbiosis between legumes and the nitrogen-fixing rhizobia occurs within nodules, mainly on the root and in a few cases on the stem (Burns and Hardy 1975). A similar symbiosis occurs between a number of woody plant species and the diazotrophic actinomycete Frankia (Pedrosa et al. 2000). These symbiotic associations are the greatest contributors of fixed nitrogen to agricultural systems.

The exploitation of biological nitrogen fixation for agricultural benefit has long been sought. Biological nitrogen fixation provides a means to meet the needs of a growing population with a nutritious, environmentally friendly, sustainable food supply, which makes BNF research very compelling in the current scenario. The last two decades have seen many exciting developments in nitrogen fixation: genomes have been sequenced, the “omics” approaches have been applied to both symbionts, and new genetically modified crops are becoming commonplace in agriculture. Biochemical research into the workings of nitrogen fixation is generally focused on the enzyme complex called nitrogenase. Beyond its usual function, this system has emerged as a model for more general biochemical processes, such as signal transduction, protein–protein interaction, inter- and intramolecular electron transfer, and complex metal cluster involvement in enzymatic catalysis (Peters et al. 1995). It is from this perspective that we review the present trends in nitrogen fixation research in the context of agriculture, with special emphasis on genomics, proteomics, and bioinformatics.

2 Various Aspects of Biological Nitrogen Fixation

2.1 Biological Nitrogen Fixation and Sustainable Agriculture

Natural reserves of soil nitrogen are normally low, so commercially prepared N fertilizers must be added to increase plant growth and vigor. Chemical fertilizers had a substantial impact on food production in the recent past and are today an indispensable part of modern agricultural practices. But for the farmers of developing countries, N fertilizers are neither affordable nor widely available. Moreover, the harmful effects on the environment of heavy use of N fertilizer are becoming more evident day by day. Further, the fossil fuels which are used in the production of N fertilizer are becoming scarcer and more expensive. At the same time, the demand for food is going up as populations increase. Therefore, there is a great need to search for all possible avenues. The process of biological nitrogen fixation offers economically attractive and ecologically sound means of reducing external nitrogen input and improving the quality and quantity of internal resources. Biological nitrogen fixation is the reduction of atmospheric N₂ gas to biologically available ammonium, mediated by prokaryotic organisms in symbiotic relationships, associative relationships, and under free-living conditions (Postgate 1998). The fixed nitrogen that is provided by biological nitrogen fixation is less prone to leaching and volatilization, and therefore the biological process contributes an important and sustainable input into agriculture. Nitrogen input through BNF can help maintain soil N reserves as well as substitute for N fertilizer to attain large crop yields (Peoples and Craswell 1992). An understanding of the factors controlling BNF systems in the field is vital for the support and successful adoption in large scale in an agricultural context.

Wani et al. (1995) highlighted the importance of biological nitrogen fixation by legumes for sustainable agriculture in the semiarid tropics. Legumes, one of the most important plant families in agriculture, are often involved in a remarkable symbiosis with nitrogen-fixing rhizobia. Legumes are often considered to be the major nitrogen-fixing systems, as they may derive up to 90 % of their nitrogen from N₂. The quantity of atmospheric N fixed through forage legume biological N fixation can range as high as 200 kg/ha per year (Peoples et al. 1995). The symbiotic association of actinorhizal species helps in improving soil fertility in disturbed sites such as eroded areas, sand dunes, and moraines. Actinorhizal plant nitrogen fixation rates are comparable to those found in legumes (Torrey and Tjepkema 1979; Dawson 1983). The nitrogen-fixing Azolla–Cyanobacteria symbiosis has been widely used to enrich rice paddies with organic nitrogen in Asian countries such as China and Vietnam and across Southeast Asia (Watanabe and Liu 1992). The rice paddies of Asia, which feed over half of the world’s population, depend upon cyanobacterial N₂ fixation (Irisarri et al. 2001).

2.2 Physiological and Phylogenetic Diversity of Diazotrophs

Farmers have known, probably since the time of the Egyptians, that legumes such as pea, lentil, and clover are important for soil fertility. The practices of crop rotation, intercropping, and green manuring were extensively described by the Romans, but it was not until the nineteenth century that an explanation for the success of the legumes in restoring soil fecundity was uncovered. The discovery of nitrogen fixation is attributed to the German scientists Hellriegel and Wilfarth, who in 1886 reported that legumes bearing root nodules could use gaseous nitrogen. Shortly afterward, in 1888, Beijerinck, a Dutch microbiologist, succeeded in isolating a bacterial strain from root nodules. This isolate happened to be a Rhizobium leguminosarum strain (Franche et al. 2009). Beijerinck (in 1901) and Lipman (in 1903) were responsible for the isolation of Azotobacter spp., while Winogradsky (in 1901) isolated the first strain of Clostridium pasteurianum (Stewart 1969). Nitrogen fixation in blue-green algae was established much later (Stewart 1969). The identification of a nitrogen-fixing microbe from root nodules of nonleguminous plants like alder generated considerable controversy for a while. It was Brunchorst who named the microbe Frankia subtilis (Pawlowski 2009). Hiltner (1898) recognized the nodule inhabitant as an actinomycete, a Gram-positive bacterium closely related to Streptomyces. Pommer (1959) was probably the first person to obtain an isolate, but it did not reinfect its host plant. For a long time, diazotrophy in the actinomycetes was thought to be limited to the genus Frankia, but through the years several other actinomycetes have been shown to have nif genes (Gtari et al. 2012). Over the years there have been continual discoveries of new diazotrophs, revealing that this function is performed by a very diverse group of prokaryotes.
In the last decades, the use of molecular technologies for the direct detection of the genes of biological nitrogen fixation has shown that the capacity for diazotrophy is even more widespread than previously expected.

Although nitrogen fixation is not found in eukaryotes, it is widely distributed among the Bacteria and the Archaea, revealing considerable biodiversity among diazotrophic organisms. The ability to fix nitrogen is found in most bacterial phylogenetic groups, including green sulfur bacteria, Firmibacteria, actinomycetes, cyanobacteria, and all subdivisions of the Proteobacteria. In Archaea, nitrogen fixation is mainly restricted to methanogens. The ability to fix nitrogen is compatible with a wide range of physiologies including aerobic (e.g., Azotobacter), facultatively anaerobic (e.g., Klebsiella), or anaerobic (e.g., Clostridium) heterotrophs; anoxygenic (e.g., Rhodobacter) or oxygenic (e.g., Anabaena) phototrophs; and chemolithotrophs (e.g., Alcaligenes, Thiobacillus, Methanosarcina) (Young 1992). Diazotrophs show considerable diversity in terms of habitats. They are found free-living in soils and water, in associative symbioses with grasses, in actinorhizal associations with woody plants, and in cyanobacterial symbioses with various plants. The most widely known and discussed diazotrophs are those that form symbiotic associations with leguminous plants; they are collectively referred to as rhizobia. The rhizobia are Gram negative, belong to the large and important Proteobacteria division, and include genera such as Agrobacterium, Allorhizobium, Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium, Sinorhizobium, Devosia, Methylobacterium, and Ochrobactrum (Franche et al. 2009). These soil bacteria are able to invade legume roots in nitrogen-limiting environments, leading to the formation of a highly specialized organ, the root nodule. These specialized root structures offer an ecological niche for the microbe to fix nitrogen (Mylona et al. 1995). Symbiotic association is not limited to legumes but extends to a number of nonlegumes. The most significant among these is the actinorhizal plant–Frankia association.
The genus Frankia consists of filamentous actinomycetes forming symbiotic associations with a number of woody dicot plants like Casuarina, Hippophae, Alnus, Myrica, etc., belonging to different families (Benson and Silvester 1993). Frankia compartmentalizes nitrogenase within vesicle structures, which are surrounded by an envelope with a high content of bacteriohopane lipids that protects the enzyme from oxygen inactivation (Berry et al. 1993; Huss-Danell 1997). Over the years diazotrophy has been reported from other actinomycetes as well, such as Mycobacterium flavum, Corynebacterium autotrophicum, Arthrobacter sp., and Agromyces (Gtari et al. 2012). The findings of several authors (Von Bulow and Dobereiner 1975; Dobereiner 1976; Baldani and Baldani 2005) revealed associations of tropical grasses with nitrogen-fixing bacteria which, under favorable conditions, may contribute significantly to the N economy of these plants. The bacteria belong to the genus Azospirillum and are the most promising microorganisms that colonize roots of economically important grasses and cereals (Leigh 2002).

Cyanobacteria have long been known to fix nitrogen. Both heterocystous (like Anabaena, Nostoc, etc.) and nonheterocystous cyanobacteria (like Trichodesmium, Plectonema, etc.) are capable of diazotrophy (Schlegel and Zaborosch 2003). They are the only organisms that are capable of both O₂-evolving photosynthesis and nitrogen fixation (Klipp 2004). They therefore face the unique problem of balancing two essential, but incompatible, cellular processes: oxygenic photosynthesis and O₂-sensitive N₂ fixation. In some filamentous cyanobacteria, nitrogen fixation occurs in specialized, terminally differentiated cells called heterocysts that protect the nitrogenase complex from O₂ damage by increasing respiration, terminating photosystem II activity, and forming multilayered cellular membranes that reduce oxygen diffusion, thus creating a microaerobic environment (Adams 2000). However, in members like Lyngbya and Plectonema, where heterocysts are absent, nitrogen fixation occurs in internally organized cells (Schlegel and Zaborosch 2003). Another important aspect of cyanobacteria is their association with higher plants. The Anabaena–Azolla association (Bohlool et al. 1992) and the Nostoc–Gunnera association (Mylona et al. 1995) can fix a substantial amount of nitrogen. Cycads in association with cyanobacterial species can also fix nitrogen (Rai et al. 2002).

2.3 Nitrogenase Complex: Enzymatic Machinery

The biochemical machinery required for biological nitrogen fixation is provided by the nitrogenase enzyme system (Eady and Postgate 1974; Hoffman et al. 2009). Nitrogenase is a two-protein component system that catalyzes the reduction of dinitrogen to ammonia coupled to the hydrolysis of ATP (Rees and Howard 2000). The most extensively studied form of nitrogenase is the molybdenum-containing system that consists of two component metalloproteins, the molybdenum–iron (MoFe) protein and the iron (Fe) protein. The smaller component of nitrogenase is the Fe protein, which acts as a redox-active agent and transfers electrons from the available electron donor in the system to the MoFe protein for the reduction of substrates (Rees et al. 2005). It has two identical subunits. The Fe protein contains one iron–sulfur cluster [4Fe-4S], which bridges the two subunits. The Fe protein has one MgATP-binding site in each subunit and therefore binds two MgATP molecules. Binding of MgATP to the Fe protein induces conformational changes followed by hydrolysis of MgATP, which facilitates electron transfer from the Fe protein to the MoFe protein (Rees et al. 2005). Although this transfer of electrons is the main function of the Fe protein, it has some other functions. The Fe protein is needed for initial biosynthesis of the MoFe cofactor. Following the biosynthesis of the MoFe cofactor, the insertion of the preformed MoFe cofactor into the MoFe protein also requires the Fe protein (Burgess and Lowe 1996). The larger component of nitrogenase is the MoFe protein, which is an α₂β₂ tetramer made up of two αβ-dimers. Each dimer contains one MoFe cofactor and one P-cluster [8Fe-7S]. The MoFe cofactor is located in the active site of the protein where the reduction of substrates occurs. The main role of the P-cluster is electron transfer: it accepts an electron from the Fe protein and donates it to the MoFe cofactor.
Each cluster contains eight metals and associated sulfurs that are arranged distinctively. The αβ-dimeric units communicate and contact each other through their subunits (Burgess and Lowe 1996). The P-cluster bridges between each α- and β-subunit, while the MoFe cofactor is placed on α-subunits. In addition to this molybdenum-containing nitrogenase, alternative nitrogenases also exist that are homologous to this system, but with the molybdenum almost certainly substituted by vanadium or iron (Eady 1996). The vanadium-nitrogenase system has two components. It has an Fe protein which is the same as other nitrogenase systems, and the second component is a vanadium–iron (VFe)-containing protein which is different compared to two other systems. This type of nitrogenase has been detected in A. vinelandii and A. chroococcum (Robson et al. 1986). The third type of nitrogenase, iron only, contains an iron (Fe) protein and another protein, which is very similar to MoFe protein and VFe protein, while it has only Fe as its cofactor. This type of protein has also been detected in A. vinelandii nitrogenase (Eady 1996).

Studies by various authors (Thorneley and Lowe 1985; Burgess and Lowe 1996) revealed that the basic mechanism of nitrogenase involves the following: (1) complex formation between the reduced Fe protein with two bound ATP and the MoFe protein, (2) electron transfer between the two proteins coupled to the hydrolysis of ATP, (3) dissociation of the Fe protein accompanied by re-reduction (via ferredoxins or flavodoxins) and exchange of ATP for ADP, and (4) repetition of this cycle until sufficient numbers of electrons and protons have been accumulated so that available substrates can be reduced. In addition to dinitrogen reduction, nitrogenase has been found to catalyze the reduction of protons to dihydrogen, as well as nonphysiological substrates such as acetylene.
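
The cycle described above can be summarized by the overall stoichiometry usually quoted for the molybdenum nitrogenase. The equation is not given in the text; it is the standard textbook form and assumes two MgATP hydrolyzed per electron transferred and the evolution of one H₂ per N₂ reduced:

```latex
\mathrm{N_2} + 8\,\mathrm{H^+} + 8\,e^- + 16\,\mathrm{MgATP}
  \longrightarrow
  2\,\mathrm{NH_3} + \mathrm{H_2} + 16\,\mathrm{MgADP} + 16\,\mathrm{P_i}
```

Under this assumption, eight turns of the Fe protein cycle (one electron and two MgATP each) are needed per N₂: six electrons reduce N₂ to two NH₃, and two are consumed in forming H₂.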

2.4 Genetics and Genomics of Biological Nitrogen Fixation

The biochemical complexity of nitrogen fixation is reflected in the genetic organization and in the regulation of expression of the components required for the catalytic activity. Various techniques such as mutagenesis, deletion mapping, and the use of cloning vectors have facilitated the identification of genes associated with nitrogen fixation. The organization and regulation of the genes were revealed in the early 1980s. The organism that appears to have the simplest organization of nitrogen fixation-specific (nif) genes, and which is the one best studied at the molecular genetics level, is the facultative anaerobe Klebsiella pneumoniae. Arnold et al. (1988) reported the first detailed organization of nif genes from this organism. A 24 kb DNA region contains the entire K. pneumoniae nif cluster, which includes 20 genes. nifHDK are the three structural genes encoding the three subunits of Mo nitrogenase. In most nitrogen-fixing prokaryotes, these three genes form one transcriptional unit, with a promoter in front of the nifH gene. A number of studies (Dixon et al. 1980; Paul and Merrick 1989; Rubio and Ludden 2005, 2008) have established that the maturation of apo-Fe protein (NifH) requires the products of nifH, nifM, nifU, and nifS, while that of apo-MoFe protein requires at least six genes, nifE, nifN, nifV, nifH, nifQ, and nifB, which are required for the biosynthesis of FeMoco. There is considerable homology between nifDK and nifEN, and it has been speculated that the nifEN products might form a scaffold for FeMoco biosynthesis that later shifts FeMoco to the nifDK complex (Brigle et al. 1987). Imperial and his coworkers (1984) established that the nifQ gene product might be involved in the formation of a molybdenum–sulfur precursor to FeMoco. Mutations in nifB result in the formation of an immature MoFe protein that lacks FeMo cofactor. It can be activated in vitro by adding FeMo cofactor that has been isolated from wild-type MoFe protein (Roberts et al. 1978). Mutations in the nifV gene result in the formation of a nitrogenase with a bound citrate rather than homocitrate. The nifV product is homocitrate synthase (Zheng et al. 1997). Thus, on the basis of mutational studies, the function of various other nif genes has been confirmed. In contrast to Klebsiella, the nif organization is somewhat more complex in Azotobacter vinelandii. In Azotobacter the genes coding for the Mo-dependent nitrogenase components (nifHDK) and their regulatory and assembly systems are located in two discrete regions (O’Carroll and Dos Santos 2011). The organization of nitrogen-fixing genes along with their genetic regulation in different rhizobia was extensively reviewed by Fischer (1994). According to him, rhizobial nif genes are structurally homologous to the 20 K. pneumoniae nif genes, and it is inferred that a conserved nif gene plays a similar role in rhizobia as in K. pneumoniae.

Besides the nif genes, the “fix”- and “nod”-type genes are associated with biological nitrogen fixation and nodule formation in rhizobial species, and many do not have homologues in free-living diazotrophs like K. pneumoniae. The fix genes represent a very heterogeneous class including genes involved in the development and metabolism of bacteroids. Studies by Anthamatten and Hennecke (1991) and Batut et al. (1991) have established that the fixL, fixJ, and fixK genes encode regulatory proteins. The fixABCX genes code for an electron transport chain to nitrogenase (Fischer 1994). Mutations in any one of the fixABCX genes of S. meliloti, B. japonicum, and A. caulinodans completely abolish nitrogen fixation. All four fixGHIS gene products are predicted to be transmembrane proteins, but further biochemical analysis is required to define their function in rhizobial nitrogen fixation (Fischer 1994). The fixNOQP genes encode the membrane-bound cytochrome oxidase that is required for the respiration of the rhizobia in low-oxygen environments (Delgado et al. 1998). Johnston and his coworkers discovered the presence of nodulation genes in a plasmid of Rhizobium leguminosarum, and mutation of those genes rendered them nonfunctional. Later studies (Schultze and Kondorosi 1998; Perret et al. 2000) ascertained that nod, nol, and noe genes produce nodulation signals. The interplay of different nod genes, the triggering of root nodule formation, signaling cascades, and the development of the nodule meristem were reported by a number of researchers (Yang et al. 1999; Long 2001; Geurts and Bisseling 2002). In most species, the nodABC genes are part of a single operon. Inactivation of these genes abolishes the ability to elicit any symbiotic reaction in the plant (Long 1989). Over the years other nod genes like nodD, nodEF, nodS, nodL, and nodHPQ have been characterized in many rhizobia. Like the rhizobia, Azospirillum harbors a megaplasmid and sequences similar to nod genes (Elmerich 1984).
Frankia on the other hand houses a number of nif genes, but researchers failed to spot nod genes in Frankia (Ceremonie et al. 1998).

Understanding of the genetic machinery behind biological nitrogen fixation attained new heights with the arrival of complete genome sequences of various diazotrophs. Recent advances in genome sequencing have opened exciting new perspectives in the field of genomics by providing the complete gene inventory of rhizobial microsymbionts. Genomics has enabled thorough analysis of the gene organization of nitrogen-fixing species, the identification of new genes involved in nitrogen fixation, and the identification of new diazotrophic species. The genome of Mesorhizobium loti strain MAFF303099 (Kaneko et al. 2000) was the first to be sequenced from a symbiotic bacterium, and it was followed by that of Sinorhizobium meliloti (Puhler et al. 2004). The completion of the genomes of Rhizobium leguminosarum bv viciae (Young et al. 2006), Rhizobium etli (Gonzalez et al. 2006), Bradyrhizobium strains, and Frankia strains (Normand et al. 2007), together with sequences for a number of free-living diazotrophs spanning different habitats and ecological niches, bolstered nitrogen fixation research. The genome information from all these nitrogen-fixing organisms allows researchers to rapidly apply information obtained from genome sequencing to the developing area of functional genomics, which will provide new insights into the complex molecular relationships that support both symbiotic and nonsymbiotic nitrogen fixation. DNA array technologies are now being used to monitor the expression of a whole genome in a single experiment. The first large-scale approach to transcriptional analysis of an entire symbiotic replicon was a high-resolution transcriptional analysis of the symbiotic plasmid of Rhizobium sp. NGR234 (Perret et al. 1999) at the Universite de Geneve, which developed methods to study the regulation of bacterial genes during symbiosis. The transcriptome of S. meliloti has been examined under a variety of conditions, including in planta (Ampe et al. 2003; Berges et al. 2003).
Functional gene arrays or GeoChips are also being utilized for high-throughput analysis of microbial communities involved in nitrogen fixation. Xie et al. (2011) have utilized GeoChip-based analysis to screen out functional genes associated with N-cycle in extreme environment like acid-mine drainage.

3 The Application of Bioinformatics in BNF Research

As we enter the post-genomics era, bioinformatics tools have emerged as an important means of research in biological nitrogen fixation. Large-scale genome projects have resulted in the availability of a tremendous amount of biological data. These data include information about genomes, which in turn gives insight into proteins, codon usage, etc. With the current deluge of data, computational methods have become indispensable to biological investigations. The development of bioinformatics and statistical genetics has resulted in the production of a number of tools, which are used to annotate genomes and obtain productive information from them (Hogeweg 2011). Originally developed for the analysis of biological sequences, bioinformatics now encompasses a wide range of subject areas including structural biology, genomics, and gene expression studies.

One of the primary applications of bioinformatics is the organization of biological data in databases that allow researchers to access existing information with ease. Open-access databases like GenBank, EMBL, and DDBJ now house thousands of nifH and nifD sequences. The number of fully sequenced and assembled diazotrophic genomes deposited in the databases has also gone up in the last few years. Simultaneously, new databases exclusively devoted to various aspects of biological nitrogen fixation, like NodMutDB (Nodulation Mutant Database) (Mao et al. 2005), RhizoGATE (Becker et al. 2009), RhizoBase (http://genome.kazusa.or.jp/rhizobase/), etc., have also surfaced in recent years. EST programs conducted in the model legume M. truncatula have led to the development of databases that allow data mining to identify genes relevant for nitrogen-fixing symbioses, for example, the TIGR M. truncatula Gene Index (http://www.tigr.org/tdb/mtgi) (Quackenbush et al. 2000), the M. truncatula database MtDB2 (http://www.medicago.org), and the database of the Medicago Genome Initiative (Bell et al. 2001). The data present in the various databases can be analyzed and interpreted in a biologically meaningful manner with the aid of computational tools.

Nowadays, the rapid increase in the number of prokaryotic species with sequenced genomes enables the development of in silico searching tools to identify complex biochemical pathways such as nitrogen fixation. Such predictions, although very accurate, yield putative results and do not obviate the need for genetic and biochemical confirmation of gene function. Computational prediction tools like BLAST (Basic Local Alignment Search Tool) are being used by researchers for examining the occurrence and distribution of nitrogen fixation genes. The genomes present in the database are being scanned using NifHDK as query sequences (O’Carroll and Dos Santos 2011). Phylogenies for the major nif operon genes have been inferred by distance matrix-based methods such as neighbor-joining or UPGMA, or by maximum likelihood-based methods, in an attempt to understand the timing and complex genetic events that have marked the history of nitrogen fixation (Raymond et al. 2004). Computational tools are also now routinely employed by researchers (Amadou et al. 2008; Carvalho et al. 2010; Peralta et al. 2011; Black et al. 2012) to compare the entire genomes of diazotrophs, which permits the study of more complex evolutionary events, such as gene duplication and horizontal gene transfer, and the prediction of factors important in bacterial speciation. Comparative genomics of Frankia yielded vital information regarding their evolutionary history and linked the inconsistency of genome size with the biogeographic history of the host plants harboring the microbial strains (Normand et al. 2007). Systems biology is another area where computer-based simulation has been used extensively to analyze and visualize the complex connections and circuits of cellular pathways such as nitrogen fixation. Zhao and his colleagues (2012) used several in silico tools for the reconstruction of the metabolic network involved in symbiotic nitrogen fixation in S. meliloti 1021.
It provided a knowledge-based framework for better understanding the symbiotic relationship between rhizobia and legumes. The nifH gene is the most widely sequenced marker gene used to identify nitrogen-fixing Bacteria and Archaea. Many PCR primers have been developed to target the nifH gene with the purpose of amplifying this gene sequence. Various program tools like Primer designer, PrimerSelect, Primer3, etc., are now available which assist in designing these primers and evaluating the primer through e-PCR (Schuler 1997). Recently Gaby and Buckley (2012) made a thorough in silico evaluation of the various nifH primers.
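
The in silico evaluation of primers mentioned above can be illustrated with a toy example. The sketch below is not the e-PCR program or any published nifH primer set; the function names, the short template, and both primers are hypothetical and chosen only to show the idea. It assumes an unambiguous A/C/G/T template, allows IUPAC degenerate codes in the primers, and reports the regions that a forward and reverse primer would bracket within a mismatch tolerance.

```python
# Toy in silico PCR: find where a primer pair would bracket an amplicon.
# All names and sequences are hypothetical; this is not the e-PCR program.

# IUPAC nucleotide codes mapped to the template bases each code accepts.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous (A/C/G/T) DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def binding_sites(primer, strand, max_mismatches=0):
    """Start indices on `strand` where the primer matches within the tolerance."""
    primer = primer.upper()
    k = len(primer)
    return [
        i
        for i in range(len(strand) - k + 1)
        if count_mismatches(primer, strand[i:i + k]) <= max_mismatches
    ]


def predict_amplicons(template, forward, reverse,
                      max_mismatches=0, max_length=2000):
    """Return (start, end, sequence) for each predicted product.

    Coordinates are 0-based on the plus strand, with `end` exclusive.
    The reverse primer is searched on the minus strand and its position
    is converted back to plus-strand coordinates.
    """
    plus = template.upper()
    minus = reverse_complement(plus)
    n = len(plus)
    amplicons = []
    for start in binding_sites(forward, plus, max_mismatches):
        for r in binding_sites(reverse, minus, max_mismatches):
            end = n - r  # plus-strand position just past the reverse primer site
            length = end - start
            if len(forward) + len(reverse) <= length <= max_length:
                amplicons.append((start, end, plus[start:end]))
    return amplicons


if __name__ == "__main__":
    template = "TTTACGTACGGGGGGCATGGACCC"  # made-up 24-nt template
    # Degenerate forward primer (Y = C or T) and an exact reverse primer.
    print(predict_amplicons(template, "ACGTAY", "TCCATG"))
    # -> [(3, 21, 'ACGTACGGGGGGCATGGA')]
```

Real primer evaluation tools also consider melting temperature, 3'-end stability, and primer-dimer formation, none of which is modeled here; the sketch covers only the sequence-matching step.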

Bioinformatics is also indispensable for the examination of the data obtained in proteome analysis. An excellent resource of Internet-accessible proteome databases is the Expert Protein Analysis System (ExPASy), available online at http://www.expasy.ch/ (Gasteiger et al. 2003). Furthermore, software packages have been developed that can take multiple protein-expression profiles and automatically identify quantitative changes of interest. Two-dimensional electrophoresis databases are accessible on the Internet and can be browsed with interactive software and integrated with in-house results. The Clusters of Orthologous Groups of proteins (COG) database represents an attempt at a phylogenetic classification of proteins from complete genomes (http://www.ncbi.nlm.nih.gov/COG) (Tatusov et al. 2000). It is intended to serve as a platform for functional annotation of newly sequenced genomes and for studies on genome evolution. In addition, the identification of domains as subsets of proteins has been a very promising approach, implemented by databases such as InterPro (http://www.ebi.ac.uk/interpro/). Proteomic analysis has revealed the direct genome functionality in a number of diazotrophic genomes (MacLean et al. 2007). Smit and coworkers (2012) have used various proteomics approaches along with bioinformatics tools for proteomic phenotyping of Novosphingobium nitrogenifigens, a free-living diazotroph. Rapid developments of technological expertise in proteomics coupled with the improvement of in silico tools have resulted in a deluge of structural information that promises to accelerate nitrogen fixation research.

As we march into the new millennium, practical application of computational tools to decipher meaningful information from the available data is inevitable. Some of the important in silico tools used for research on various aspects of biological nitrogen fixation are listed in Table 1. Bioinformatics has the potential to elevate research on biological nitrogen-fixing bacteria and their protein machinery to the next level. The availability of bioinformatics tools has provided an opportunity to focus on comparative genomics and the molecular evolution of genomes, along with conformational and structural details of the proteins involved. Structural studies of proteins will provide a better understanding of the functional evolution of diazotrophy.

Table 1 Some bioinformatics tools used for research in biological nitrogen fixation

3.1 Research Trends in Codon Usage Analysis and Comparative Genomics

In the post-genomics era, the application of bioinformatics tools in comparative genomics has led to the belief that every genome has its own story. The genetic code and its usage preferences are among the most interesting aspects of biological science. Early on, the majority of work on codon usage patterns focused on E. coli (Peden 1999). Gradually, bioinformatics analysis of codon usage was extended to mammalian, bacterial, bacteriophage, viral, and mitochondrial genes. Sharp and Li (1987) pioneered the Codon Adaptation Index (CAI), which assesses the similarity between the synonymous codon usage of a gene and that of a reference set. Besides CAI, several indices such as GC content, GC3 content, effective number of codons (Nc) (Wright 1990), relative synonymous codon usage (RSCU) (Sharp et al. 1986), Codon Bias Index (CBI), and Fop (frequency of optimal codons) (Ikemura 1985) are important in studies of codon usage patterns. Preliminary work on codon usage in nitrogen-fixing diazotrophs was initiated by Mathur and Tuli (1991). Ramseier and Gottfert (1991) reported differences in codon usage and GC content in Bradyrhizobium genes. Moderate codon bias was attributed to translational selection in nitrogen-fixing genes of Bradyrhizobium japonicum USDA 110 (Sur et al. 2005). The analysis of synonymous codon usage patterns of three Frankia genomes (strains CcI3, ACN14a, and EAN1pec) revealed that codon usage was highly biased, although variations were noticed among the three strains (Sen et al. 2008). Highly expressed genes in Frankia were predicted using the CAI. Synonymous codon usage analysis in Azotobacter vinelandii revealed a considerable amount of heterogeneity (Sur et al. 2008). About 503 potentially highly expressed genes were identified; most of them were linked to metabolic functions, and 10 were associated with the core nitrogen-fixing mechanism. Sen et al. (2012) explored the role of the rare TTA codon in the genome of the diazotrophic actinomycete Frankia.
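Two of the indices named above are simple enough to compute directly. The sketch below, which assumes the standard genetic code and an in-frame coding sequence, calculates RSCU (the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally) and GC3 (the fraction of G or C at third codon positions). The function names rscu and gc3 and the toy sequence are illustrative assumptions and do not reproduce any particular published tool.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def rscu(cds):
    """Relative synonymous codon usage for every codon of each amino acid
    that has synonymous codons and occurs at least once in the sequence."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp), unused ones.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        expected = total / len(family)  # count under equal usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


def gc3(cds):
    """Fraction of codons with G or C at the third position."""
    thirds = [codon[2] for codon in split_codons(cds)]
    return sum(base in "GC" for base in thirds) / len(thirds) if thirds else 0.0


# Toy example: four leucine codons, three CTG and one TTA.
toy = "CTGCTGCTGTTA"
print(rscu(toy)["CTG"], rscu(toy)["TTA"], gc3(toy))
```

An RSCU value above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks an underused codon, which is the sense in which a codon such as TTA can be called rare in a high-GC genome.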

Besides codon usage, the molecular evolution of genes is another aspect that needs to be investigated. A more reliable index of genetic drift over evolutionary time is the ratio of Ka (nonsynonymous substitutions per nonsynonymous site) to Ks (synonymous substitutions per synonymous site) for a large set of genes, based on comparisons of related species. The Ka to Ks ratio, which is almost always less than one, is widely used as an indicator of the extent of purifying selection acting to conserve coding sequences. This parameter has been widely applied in the analysis of adaptive molecular evolution and is regarded as a general method of measuring the rate of sequence evolution in biology. Program packages like PAML (Yang 1997) have been extensively used for the estimation of nucleotide substitution rates based on phylogenetic analysis by maximum likelihood (ML). Ka/Ks parameters have been used to assess the molecular evolution of plant hemoglobin genes (Guldner et al. 2004), secretory protein genes in Streptomyces and yeast (Li et al. 2009b), and various disease-causing genes. Among diazotrophs, Crossman et al. (2008) measured the rates of synonymous (Ks) and nonsynonymous (Ka) substitutions in orthologous genes of R. etli and R. leguminosarum. More recently, Peralta et al. (2011) calculated synonymous and nonsynonymous substitution rates of orthologs shared by five species of Rhizobiales: three plant symbionts, one plant pathogen, and one animal pathogen. Apart from whole genomes, the molecular evolution of genes responsible for symbiotic association and nodulation, such as nodule-specific genes (Yi 2009) and, recently, SymRK (Mahe et al. 2011), has been specifically analyzed. However, many symbiotic genes from a wide range of diazotrophs remain to be analyzed before a complete picture of their evolutionary rates in terms of sequence features can be assembled.
The accumulation of bacterial whole-genome sequences also gives biologists more opportunities to explore and compare genomes on a larger scale. Comparative genomics has given rise to a new concept highlighting the great diversity between closely related strains. A species can be described by its pan-genome, i.e., the sum of a core genome containing genes present in all strains and a dispensable genome, with genes absent from one or more strains and genes unique to each strain (Medini et al. 2005). Studying the diversity within pan-genomes is of interest for the characterization of a species or genus. Low pan-genome diversity could reflect a stable environment, while bacterial species with substantial abilities to adapt to various environments would be expected to have high pan-genome diversity (Snipen and Ussery 2010). In 2005, Tettelin and colleagues introduced the concept of the “pan-genome” in Streptococcus agalactiae (Tettelin et al. 2005). Soon afterward, the pan-genome was widely used to provide insight into the evolution of S. pneumoniae (Hiller et al. 2007), H. influenzae (Hogg et al. 2007), E. coli (Rasko et al. 2008), and so on. Besides evolution, the pan-genome has been widely used to detect strain-specific virulence factors in pathogens such as L. pneumophila (D’Auria et al. 2010). Recently, the symbiotic pan-genome of the nitrogen-fixing bacterium Sinorhizobium meliloti has been explored using computational methods, and a set of accessory genetic factors related to the symbiotic process has been defined (Galardini et al. 2011). As complete nucleotide sequences of more chromosomes and symbiotic plasmids of nitrogen-fixing organisms become available, we have entered the phase of comparative genomics. Comparative genomics also enables a much deeper understanding of the origin and evolution of free-living and symbiotic nitrogen fixation. A comparative genomics approach was used by Carvalho et al. (2010) for the evolutionary characterization of diazotrophic and pathogenic bacteria of the order Rhizobiales. Black et al. (2012) worked on 14 strains of Rhizobiales to investigate the feasibility of defining a core “symbiome.” The authors’ group is currently engaged in comparative genomics of the nitrogen-fixing actinomycete Frankia and members of Rhizobiales using CMG-Biotools, a platform for comparative genomics. The proteomes are compared with BLASTP using the “50/50” rule, i.e., a BLASTP hit is considered significant if the alignment produces at least 50 % identity over at least 50 % of the length of the longer gene (either query or subject). The BLAST results are visualized in a BLAST matrix, which summarizes the results of pairwise genome comparisons. One such BLAST matrix produced for five Frankia strains is presented in Fig. 1. The comparison of these whole genomes has revealed valuable information, such as several events of lateral gene transfer, particularly in the symbiotic plasmids and genomic islands, that has contributed to a better understanding of the evolution of contrasting symbioses.
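The “50/50” rule described above can be written down in a few lines. The following Python sketch is a minimal illustration, not the CMG-Biotools implementation: the function names, the tuple layout of a hit (query ID, subject ID, percent identity, alignment length, query length, subject length), and the way a matrix cell is summarized as a fraction of query proteins are all assumptions made for the example.

```python
def passes_50_50(percent_identity, alignment_length, query_length, subject_length):
    """Apply the 50/50 rule to one BLASTP hit.

    A hit is kept if identity is at least 50 % and the alignment covers at
    least 50 % of the longer of the two sequences.
    """
    longest = max(query_length, subject_length)
    return percent_identity >= 50.0 and alignment_length >= 0.5 * longest


def shared_fraction(hits, n_query_proteins):
    """Fraction of query proteins with at least one significant hit.

    `hits` is an iterable of tuples:
    (query_id, subject_id, percent_identity, alignment_length,
     query_length, subject_length).
    This summary statistic is an assumed simplification of one cell of a
    BLAST matrix.
    """
    matched = {query_id
               for query_id, _subject, ident, aln_len, q_len, s_len in hits
               if passes_50_50(ident, aln_len, q_len, s_len)}
    return len(matched) / n_query_proteins if n_query_proteins else 0.0


# Hypothetical hits between two proteomes; only q1 satisfies both conditions.
toy_hits = [
    ("q1", "s1", 62.0, 180, 300, 320),  # 62 % identity over 180 of 320 residues
    ("q2", "s2", 80.0, 100, 300, 250),  # identity is high but coverage too short
    ("q3", "s3", 45.0, 290, 300, 300),  # coverage is high but identity too low
]
print(shared_fraction(toy_hits, n_query_proteins=4))
```

Measuring coverage against the longer sequence, rather than the query alone, keeps a short protein from being counted as equivalent to a much longer one on the strength of a single shared domain.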

3.2 Bioinformatics Approaches for the Characterization of Proteins Related to BNF

Apart from sequence-based analysis and comparative genomics, structural biology is another field that has benefited hugely from bioinformatics tools. Structural analyses include protein and nucleic acid structure prediction, comparison, classification, and assessment of structure–function relationships. Structural analysis often depends in turn on the results of sequence analysis. For example, protein structure prediction depends on sequence alignment data. Thus the two aspects of bioinformatics analysis are not isolated but often interact to produce integrated results.

Developments in the field of proteomics have made a large amount of biological data available in the public domain. These data include amino acid sequences of nitrogenase proteins from a wide range of microbes. However, very little is known about the structure and role of all these proteins. X-ray crystallography and NMR are by far the two most common means of determining protein structure experimentally. Kim and Rees (1992) provided a detailed crystallographic structure of the molybdenum–iron protein of the Azotobacter vinelandii nitrogenase. The crystal structure of the nitrogenase molybdenum–iron protein has also been described from Clostridium pasteurianum (Kim et al. 1993). The X-ray crystal structure of Klebsiella pneumoniae nitrogenase component 1 (Kp1) has also been determined and refined to a resolution of 1.6 Å (Mayer et al. 1999). The 2.9 Å crystal structure of the NifH protein from Azotobacter vinelandii was obtained by Georgiadis et al. (1992). However, the tertiary structures of a large number of nitrogenase proteins from different diazotrophs, particularly symbiotic ones, have not yet been resolved. The exact mechanism by which these proteins work is also relatively unknown, owing to the difficulty of obtaining crystals of nitrogenase with bound nitrogen: the resting state of the MoFe protein does not bind nitrogen. Moreover, in recent years, a number of discrepancies have emerged in protein structures resolved by X-ray crystallography, leading to the retraction of papers (Chang et al. 2006). In this regard, a viable alternative is to predict the 3D structure of proteins by homology modeling and to validate it properly. Homology modeling is a reliable technique that can consistently predict the 3D structure of a protein with a precision akin to that of a low-resolution experimental structure (Marti-Renom et al. 2000). This technique depends on the alignment of a protein sequence of unknown structure (target) with that of a homologue of known structure (template). It is particularly important for organisms with slow growth rates, which pose difficulties for protein purification. Browne et al. (1969) published the first report on homology modeling: a model of α-lactalbumin was constructed by taking the coordinates of hen egg-white lysozyme and modifying, by hand, those amino acids that did not match the structure. Since the mid-1980s, a large number of homology models of proteins with different folds and functions have been reported in the literature (Johnson et al. 1994; Sali 1995). Homology modeling approaches were first applied to the structural analysis of the nitrogenase iron protein from Trichodesmium sp., a marine filamentous nitrogen-fixing cyanobacterium (Zehr et al. 1997a). Standard homology modeling approaches have also been used to generate reliable models of the nitrogenase Fe protein from the thermophilic Methanobacter thermoautotrophicus based on the structure of the Azotobacter vinelandii nitrogenase Fe protein (Sen and Peters 2006). The authors’ group has been involved in determining the 3D structure of the NifH protein from various diazotrophs such as Frankia (Sen et al. 2010) and Bradyrhizobium ORS 278 (Thakur et al. 2012) using homology modeling. The model of Frankia NifH (Fig. 2) was based on a template nitrogenase iron protein from Azotobacter vinelandii. The structure is reliable and offers insights into the 3D structural framework as well as the structure–function relationship of the NifH protein. Homology-based models are quite useful in providing the conformational properties and structure–function relationships of these proteins.

A number of aspects of nitrogenase, particularly structure–function relationships, are interesting areas of fundamental research. The three-dimensional structure of a protein such as nitrogenase is often considered an ideal model system for the study of complex metal cluster-mediated catalysis, electron transfer, complex metal cluster assembly, protein–protein interactions, and nucleotide-dependent signal transduction. Molecular dynamics (MD) simulations offer details about molecular motions as a function of time and are widely used to study protein motions at the atomic level. The first protein simulation, lasting 9.2 ps, was carried out for bovine pancreatic trypsin inhibitor (BPTI) (McCammon et al. 1977). Case and Karplus’s 1979 work on the dynamics of ligand binding to heme protein is arguably the first simulation of a ligand moving through a protein (Case and Karplus 1979). The first application of normal modes to identify low-frequency oscillations, using energy minimization of the molecular mechanics force field of a protein, was described by Brooks and Karplus (1983). This is the basic technique for identifying domain-level motions in a protein. The first simulation of a protein in explicit water was done by Levitt and Sharon (1988).

Metalloproteins like nitrogenase are a vast class of biological molecules, which are responsible for many vital functions. Despite the intrinsic difficulties of these systems, particularly those related to parameterization of the metal cofactors, they have been the object of several MD simulations. These studies are mainly focused on structural aspects, since the cluster either has a storage role or is involved in an electron-transfer process in these proteins. Among metalloproteins having an FeS cluster cofactor, molecular dynamics simulations have been carried out for proteins such as heme-containing cytochrome P450 (Kuhn et al. 2001), rubredoxins (Grottesi et al. 2002), the 3Fe–4S cluster-containing protein ferredoxin I (Meuwly and Karplus 2004), adenosine phosphosulfate reductase (dos Santos et al. 2009), and hydA1 hydrogenase (Sundaram et al. 2010). More recently, molecular modeling, dynamics, and docking studies on both A. vinelandii and G. diazotrophicus FeSII proteins and nitrogenases were carried out by Lery et al. (2010), elucidating molecular aspects of protein–protein interaction. In the MD simulation of metalloproteins, the force field parameters of the metal ion and its ligands need to be defined beforehand, taking into account the nature of the metal ion, its coordination number, geometry, oxidation and spin states, and the nature of its ligands. Several sets of parameters have been reported in the literature for the active sites of the most widely studied metalloproteins, including the coordination geometries of the metal ligand (Banci and Comba 1997; Norrby and Brandt 2001; Comba and Remenyi 2002). One of the parameters that significantly affects the overall protein structure is the set of partial charges on the atoms of the metal–ligand moiety. In the bonded model, partial charges are commonly calculated through the RESP (Restrained Electrostatic Potential) methodology (Fox and Kollman 1998) applied to semiempirical or ab initio calculations. The ab initio calculations are mostly performed through density functional theory (DFT) calculations with the B3LYP functional or through Hartree–Fock calculations (Banci 2003). Thus, developing proper parameters for the metal cofactors requires combining quantum calculations with classical molecular mechanics calculations. This will enable the description not only of structural features but also of the reactivity of metalloproteins.

3.3 Tracing the Evolution of BNF Through Bioinformatics

3.3.1 Classical Approach

Researchers have long sought to answer the question of when nitrogen fixation began and what evolutionary pressures affected it (Postgate and Eady 1988; Berman-Frank et al. 2003). The emergence and evolution of nitrogen fixation ability (diazotrophy) among prokaryotes is complex and has not yet been fully elucidated. The patchy distribution of the highly conserved nitrogenase enzyme among Bacteria and Archaea has led to conflicting hypotheses on the origin of BNF. The first hypothesis holds that nitrogen fixation is an ancient function of the last common ancestor of Bacteria and Archaea that was vertically transmitted but has undergone widespread gene loss among descendants, with horizontal transfer in some isolated instances (Hennecke et al. 1985; Normand and Bousquet 1989; Fani et al. 2000; Berman-Frank et al. 2003). During this postulated time period, reduced nitrogen may have been very abundant, and the initial function of nitrogenase was probably very different. One proposed initial function of the ancient nitrogenase is the detoxification of cyanides and other chemicals (Silver and Postgate 1973; Fani et al. 2000). This idea is based on the observation that nitrogenase reduces a number of alternative substrates in addition to N₂, several of which are toxins (e.g., cyanides). The second hypothesis proposes that nitrogen fixation was an anaerobic ability that appeared after the emergence of oxygenic photosynthesis and subsequently spread among lineages through horizontal transfer (Postgate 1982; Postgate and Eady 1988). Recently, Hartmann and Barnum (2010) examined Mo-nitrogenase phylogeny and reached a conclusion that combines both theories of diazotrophic evolution.

Nitrogenase genes are highly conserved at both the chemical and genetic levels across wide phylogenetic ranges and among closely related organisms. This conservation makes nitrogenase genes useful as genetic markers for phylogenetic analysis, helping to answer questions about the evolution of nitrogen fixation and its genes. Raymond et al. (2004) reported that nitrogenase evolved in multiple lineages and that there is evidence of loss, duplication, and horizontal and vertical transfer of the nitrogenase genes and operons during the course of evolution. nifD and nifK are thought to be the result of an in-tandem gene duplication (Fani et al. 2000; Postgate and Eady 1988), giving the functional components of the enzyme. A second duplication event is thought to have occurred for the nifEN genes. To date, most studies concerning the evolution of nitrogen fixation have focused on the nif genes, primarily the highly conserved nifH gene but also the larger but less conserved nifD, nifK, nifE, and nifN genes (Normand and Bousquet 1989; Normand et al. 1992; Hirsch et al. 1995; Fani et al. 2000). Sequence alignment-based methods are widely used to study the evolution of relevant nif genes. Young (2005) discussed the phylogeny and evolution of nitrogenases in detail. According to Young, true NifH proteins can be divided into three types: Type B (“bacterial”) is the best represented and includes enzymes from the proteobacteria, cyanobacteria, and firmicutes; Type C (“clostridial”) is found in the firmicute bacterium Clostridium, the green sulfur bacterium Chlorobium, and also in the archaeon Methanosarcina; and Type A is associated with the “alternative” nitrogenases that do not contain molybdenum and is found in both archaea and proteobacteria. There are also a large number of more distant relatives, notable among them the light-independent protochlorophyllide (Pchlide) reductase. The similarity between these proteins and NifH was analyzed and discussed by Burke et al. (1993), who argued that nitrogen fixation probably originated before photosynthesis, so the photosynthesis enzymes would have been derived from NifH rather than the other way round. The phylogenies of the NifDKEN family have also been the topic of many studies. Dedysh et al. (2004) utilized the NifD phylogeny to assess the nitrogen fixation capabilities of methanotrophic bacteria. Henson et al. (2004b) reexamined the phylogeny of nitrogen fixation by analyzing only the molybdenum-containing nifD gene from cyanobacteria, proteobacteria, and Gram-positive bacteria. The strict requirement for NifH in biological nitrogen fixation and its universal presence in diazotrophs have resulted in this protein serving as a sequence tag or barcode for the identification of nitrogen fixers. Genomic analysis using the sequence of NifH as a query results in BLAST hits that include the NifH, VnfH, and AnfH components of the Mo-, V-, and Fe-only nitrogenases, respectively (Raymond et al. 2004). Recently, Dos Santos and colleagues proposed a new criterion for computational prediction of nitrogen fixation: the presence of a minimum set of six genes coding for structural and biosynthetic components, namely, NifHDK and NifENB (Dos Santos et al. 2012). Latysheva et al. (2012) used various nif orthologs in empirical Bayesian ancestral state reconstructions to investigate the evolution of nitrogen fixation in cyanobacteria.
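The six-gene criterion mentioned above lends itself to a simple presence check once a genome has been annotated. The sketch below is only an illustration of that idea: it assumes that gene names are already available as a list of strings, and the names MINIMUM_NIF_SET and predicted_diazotroph are hypothetical. In practice, detecting each gene would rely on homology searches rather than on annotation labels.

```python
# Minimum gene set: structural genes nifH, nifD, nifK and biosynthetic
# genes nifE, nifN, nifB.
MINIMUM_NIF_SET = frozenset({"nifH", "nifD", "nifK", "nifE", "nifN", "nifB"})


def predicted_diazotroph(annotated_genes):
    """Return (prediction, missing_genes) for a list of annotated gene names.

    The genome is predicted to fix nitrogen only if all six genes of the
    minimum set are present. Name matching is case-insensitive.
    """
    found = {name.strip().lower() for name in annotated_genes}
    missing = sorted(gene for gene in MINIMUM_NIF_SET
                     if gene.lower() not in found)
    return (not missing, missing)


# Hypothetical annotations for two genomes.
print(predicted_diazotroph(["nifH", "nifD", "nifK", "nifE", "nifN", "nifB", "nifA"]))
print(predicted_diazotroph(["nifH", "nifD", "nifK", "nifN"]))
```

Returning the list of missing genes alongside the prediction makes it easy to distinguish a genome with no nif genes at all from one that carries nifH alone, which by this criterion would not count as a nitrogen fixer.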

Over the years, there has been a debate among the workers regarding horizontal gene transfer (HGT) versus vertical descent as the dominant force in the evolution and distribution of N fixation. In the case of an early origin and subsequent vertical descent of the nif genes, a comparison of SSU ribosomal phylogeny and the phylogeny of nif genes should reveal roughly the same features, assuming that the mutation rates in both genes were similar. In the case of a late development and a mainly horizontal distribution of the genes, the phylogeny of the nif genes should deviate significantly from the rRNA-based standard tree. A number of researchers have presented strong evidence that SSU rRNA phylogeny and phylogeny based on the nif genes are in general agreement, suggesting that they have evolved in a similar fashion (Hennecke et al. 1985; Young 1992; Zehr et al. 1997b). However, numerous studies have highlighted instances of possible horizontal gene transfer in nifD (Parker et al. 2002; Qian et al. 2003; Henson et al. 2004a, b), nifH (Normand and Bousquet 1989; Hurek et al. 1997; Cantera et al. 2004; Dedysh et al. 2004), and nifK (Kessler et al. 1997) based on incongruence with 16S rRNA trees. Other studies have found support for both vertical descent and horizontal transfer (Hirsch et al. 1995). Haukka et al. (1998) proposed that horizontal gene transfer may have played an increasing role at genus and lower taxonomic levels. This may be especially important in organisms that have nif genes located on plasmids (Normand and Bousquet 1989).

3.3.2 Alternative Approaches

To trace the evolution of proteins within a set of divergently evolved proteins, it is useful to construct phylogenetic trees based on similarities in the amino acid sequences and in the base sequences of the genes. However, previous studies suggest that the origin and extant distribution of nitrogen fixation are perplexing from a phylogenetic perspective, largely because of factors that confound molecular phylogeny, such as sequence divergence, paralogy, and horizontal gene transfer (Raymond et al. 2004). This leads to the assumption that sequence-based phylogeny is not enough to reveal the complex evolutionary path of BNF. Moreover, many workers (Nadler 1995; Qi et al. 2004; Sims et al. 2009) have pointed out shortcomings of sequence alignment-based methods. Alternative phylogenetic approaches are therefore being sought. The alignment-free condensed matrix method, which relies on nucleotide triplets, is one such approach. The condensed matrix method takes into account a set of invariants of a DNA sequence and uses them to determine the extent of resemblance among DNA sequences (Randic et al. 2001). In this method, all possible triplets in the nif genes were tallied, and matrices were formed from them. The leading eigenvalues of these matrices were then calculated and used to construct distance matrices and, in turn, trees. This approach has been utilized in phylogenetic analysis of aminoacyl-tRNA synthetases (Mondal et al. 2008), swine flu genomes (Sur et al. 2010), bacterial zeta toxin (Mondal et al. 2011), and nitrogenase proteins (Sur et al. 2010). A cladogram showing the evolution of the nifH gene in various diazotrophs, constructed by the condensed matrix method, is presented in Fig. 3. In the cladogram, the placement of Frankia ACN14a away from the other actinobacteria and the isolation of Synechococcus sp. JA-3-3Ab from the rest of the cyanobacterial strains are quite interesting. Members of various classes of Proteobacteria (alpha, beta, gamma, and delta) are clustered together in the triplet-based phylogenetic tree. The mottled distribution of cyanobacteria is an indication of their polyphyletic origin. Thus, condensed matrix method-based phylogeny appears to be a suitable method for explaining the complex events marking the evolution of nitrogen fixation.
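The general flow of such a triplet-based, alignment-free comparison can be sketched as follows. The text above does not specify how the triplet counts are arranged into matrices, so this example makes its own assumptions, all of which are illustrative: the 64 triplet frequencies are split into four 4 × 4 matrices according to the first base, the leading eigenvalue of each matrix is estimated by power iteration, the four eigenvalues form the sequence descriptor, and sequences are compared by Euclidean distance. It should be read as one possible instance of the idea, not as the published procedure.

```python
from math import sqrt

BASES = "ACGT"


def triplet_matrices(seq):
    """Four 4x4 matrices of overlapping-triplet frequencies.

    Matrix k holds triplets whose first base is BASES[k]; rows index the
    second base and columns the third. Triplets with other symbols are skipped.
    """
    index = {base: i for i, base in enumerate(BASES)}
    mats = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if all(base in index for base in triplet):
            a, b, c = (index[base] for base in triplet)
            mats[a][b][c] += 1.0
            total += 1
    if total:  # convert counts to frequencies so length does not dominate
        mats = [[[x / total for x in row] for row in m] for m in mats]
    return mats


def leading_eigenvalue(matrix, iterations=500):
    """Estimate the largest eigenvalue of a nonnegative matrix.

    Power iteration is run on (matrix + I); the shift avoids oscillation,
    and 1 is subtracted from the result. Convergence can be slow for
    sparse matrices, so the value is an approximation.
    """
    n = len(matrix)
    vec = [1.0] * n
    value = 1.0
    for _ in range(iterations):
        nxt = [vec[i] + sum(matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        value = max(nxt)  # always >= max(vec) > 0 for nonnegative input
        vec = [x / value for x in nxt]
    return value - 1.0


def descriptor(seq):
    """Vector of the four leading eigenvalues for one sequence."""
    return [leading_eigenvalue(m) for m in triplet_matrices(seq)]


def triplet_distance(seq_a, seq_b):
    """Euclidean distance between the descriptors of two sequences."""
    da, db = descriptor(seq_a), descriptor(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(da, db)))


# Toy sequences; real input would be full-length nif gene sequences.
print(triplet_distance("ACGTACGTAC", "ACGTACGTAC"))
print(triplet_distance("AAAA", "CCCC"))
```

Computing triplet_distance for every pair of genes yields a distance matrix that can then be passed to a distance-based tree-building method.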

Another suitable alternative to protein sequence alignment is structure-based phylogeny. It is well known that the 3D structures and structural features of homologous proteins are better conserved than their amino acid sequences (Chothia et al. 1986; Hubbard and Blundell 1987). It has been demonstrated several times that homologous proteins can diverge beyond recognition at the level of their amino acid sequences yet maintain similar structure and function. In several cases of low sequence similarity, proteins retain the fold as well as broad biochemical features and/or functional properties, suggesting an evolutionary connection (Murzin et al. 1995; Russell and Sternberg 1996). Previous studies (Balaji and Srinivasan 2001) have shown that in cases of poor sequence identity, structure-based phylogenies generate better models of protein evolution than traditional sequence-based methods. Hence, it is more appropriate to use similarities in 3D structure when modeling the evolution of distantly related proteins. The construction of phylogenetic trees using 3D structures has been applied to a variety of protein families such as short-chain alcohol dehydrogenases (Breitling et al. 2001) and metallo-β-lactamases (Garau et al. 2005). More recently, a 3D structure-based phylogenetic approach has been used for functional characterization of proteins with cupin folds (Agarwal et al. 2009). It was revealed that structure-based clustering of members of the cupin superfamily reflects a function-based clustering. Moreover, the comparison of distance matrices utilized in phylogenetic tree construction methods has been considered equivalent to the comparison of phylogenetic trees based on protein structures (Balaji and Srinivasan 2001; Pazos and Valencia 2001). Therefore, such structure-based approaches can be used to assess the phylogenetic relationships of proteins involved in BNF, which share low sequence similarity but high structural resemblance with many proteins of diverse biological functions.

Along with the trajectory of evolution of diazotrophy in various organisms, another feature that needs attention is the functional divergence of the proteins involved in this biological process. Previous workers (Gu 1999; Dermitzakis and Clark 2001; Raes and Van de Peer 2003) have shown that gene duplication events often lead to a shift in protein function from an ancestral role resulting in divergence and as a consequence of which some residues are subjected to altered functional constraints. This implies that evolutionary rates at these sites will vary in different homologous genes of a gene family. Site-specific altered functional constraint (or shifted evolutionary rates) can be detected by comparing the rate correlation between gene clusters, when the phylogeny is given (Gu 1999). This approach has been earlier exploited by researchers to trace the functional divergence in vertebrate hemoglobin (Gribaldo et al. 2003), G-protein alpha subunits (Zheng et al. 2007), OPR gene family in plants (Li et al. 2009a), and anoctamin family of membrane proteins (Milenkovic et al. 2010). However, a broad picture on the functional divergence in the NifH/Bchl protein family is still unavailable.

4 Challenges and Future Prospects

Considerable progress has been made in understanding the machinery of biological nitrogen fixation in the last few decades. The major part of the research has focused on the structure of nitrogenase and on elucidating the composition and functions of all the nif gene products. In the past, the major roadblocks in BNF research were the difficulties associated with detecting nif genes in environmental samples and with crystallizing the nitrogenase enzymes. In the post-genomics era, these hurdles have largely been removed with the advent of metagenomic research and in silico protein modeling techniques. The challenge now is to put all the known information together and, with the combined application of biochemical, genetic, and bioinformatics techniques, to determine how nitrogenase functions at the molecular level. With the rapid increase in the number of complete genomes of varied diazotrophs, along with their nitrogen-fixing genes, in the public domain, bioinformatics tools have emerged as a potent weapon for tackling the unsolved mysteries of symbiotic and asymbiotic nitrogen fixation. They can be used to extract meaningful interpretations from sequence data. With the advent of new algorithms and computational tools for measuring structural divergence, the problems associated with the functional evolution of the nitrogenase system can also be tackled more effectively, and new insights can be gained. Genomic studies aided by bioinformatics tools offer a global view of the expression, regulation, dynamics, and evolution of the genomes of nitrogen-fixing microbes and can offer new opportunities to preserve and improve biotic resources.

References

Adams D (2000) Heterocyst formation in cyanobacteria. Curr Opin Microbiol 3:618–624
Agarwal G, Rajavel M, Gopal B, Srinivasan N (2009) Structure-based phylogeny as a diagnostic for functional characterization of proteins with a cupin fold. PLoS One 4:e5736
Amadou C et al (2008) Genome sequence of the beta-rhizobium Cupriavidus taiwanensis and comparative genomics of rhizobia. Genome Res 18:1472–1483
Ampe F, Kiss E, Sabourdy F, Batut J (2003) Transcriptome analysis of Sinorhizobium meliloti during symbiosis. Genome Biol 4:R15
Anthamatten D, Hennecke H (1991) The regulatory status of the fixL- and fixJ-like genes in Bradyrhizobium japonicum may be different from that in Rhizobium meliloti. Mol Gen Genet 225:38–48
Arnold W, Rump A, Klipp W, Priefer UB, Puhler A (1988) Nucleotide sequence of a 24,206-base-pair DNA fragment carrying the entire nitrogen fixation gene cluster of Klebsiella pneumoniae. J Mol Biol 203:715–738
Balaji S, Srinivasan N (2001) Use of a database of structural alignments and phylogenetic trees in investigating the relationship between sequence and structural variability among homologous proteins. Protein Eng 14:219–226
Baldani JI, Baldani VL (2005) History on the biological nitrogen fixation research in graminaceous plants: special emphasis on the Brazilian experience. An Acad Bras Cienc 77:549–579
CAS PubMed Google Scholar
Banci L (2003) Molecular dynamics simulations of metalloproteins. Curr Opin Chem Biol 7:143–149
CAS PubMed Google Scholar
Banci L, Comba P (1997) Molecular modeling and dynamics of bioinorganic systems. Kluwer Academic, Dordrecht
Google Scholar
Batut J, Santero E, Kustu S (1991) In vitro activity of the nitrogen fixation regulatory protein FIXJ from Rhizobium meliloti. J Bacteriol 173:5914–5917
CAS PubMed Central PubMed Google Scholar
Becker A et al (2009) A portal for rhizobial genomes: RhizoGATE integrates a Sinorhizobium meliloti genome annotation update with postgenome data. J Biotechnol 140:45–50
CAS PubMed Central PubMed Google Scholar
Bell CJ et al (2001) The Medicago Genome Initiative: a model legume database. Nucleic Acids Res 29:114–117
CAS PubMed Central PubMed Google Scholar
Benson DR, Silvester WB (1993) Biology of Frankia strains, actinomycete symbionts of actinorhizal plants. Microbiol Rev 57:293–319
CAS PubMed Central PubMed Google Scholar
Berges H et al (2003) Development of Sinorhizobium meliloti pilot macroarrays for transcriptome analysis. Appl Environ Microbiol 69:1214–1219
CAS PubMed Central PubMed Google Scholar
Berman-Frank I, Lundgren P, Falkowski P (2003) Nitrogen fixation and photosynthetic oxygen evolution in cyanobacteria. Res Microbiol 154:157–164
CAS PubMed Google Scholar
Berry AM, Harriott OT, Moreau RA, Osman SF, Benson DR, Jones AD (1993) Hopanoid lipids compose the Frankia vesicle envelope, presumptive barrier of oxygen diffusion to nitrogenase. Proc Natl Acad Sci U S A 90:6091–6094
CAS PubMed Central PubMed Google Scholar
Black M et al (2012) The genetics of symbiotic nitrogen fixation: comparative genomics of 14 rhizobia strains by resolution of protein clusters. Genes 3:138–166
PubMed Central PubMed Google Scholar
Bohlool BB, Ladha JK, Garrity DP, George T (1992) Biological nitrogen fixation for sustainable agriculture: a perspective. Plant and Soil 141:1–11
CAS Google Scholar
Breitling R, Laubner D, Adamski J (2001) Structure-based phylogenetic analysis of short-chain alcohol dehydrogenases and reclassification of the 17beta-hydroxysteroid dehydrogenase family. Mol Biol Evol 18:2154–2161
CAS PubMed Google Scholar
Brigle KE, Weiss MC, Newton WE, Dean DR (1987) Products of the iron-molybdenum cofactor-specific biosynthetic genes, nifE and nifN, are structurally homologous to the products of the nitrogenase molybdenum-iron protein genes, nifD and nifK. J Bacteriol 169:1547–1553
CAS PubMed Central PubMed Google Scholar
Brooks B, Karplus M (1983) Harmonic dynamics of proteins: normal modes and fluctuations in bovine pancreatic trypsin inhibitor. Proc Natl Acad Sci U S A 80:6571–6575
CAS PubMed Central PubMed Google Scholar
Browne WJ, North AC, Phillips DC, Brew K, Vanaman TC, Hill RL (1969) A possible three-dimensional structure of bovine alpha-lactalbumin based on that of hen’s egg-white lysozyme. J Mol Biol 42:65–86
CAS PubMed Google Scholar
Burgess BK, Lowe DJ (1996) Mechanism of molybdenum nitrogenase. Chem Rev 96:2983–3012
CAS PubMed Google Scholar
Burke DH, Hearst JE, Sidow A (1993) Early evolution of photosynthesis: clues from nitrogenase and chlorophyll iron proteins. Proc Natl Acad Sci U S A 90:7134–7138
CAS PubMed Central PubMed Google Scholar
Burns RC, Hardy RW (1975) Nitrogen fixation in bacteria and higher plants. Mol Biol Biochem Biophys 21:1–189
Google Scholar
Cantera JJ, Kawasaki H, Seki T (2004) The nitrogen-fixing gene (nifH) of Rhodopseudomonas palustris: a case of lateral gene transfer? Microbiology 150:2237–2246
CAS PubMed Google Scholar
Carvalho FM, Souza RC, Barcellos FG, Hungria M, Vasconcelos AT (2010) Genomic and evolutionary comparisons of diazotrophic and pathogenic bacteria of the order Rhizobiales. BMC Microbiol 10:37
PubMed Central PubMed Google Scholar
Case DA, Karplus M (1979) Dynamics of ligand binding to heme proteins. J Mol Biol 132:343–368
CAS PubMed Google Scholar
Ceremonie H, Cournoyer B, Maillet F, Normand P, Fernandez MP (1998) Genetic complementation of rhizobial nod mutants with Frankia DNA: artifact or reality? Mol Gen Genet (MGG) 260:115–119
CAS Google Scholar
Chang G, Roth CB, Reyes CL, Pornillos O, Chen YJ, Chen AP (2006) Retraction. Science 314:1875
CAS PubMed Google Scholar
Chothia C et al (1986) The predicted structure of immunoglobulin D1.3 and its comparison with the crystal structure. Science 233:755–758
CAS PubMed Google Scholar
Comba P, Remenyi R (2002) A new molecular mechanics force field for the oxidized form of blue copper proteins. J Comput Chem 23:697–705
CAS PubMed Google Scholar
Crossman LC et al (2008) A common genomic framework for a diverse assembly of plasmids in the symbiotic nitrogen fixing bacteria. PLoS One 3:e2567
PubMed Central PubMed Google Scholar
D’Auria G, Jimenez-Hernandez N, Peris-Bondia F, Moya A, Latorre A (2010) Legionella pneumophila pangenome reveals strain-specific virulence factors. BMC Genomics 11:181
Google Scholar
Dawson JO (1983) Dinitrogen fixation in forest ecosystems. Can J Microbiol/(Revue Canadienne de Microbiologie) 29:979–992
Google Scholar
Dedysh SN, Ricke P, Liesack W (2004) NifH and NifD phylogenies: an evolutionary basis for understanding nitrogen fixation capabilities of methanotrophic bacteria. Microbiology 150:1301–1313
CAS PubMed Google Scholar
Delgado MJ, Bedmar EJ, Downie JA (1998) Genes involved in the formation and assembly of rhizobial cytochromes and their role in symbiotic nitrogen fixation. Adv Microb Physiol 40:191–231
CAS PubMed Google Scholar
Dermitzakis ET, Clark AG (2001) Differential selection after duplication in mammalian developmental genes. Mol Biol Evol 18:557–562
CAS PubMed Google Scholar
Dixon R et al (1980) Analysis of regulation of Klebsiella pneumoniae nitrogen fixation (nif) gene cluster with gene fusions. Nature 286:128–132
CAS PubMed Google Scholar
Dobereiner J (1976) Plant genotype effects on nitrogen fixation in grasses. Basic Life Sci 8:325–334
CAS PubMed Google Scholar
dos Santos ES, Gritta DS, Taft CA, Almeida PF, Ramos-de-Souza E (2009) Molecular dynamics simulation of the adenylylsulphate reductase from hyperthermophilic Archaeoglobus fulgidus. Mol Simulat 36:199–203
Google Scholar
Dos Santos PC, Fang Z, Mason SW, Setubal JC, Dixon R (2012) Distribution of nitrogen fixation and nitrogenase-like sequences amongst microbial genomes. BMC Genomics 13:162
CAS PubMed Central PubMed Google Scholar
Eady RR (1996) Structure-function relationships of alternative nitrogenases. Chem Rev 96:3013–3030
CAS PubMed Google Scholar
Eady RR, Postgate JR (1974) Nitrogenase. Nature 249:805–810
CAS PubMed Google Scholar
Elmerich C (1984) Molecular biology and ecology of diazotrophs associated with non-leguminous plants. Nat Biotechnol 2:967–978
CAS Google Scholar
Fani R, Gallo R, Lio P (2000) Molecular evolution of nitrogen fixation: the evolutionary history of the nifD, nifK, nifE, and nifN genes. J Mol Evol 51:1–11
CAS PubMed Google Scholar
Fischer HM (1994) Genetic regulation of nitrogen fixation in rhizobia. Microbiol Rev 58:352–386
CAS PubMed Central PubMed Google Scholar
Fox T, Kollman PA (1998) Application of the RESP methodology in the parametrization of organic solvents. J Phys Chem B 102:8070–8079
CAS Google Scholar
Franche C, Lindstrom K, Elmerich C (2009) Nitrogen-fixing bacteria associated with leguminous and non-leguminous plants. Plant and Soil 321:35–59
CAS Google Scholar
Gaby JC, Buckley DH (2012) A comprehensive evaluation of PCR primers to amplify the nifH gene of nitrogenase. PLoS One 7:e42149
CAS PubMed Central PubMed Google Scholar
Galardini M et al (2011) Exploring the symbiotic pangenome of the nitrogen-fixing bacterium Sinorhizobium meliloti. BMC Genomics 12:235
Google Scholar
Garau G, Di Guilmi AM, Hall BG (2005) Structure-based phylogeny of the metallo-beta-lactamases. Antimicrob Agents Chemother 49:2778–2784
CAS PubMed Central PubMed Google Scholar
Gasteiger E, Gattiker A, Hoogland C, Ivanyi I, Appel RD, Bairoch A (2003) ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31:3784–3788
CAS PubMed Central PubMed Google Scholar
Georgiadis MM, Komiya H, Chakrabarti P, Woo D, Kornuc JJ, Rees DC (1992) Crystallographic structure of the nitrogenase iron protein from Azotobacter vinelandii. Science 257:1653–1659
CAS PubMed Google Scholar
Geurts R, Bisseling T (2002) Rhizobium Nod factor perception and signalling. Plant Cell Online 14:S239–S249
CAS Google Scholar
Gonzalez V et al (2006) The partitioned Rhizobium etli genome: genetic and metabolic redundancy in seven interacting replicons. Proc Natl Acad Sci U S A 103:3834–3839
PubMed Central PubMed Google Scholar
Gribaldo S, Casane D, Lopez P, Philippe H (2003) Functional divergence prediction from evolutionary analysis: a case study of vertebrate hemoglobin. Mol Biol Evol 20:1754–1759
CAS PubMed Google Scholar
Grottesi A, Ceruso MA, Colosimo A, Di Nola A (2002) Molecular dynamics study of a hyperthermophilic and a mesophilic rubredoxin. Proteins 46:287–294
CAS PubMed Google Scholar
Gtari M, Ghodhbane-Gtari F, Nouioui I, Beauchemin N, Tisa LS (2012) Phylogenetic perspectives of nitrogen-fixing actinobacteria. Arch Microbiol 194:3–11
CAS PubMed Google Scholar
Gu X (1999) Statistical methods for testing functional divergence after gene duplication. Mol Biol Evol 16:1664–1674
CAS PubMed Google Scholar
Guldner E, Desmarais E, Galtier N, Godelle B (2004) Molecular evolution of plant haemoglobin: two haemoglobin genes in nymphaeaceae Euryale ferox. J Evol Biol 17:48–54
CAS PubMed Google Scholar
Hartmann LS, Barnum SR (2010) Inferring the evolutionary history of Mo-dependent nitrogen fixation from phylogenetic studies of nifK and nifDK. J Mol Evol 71:70–85
CAS PubMed Google Scholar
Haukka K, Lindstrom K, Young JP (1998) Three phylogenetic groups of nodA and nifH genes in Sinorhizobium and Mesorhizobium isolates from leguminous trees growing in Africa and Latin America. Appl Environ Microbiol 64:419–426
CAS PubMed Central PubMed Google Scholar
Hennecke H, Kaluza K, Thöny B, Fuhrmann M, Ludwig W, Stackebrandt E (1985) Concurrent evolution of nitrogenase genes and 16S rRNA in Rhizobium species and other nitrogen fixing bacteria. Arch Microbiol 142:342–348
CAS Google Scholar
Henson BJ, Hesselbrock SM, Watson LE, Barnum SR (2004a) Molecular phylogeny of the heterocystous cyanobacteria (subsections IV and V) based on nifD. Int J Syst Evol Microbiol 54:493–497
CAS PubMed Google Scholar
Henson BJ, Watson LE, Barnum SR (2004b) The evolutionary history of nitrogen fixation, as assessed by NifD. J Mol Evol 58:390–399
CAS PubMed Google Scholar
Hiller NL et al (2007) Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: insights into the pneumococcal supragenome. J Bacteriol 189:8186–8195
CAS PubMed Central PubMed Google Scholar
Hiltner L (1898) Über Entstehung und physiologische Bedeutung der Wurzelknöllchen. Forst Naturwiss Z 7:415–423
CAS Google Scholar
Hirsch AM, McKhann HI, Reddy A, Liao J, Fang Y, Marshall CR (1995) Assessing horizontal transfer of nifHDK genes in eubacteria: nucleotide sequence of nifK from Frankia strain HFPCcI3. Mol Biol Evol 12:16–27
CAS PubMed Google Scholar
Hoffman BM, Dean DR, Seefeldt LC (2009) Climbing nitrogenase: toward a mechanism of enzymatic nitrogen fixation. Acc Chem Res 42:609–619
CAS PubMed Central PubMed Google Scholar
Hogeweg P (2011) The roots of bioinformatics in theoretical biology. PLoS Comput Biol 7:e1002021
CAS PubMed Central PubMed Google Scholar
Hogg JS et al (2007) Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains. Genome Biol 8:R103
PubMed Central PubMed Google Scholar
Hubbard TJ, Blundell TL (1987) Comparison of solvent-inaccessible cores of homologous proteins: definitions useful for protein modelling. Protein Eng 1:159–171
CAS PubMed Google Scholar
Hurek T, Egener T, Reinhold-Hurek B (1997) Divergence in nitrogenases of Azoarcus spp., Proteobacteria of the beta subclass. J Bacteriol 179:4172–4178
CAS PubMed Central PubMed Google Scholar
Huss-Danell K (1997) Actinorhizal symbioses and their N2 fixation. New Phytol 136:375–405
CAS Google Scholar
Ikemura T (1985) Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol 2:13–34
CAS PubMed Google Scholar
Imperial J, Ugalde RA, Shah VK, Brill WJ (1984) Role of the nifQ gene product in the incorporation of molybdenum into nitrogenase in Klebsiella pneumoniae. J Bacteriol 158:187–194
CAS PubMed Central PubMed Google Scholar
Irisarri P, Gonnet S, Monza J (2001) Cyanobacteria in Uruguayan rice fields: diversity, nitrogen fixing ability and tolerance to herbicides and combined nitrogen. J Biotechnol 91:95–103
CAS PubMed Google Scholar
Johnson MS, Srinivasan N, Sowdhamini R, Blundell TL (1994) Knowledge-based protein modeling. Crit Rev Biochem Mol Biol 29:1–68
CAS PubMed Google Scholar
Kaneko T et al (2000) Complete genome structure of the nitrogen-fixing symbiotic bacterium Mesorhizobium loti (supplement). DNA Res 7:381–406
CAS PubMed Google Scholar
Kessler PS, McLarnan J, Leigh JA (1997) Nitrogenase phylogeny and the molybdenum dependence of nitrogen fixation in Methanococcus maripaludis. J Bacteriol 179:541–543
CAS PubMed Central PubMed Google Scholar
Kim J, Rees DC (1992) Structural models for the metal centers in the nitrogenase molybdenum-iron protein. Science 257:1677–1682
CAS PubMed Google Scholar
Kim J, Woo D, Rees DC (1993) X-ray crystal structure of the nitrogenase molybdenum-iron protein from Clostridium pasteurianum at 3.0-A resolution. Biochemistry 32:7104–7115
CAS PubMed Google Scholar
Klipp W (2004) Genetics and regulation of nitrogen fixation in free-living bacteria. Kluwer Academic, Dordrecht/Boston
Google Scholar
Kuhn B, Jacobsen W, Christians U, Benet LZ, Kollman PA (2001) Metabolism of sirolimus and its derivative everolimus by cytochrome P450 3A4: insights from docking, molecular dynamics, and quantum chemical calculations. J Med Chem 44:2027–2034
CAS PubMed Google Scholar
Latysheva N, Junker VL, Palmer WJ, Codd GA, Barker D (2012) The evolution of nitrogen fixation in cyanobacteria. Bioinformatics 28(5):603–606
Google Scholar
Leigh GJ (2002) Nitrogen fixation at the millennium. Elsevier, Amsterdam/London
Google Scholar
Lery LM, Bitar M, Costa MG, Rossle SC, Bisch PM (2010) Unraveling the molecular mechanisms of nitrogenase conformational protection against oxygen in diazotrophic bacteria. BMC Genomics 11(Suppl 5):S7
CAS PubMed Central PubMed Google Scholar
Levitt M, Sharon R (1988) Accurate simulation of protein dynamics in solution. Proc Natl Acad Sci U S A 85:7557–7561
CAS PubMed Central PubMed Google Scholar
Li W, Liu B, Yu L, Feng D, Wang H, Wang J (2009a) Phylogenetic analysis, structural evolution and functional divergence of the 12-oxo-phytodienoate acid reductase gene family in plants. BMC Evol Biol 9:90
PubMed Central PubMed Google Scholar
Li YD et al (2009b) The rapid evolution of signal peptides is mainly caused by relaxed selection on non-synonymous and synonymous sites. Gene 436:8–11
CAS PubMed Google Scholar
Long SR (1989) Rhizobium-legume nodulation: life together in the underground. Cell 56:203–214
CAS PubMed Google Scholar
Long SR (2001) Genes and signals in the Rhizobium-legume symbiosis. Plant Physiol 125:69–72
CAS PubMed Central PubMed Google Scholar
MacLean AM, Finan TM, Sadowsky MJ (2007) Genomes of the symbiotic nitrogen-fixing of Bacteria legumes. Plant Physiol 144:615–622
CAS PubMed Central PubMed Google Scholar
Mahe F, Markova D, Pasquet R, Misset MT, Ainouche A (2011) Isolation, phylogeny and evolution of the SymRK gene in the legume genus Lupinus L. Mol Phylogenet Evol 60:49–61
PubMed Google Scholar
Mao C, Qiu J, Wang C, Charles TC, Sobral BW (2005) NodMutDB: a database for genes and mutants involved in symbiosis. Bioinformatics 21:2927–2929
CAS PubMed Google Scholar
Marti-Renom MA, Stuart AC, Fiser A, Sanchez R, Melo F, Sali A (2000) Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29:291–325
CAS PubMed Google Scholar
Mathur M, Tuli R (1991) Analysis of codon usage in genes for nitrogen fixation from phylogenetically diverse diazotrophs. J Mol Evol 32:364–373
CAS Google Scholar
Mayer SM, Lawson DM, Gormal CA, Roe SM, Smith BE (1999) New insights into structure-function relationships in nitrogenase: a 1.6 A resolution X-ray crystallographic study of Klebsiella pneumoniae MoFe-protein. J Mol Biol 292:871–891
CAS PubMed Google Scholar
McCammon JA, Gelin BR, Karplus M (1977) Dynamics of folded proteins. Nature 267:585–590
CAS PubMed Google Scholar
Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R (2005) The microbial pan-genome. Curr Opin Genet Dev 15:589–594
CAS PubMed Google Scholar
Meuwly M, Karplus M (2004) Theoretical investigations on Azotobacter vinelandii ferredoxin I: effects of electron transfer on protein dynamics. Biophys J 86:1987–2007
CAS PubMed Central PubMed Google Scholar
Milenkovic VM, Brockmann M, Stohr H, Weber BH, Strauss O (2010) Evolution and functional divergence of the anoctamin family of membrane proteins. BMC Evol Biol 10:319
PubMed Central PubMed Google Scholar
Mondal UK, Das B, Ghosh TC, Sen A, Bothra AK (2008) Nucleotide triplet based molecular phylogeny of class I and class II aminoacyl t-RNA synthetase in three domain of life process: bacteria, archaea, and eukarya. J Biomol Struct Dyn 26:321–328
CAS PubMed Google Scholar
Mondal UK, Sen A, Bothra AK (2011) Characterization of pathogenic genes through condensed matrix method, case study through bacterial Zeta toxin. Int J Genet Eng Biotechnol 2:109–114
Google Scholar
Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540
CAS PubMed Google Scholar
Mylona P, Pawlowski K, Bisseling T (1995) Symbiotic nitrogen fixation. Plant Cell 7:869–885
CAS PubMed Central PubMed Google Scholar
Nadler SA (1995) Advantages and disadvantages of molecular phylogenetics: a case study of ascaridoid nematodes. J Nematol 27:423–432
CAS PubMed Central PubMed Google Scholar
Normand P, Bousquet J (1989) Phylogeny of nitrogenase sequences in Frankia and other nitrogen-fixing microorganisms. J Mol Evol 29:436–447
CAS PubMed Google Scholar
Normand P, Gouy M, Cournoyer B, Simonet P (1992) Nucleotide sequence of nifD from Frankia alni strain ArI3: phylogenetic inferences. Mol Biol Evol 9:495–506
CAS PubMed Google Scholar
Normand P et al (2007) Genome characteristics of facultatively symbiotic Frankia sp. strains reflect host range and host plant biogeography. Genome Res 17:7–15
PubMed Central PubMed Google Scholar
Norrby PO, Brandt P (2001) Deriving force field parameters for coordination complexes. Coord Chem Rev 212:79–109
CAS Google Scholar
O’Carroll IP, Dos Santos PC (2011) Genomic analysis of nitrogen fixation. Methods Mol Biol 766:49–65
PubMed Google Scholar
Parker MA, Lafay B, Burdon JJ, van Berkum P (2002) Conflicting phylogeographic patterns in rRNA and nifD indicate regionally restricted gene transfer in Bradyrhizobium. Microbiology 148:2557–2565
CAS PubMed Google Scholar
Paul W, Merrick M (1989) The roles of the nifW, nifZ and nifM genes of Klebsiella pneumoniae in nitrogenase biosynthesis. Eur J Biochem 178:675–682
CAS PubMed Google Scholar
Pawlowski K (2009) Prokaryotic symbionts in plants. Springer, Berlin/Heidelberg
Google Scholar
Pazos F, Valencia A (2001) Similarity of phylogenetic trees as indicator of protein-protein interaction. Protein Eng 14:609–614
CAS PubMed Google Scholar
Peden JF (1999) Analysis of codon usage, PhD thesis, University of Nottingham
Google Scholar
Pedrosa FO, Hungria M, Yates G, Newton WE (eds) (2000) Nitrogen fixation: from molecules to crop productivity. Kluwer Academic Publishers, Dordrecht
Google Scholar
Peoples MB, Craswell ET (1992) Biological nitrogen fixation: investments, expectations and actual contributions to agriculture. Plant and Soil 141:13–39
CAS Google Scholar
Peoples MB, Herridge DF, Ladha JK (1995) Biological nitrogen fixation: an efficient source of nitrogen for sustainable agricultural production? Plant and Soil 174:3–28
CAS Google Scholar
Peralta H, Guerrero G, Aguilar A, Mora J (2011) Sequence variability of Rhizobiales orthologs and relationship with physico-chemical characteristics of proteins. Biol Direct 6:48
CAS PubMed Central PubMed Google Scholar
Perret X, Freiberg C, Rosenthal A, Broughton WJ, Fellay R (1999) High-resolution transcriptional analysis of the symbiotic plasmid of Rhizobium sp. NGR234. Mol Microbiol 32:415–425
CAS PubMed Google Scholar
Perret X, Staehelin C, Broughton WJ (2000) Molecular basis of symbiotic promiscuity. Microbiol Mol Biol Rev 64:180–201
CAS PubMed Central PubMed Google Scholar
Peters JW, Fisher K, Dean DR (1995) Nitrogenase structure and function: a biochemical-genetic perspective. Annu Rev Microbiol 49:335–366
CAS PubMed Google Scholar
Pommer EH (1959) Ueber die Isolierung des Endophyten aus den Wurzelknollchen Alnus glutinosa Gaertn. und iiber erfolgreiche Re-infektionsversuche. Ber Dtsch Bot Ges 72:138–150
Google Scholar
Postgate JR (1982) The fundamentals of nitrogen fixation. Cambridge University Press, Cambridge
Google Scholar
Postgate JR (1998) Nitrogen fixation, 3rd edn. Cambridge University Press, Cambridge/New York
Google Scholar
Postgate JR, Eady RR (1988) The evolution of biological nitrogen fixation. In: Bother H et al (eds) Nitrogen fixation: hundred years after. Proceedings of the 7th international congress on nitrogen fixation. Gustav Fischer, Stuttgart, pp 31–40
Google Scholar
Puhler A, Arlat M, Becker A, Göttfert M, Morrissey JP, O’Gara F (2004) What can bacterial genome research teach us about bacteria-plant interactions? Curr Opin Plant Biol 7:137–147
PubMed Google Scholar
Qi J, Wang B, Hao BI (2004) Whole proteome prokaryote phylogeny without sequence alignment: a K-string composition approach. J Mol Evol 58:1–11
CAS PubMed Google Scholar
Qian J, Kwon SW, Parker MA (2003) rRNA and nifD phylogeny of Bradyrhizobium from sites across the Pacific Basin. FEMS Microbiol Lett 219:159–165
CAS PubMed Google Scholar
Quackenbush J, Liang F, Holt I, Pertea G, Upton J (2000) The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res 28:141–145
CAS PubMed Central PubMed Google Scholar
Raes J, Van de Peer Y (2003) Gene duplication, the evolution of novel gene functions, and detecting functional divergence of duplicates in silico. Appl Bioinformatics 2:91–101
CAS PubMed Google Scholar
Rai AN, Bergman B, Rasmussen U (2002) Cyanobacteria in symbiosis. Kluwer Academic, Dordrecht/Boston
Google Scholar
Ramseier TM, Gottfert M (1991) Codon usage and G + C content in Bradyrhizobium japonicum genes are not uniform. Arch Microbiol 156:270–276
CAS PubMed Google Scholar
Randic M, Guo X, Basak SC (2001) On the characterization of DNA primary sequences by triplet of nucleic acid bases. J Chem Inf Comput Sci 41:619–626
CAS PubMed Google Scholar
Rasko DA et al (2008) The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates. J Bacteriol 190:6881–6893
CAS PubMed Central PubMed Google Scholar
Raymond J, Siefert JL, Staples CR, Blankenship RE (2004) The natural history of nitrogen fixation. Mol Biol Evol 21:541–554
CAS PubMed Google Scholar
Rees DC, Howard JB (2000) Nitrogenase: standing at the crossroads. Curr Opin Chem Biol 4:559–566
CAS PubMed Google Scholar
Rees DC et al (2005) Structural basis of biological nitrogen fixation. Philos Transact A Math Phys Eng Sci 363:971–984, discussion 1035–1040
CAS Google Scholar
Roberts GP, MacNeil T, MacNeil D, Brill WJ (1978) Regulation and characterization of protein products coded by the nif (nitrogen fixation) genes of Klebsiella pneumoniae. J Bacteriol 136:267–279
CAS PubMed Central PubMed Google Scholar
Robson RL, Eady RR, Richardson TH, Miller RW, Hawkins M, Postgate JR (1986) The alternative nitrogenase of Azotobacter chroococcum is a vanadium enzyme. Nature 322:388–390
CAS Google Scholar
Rubio LM, Ludden PW (2005) Maturation of nitrogenase: a biochemical puzzle. J Bacteriol 187:405–414
CAS PubMed Central PubMed Google Scholar
Rubio LM, Ludden PW (2008) Biosynthesis of the iron-molybdenum cofactor of nitrogenase. Annu Rev Microbiol 62:93–111
CAS PubMed Google Scholar
Russell RB, Sternberg MJ (1996) A novel binding site in catalase is suggested by structural similarity to the calycin superfamily. Protein Eng 9:107–111
CAS PubMed Google Scholar
Sali A (1995) Modeling mutations and homologous proteins. Curr Opin Biotechnol 6:437–451
CAS PubMed Google Scholar
Schlegel HG, Zaborosch C (2003) General microbiology. Cambridge University Press, Cambridge
Google Scholar
Schuler GD (1997) Sequence mapping by electronic PCR. Genome Res 7:541–550
CAS PubMed Central PubMed Google Scholar
Schultze M, Kondorosi A (1998) Regulation of symbiotic root nodule development. Annu Rev Genet 32:33–57
CAS PubMed Google Scholar
Sen S, Peters JW (2006) The thermal adaptation of the nitrogenase Fe protein from thermophilic Methanobacter thermoautotrophicus. Proteins 62:450–460
CAS PubMed Google Scholar
Sen A, Sur S, Bothra AK, Benson DR, Normand P, Tisa LS (2008) The implication of life style on codon usage patterns and predicted highly expressed genes for three Frankia genomes. Antonie Van Leeuwenhoek 93:335–346
CAS PubMed Google Scholar
Sen A, Sur S, Tisa L, Bothra A, Thakur S, Mondal U (2010) Homology modelling of the Frankia nitrogenase iron protein. Symbiosis 50:37–44
CAS Google Scholar
Sen A, Thakur S, Bothra AK, Sur S, Tisa LS (2012) Identification of TTA codon containing genes in Frankia and exploration of the role of tRNA in regulating these genes. Arch Microbiol 194:35–45
CAS PubMed Google Scholar
Sharp PM, Li WH (1987) The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias. Mol Biol Evol 4:222–230
CAS PubMed Google Scholar
Sharp PM, Tuohy TM, Mosurski KR (1986) Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 14:5125–5143
CAS PubMed Central PubMed Google Scholar
Silver WS, Postgate JR (1973) Evolution of asymbiotic nitrogen fixation. J Theor Biol 40:1–10
CAS PubMed Google Scholar
Sims GE, Jun SR, Wu GA, Kim SH (2009) Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc Natl Acad Sci U S A 106:2677–2682
CAS PubMed Central PubMed Google Scholar
Smit AM, Strabala TJ, Peng L, Rawson P, Lloyd-Jones G, Jordan TW (2012) Proteomic phenotyping of Novosphingobium nitrogenifigens reveals a robust capacity for simultaneous nitrogen fixation, polyhydroxyalkanoate production, and resistance to reactive oxygen species. Appl Environ Microbiol 78:4802–4815
CAS PubMed Central PubMed Google Scholar
Snipen L, Ussery DW (2010) Standard operating procedure for computing pangenome trees. Stand Genomic Sci 2:135–141
PubMed Central PubMed Google Scholar
Stewart WD (1969) Biological and ecological aspects of nitrogen fixation by free-living micro-organisms. Proc R Soc Lond B Biol Sci 172:367–388
CAS PubMed Google Scholar
Sundaram S, Tripathi A, Gupta V (2010) Structure prediction and molecular simulation of gases diffusion pathways in hydrogenase. Bioinformation 5:177–183
PubMed Central PubMed Google Scholar
Sur S, Pal A, Bothra AK, Sen A (2005) Moderate codon bias attributed to translational selection in nitrogen fixing genes of Bradyrhizobium japonicum USDA110. Bioinformatics Ind 3:59–64
CAS Google Scholar
Sur S, Bhattacharya M, Bothra AK, Tisa LS, Sen A (2008) Bioinformatic analysis of codon usage patterns in a free-living diazotroph, Azotobacter vinelandii. Biotechnology 7:242–249
CAS Google Scholar
Sur S, Bothra AK, Ghosh TC, Sen A (2010) Investigation of the molecular evolution of nitrogen fixation using nucleotide triplet based condensed matrix method. Int J Integrative Biol 10:29–65
Google Scholar
Tatusov RL, Galperin MY, Natale DA, Koonin EV (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res 28:33–36
CAS PubMed Central PubMed Google Scholar
Tettelin H et al (2005) Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome”. Proc Natl Acad Sci U S A 102:13950–13955
CAS PubMed Central PubMed Google Scholar
Thakur S, Bothra AK, Sen A (2012) In silico studies of NifH protein structure and its post-translational modification in Bradyrhizobium sp. ORS278. Int J Pharm Bio Sci 3:B22–B32
Google Scholar
Thorneley RNF, Lowe DJ (1985) Kinetics and mechanism of the nitrogenase enzyme system. In: Spiro TG (ed) Molybdenum enzymes. Wiley, New York, pp 221–284
Google Scholar
Torrey JG, Tjepkema JD (1979) Symbiotic nitrogen fixation in actinomycete-nodulated plants: Preface. Bot Gaz 140(suppl):i–ii
Google Scholar
Von Bulow JF, Dobereiner J (1975) Potential for nitrogen fixation in maize genotypes in Brazil. Proc Natl Acad Sci U S A 72:2389–2393
Google Scholar
Wani SP, Rupela OP, Lee KK (1995) Sustainable agriculture in the semi-arid tropics through biological nitrogen fixation in grain legumes. Plant and Soil 174:29–49
CAS Google Scholar
Watanabe I, Liu CC (1992) Improving nitrogen-fixing systems and integrating them into sustainable rice farming. Plant and Soil 141:57–67
CAS Google Scholar
Wright F (1990) The ‘effective number of codons’ used in a gene. Gene 87:23–29
CAS PubMed Google Scholar
Xie J et al (2011) GeoChip-based analysis of the functional gene diversity and metabolic potential of microbial communities in acid mine drainage. Appl Environ Microbiol 77:991–999
CAS PubMed Central PubMed Google Scholar
Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556
CAS PubMed Google Scholar
Yang GP et al (1999) Structure of the Mesorhizobium huakuii and Rhizobium galegae Nod factors: a cluster of phylogenetically related legumes are nodulated by rhizobia producing Nod factors with alpha, beta-unsaturated N-acyl substitutions. Mol Microbiol 34:227–237
CAS PubMed Google Scholar
Yi J (2009) The Medicago truncatula genome and analysis of nodule-specific genes, PhD thesis, The University of Oklahoma
Google Scholar
Young JPW (1992) Phylogenetic classification of nitrogen-fixing organisms. In: Stacey G, Burris RH, Evans HJ (eds) Biological nitrogen fixation. Chapman & Hall, New York, pp 43–86
Google Scholar
Young J (2005) The phylogeny and evolution of nitrogenases. In: Palacios R, Newton WE (eds) Genomes and genomics of nitrogen-fixing organisms. Springer, Dordrecht, pp 221–241
Google Scholar
Young JP et al (2006) The genome of Rhizobium leguminosarum has recognizable core and accessory components. Genome Biol 7:R34
PubMed Central PubMed Google Scholar
Zehr JP, Harris D, Dominic B, Salerno J (1997a) Structural analysis of the Trichodesmium nitrogenase iron protein: implications for aerobic nitrogen fixation activity. FEMS Microbiol Lett 153:303–309
CAS PubMed Google Scholar
Zehr JP, Mellon MT, Hiorns WD (1997b) Phylogeny of cyanobacterial nifH genes: evolutionary implications and potential applications to natural assemblages. Microbiology 143(Pt 4):1443–1450
CAS PubMed Google Scholar
Zhao H, Li M, Fang K, Chen W, Wang J (2012) In Silico insights into the symbiotic nitrogen fixation in Sinorhizobium meliloti via metabolic reconstruction. PLoS One 7:e31287
CAS PubMed Central PubMed Google Scholar
Zheng L, White RH, Dean DR (1997) Purification of the Azotobacter vinelandii nifV-encoded homocitrate synthase. J Bacteriol 179:5963–5966
CAS PubMed Central PubMed Google Scholar
Zheng Y, Xu D, Gu X (2007) Functional divergence after gene duplication and sequence-structure relationship: a case study of G-protein alpha subunits. J Exp Zool B Mol Dev Evol 308:85–96
PubMed Google Scholar

Acknowledgments

AS is grateful to the DBT, Government of India, for the CREST Award and for financial help in setting up the Bioinformatics Centre in the Department of Botany, University of North Bengal. ST acknowledges CSIR for the CSIR-SRF Fellowship.

Author information

Authors and Affiliations

NBU Bioinformatics Facility, Department of Botany, University of North Bengal, Siliguri, 734013, India
Subarna Thakur & Arnab Sen
Bioinformatics Chemoinformatics Laboratory, Department of Chemistry, Raiganj College, Raiganj, India
Asim K. Bothra

Corresponding author

Correspondence to Arnab Sen.

Editor information

Editors and Affiliations

Department of Genetics, Osmania University, Hyderabad, Andhra Pradesh, India
Kavi Kishor P.B.
Department of Biotechnology, Birla Institute of Technology, Ranchi, Jharkhand, India
Rajib Bandopadhyay
Bioclues Organization, IKP Knowledge Park, Picket, Secunderabad, Andhra Pradesh, India
Prashanth Suravajhala

About this chapter

Cite this chapter

Thakur, S., Bothra, A.K., Sen, A. (2014). Exploring the Genomes of Symbiotic Diazotrophs with Relevance to Biological Nitrogen Fixation. In: P.B., K., Bandopadhyay, R., Suravajhala, P. (eds) Agricultural Bioinformatics. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1880-7_14

DOI: https://doi.org/10.1007/978-81-322-1880-7_14
Published: 30 May 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1879-1
Online ISBN: 978-81-322-1880-7
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
