RNA World Hypothesis

An RNA World that predated the modern world of polypeptide and polynucleotide is a widely accepted model for the origin of life on earth (Bernhardt 2012; Cech 2009; Crick 1968; Gilbert 1986; Higgs and Lehman 2015; Neveu et al. 2013; Orgel 1968; Rich 1962; Robertson and Joyce 2012). The RNA World Hypothesis is actually a group of related models, with a variety of assumptions and definitions. In all variations of the RNA World Hypothesis, RNA enzymes (ribozymes) predate protein enzymes. Ribozymes performed a variety of catalytic functions in the RNA World, from metabolite biosynthesis to energy conversion (Fig. 1).

Fig. 1

Timeline of the RNA World. In the RNA World Hypothesis, life on earth passed through a phase in which chemical transformations were catalyzed and regulated by RNA, and RNA-based genetic material was replicated by a ribozyme polymerase. The ribosome and other components of the translation system were absent from the first phase of Darwinian evolution. Biology underwent a Polymer Transition, and entered a second phase, adopting coded protein as the primary enzymatic biopolymer. The origins and evolution of the ribosome mark the boundary between the two limiting phases of biology. During and after the Polymer Transition, core ribozymes of the RNA World went extinct and were washed out of the phylogenetic record

The defining ribozyme of the RNA World, which unites all RNA World models, performed template-directed synthesis of RNA: in the RNA World, RNA self-replicated. Life on earth is biphasic under RNA World scenarios (Fig. 1). The origin and evolution of the ribosome mark the boundary between the two phases. RNA World models are attractive because they appear conceptually simple, facilitating specific predictions that can be tested in the laboratory or by data mining.

Support for an RNA World

The RNA World Hypothesis is consistent with the observed ability of RNA to both store genetic information (Ada and Perry 1954; Chao and Schachman 1956) and catalyze chemical reactions (Guerrier-Takada et al. 1983; Kruger et al. 1982). Although RNA in extant biology is seen to catalyze only RNA cutting and ligation along with peptidyl transfer (within the ribosome), a wide variety of chemical transformations can be catalyzed by ribozymes selected in vitro (Cech 2002; Hiller and Strobel 2011; Robertson and Ellington 2000; Sczepanski and Joyce 2014; Seelig and Jaschke 1999; Silverman and Begley 2007). Sustained experimental efforts have attempted to show that RNA is capable of self-replication (Attwater et al. 2013; Sczepanski and Joyce 2014; Shechner and Bartel 2011; Vaidya et al. 2012; Wochner et al. 2011). Lehman has shown that mixtures of RNA fragments that self-assemble into self-replicating RNAs can form cooperative catalytic cycles and networks (Vaidya et al. 2012). It has further been proposed that a simple protocell can encapsulate self-replicating RNAs (Szostak et al. 2001).

The catalytic competence of RNAs may have been greater on the early earth than on extant earth. For the first 1.5 billion years of life, RNA inhabited an anoxic earth with abundant Fe2+ (Anbar 2008; Hazen and Ferry 2010). Although Mg2+ is essential for extant RNA folding and catalysis, we hypothesized that Fe2+ was an RNA cofactor when iron was abundant and benign (in the absence of O2), but was replaced by Mg2+ during a period known as the Great Oxidation, which was brought on by biological photosynthesis (Athavale et al. 2012; Hsiao et al. 2013a). We demonstrated that reversing this putative metal substitution in an anoxic environment, by replacing Mg2+ with Fe2+, expands the catalytic repertoire of RNA. Fe2+ confers catalytic function on ancient RNAs, including ribosomal RNAs (rRNAs).

Molecular Fossils

Ribozymes that catalyze fundamental reactions in extant biology are thought to be molecular fossils from an RNA World. The discovery that the ribosome is a ribozyme (Ban et al. 2000; Khaitovich et al. 1999) has been taken as support for the RNA World Hypothesis. Many critical processes of extant biology depend on small RNA precursors or derivatives. Benner and Ellington have argued (Benner et al. 1989) that the ubiquity and universality of these RNA cofactors is consistent with a molecular “palimpsest”, in which an RNA World has been partially effaced by the modern biology of polynucleotide and polypeptide.

The Chicken and the Egg

The RNA World Hypothesis resolves the putative chicken and egg dilemma: which came first, polynucleotide or polypeptide? The simultaneous emergence from whole cloth of two functional biopolymers, one encoding the other, seems improbable. A single type of ancestral biopolymer (polynucleotide), performing multiple roles, appears to be characterized by high parsimony. A “Polymer Transition”, a progression of biology from one polymer type (polynucleotide) to two polymer types (polynucleotide and polypeptide), is consistent with an expectation that ancient biology transitioned from simple to complex.

The Polymer Transition

One essential element of an RNA World Hypothesis is a feasible pathway out of the RNA World, into the extant DNA/RNA/protein World. That is, biology presumably made a Polymer Transition from the RNA World to the current state of biopolymer co-dependence in which (i) polypeptide (protein) enzymes synthesize polynucleotide (RNA and DNA), (ii) polynucleotide enzymes (ribosomes) synthesize polypeptide, and (iii) the vast majority of chemical transformations are catalyzed and regulated by proteins. This transition must have followed the continuity principle, accomplished by numerous, manageable steps, each maintaining fitness. A widely cited proposal for the Polymer Transition was put forth by Poole (Jeffares et al. 1998; Poole et al. 1998); ribozyme-based biology gradually transitioned to extant biology as ribozymes incrementally relinquished catalytic function first to ribonucleoprotein enzymes, then to protein-based enzymes that lack RNA components entirely. These transitions were presumably driven by the general catalytic superiority of proteins over RNA. Cech (2009) has argued for a subtle variation of the Poole model in which early ribozymes interacted with available amino acids and peptides.

The Challenge

Here, we ask if the RNA World Hypothesis is consistent with what is known about extant biological systems, in particular the translation system. It is important to determine, at this time, what the translation system can tell us about the validity of the RNA World Hypothesis. Current data on the ribosome support the importance of RNA and RNA precursors in ancient systems but appear to challenge fundamental precepts of the RNA World Hypothesis. To explain, we start by providing background information on the translation system, and describe current models of its origins, evolution, function, and mechanism.

Universal Biology

The biological world uses three basic information transduction systems: replication (DNA to DNA), transcription (DNA to RNA), and translation (RNA to protein) (Crick 1970). In 1967, Carl Woese looked to the translation system to begin asking some of the deepest questions in biology (Woese 1967). Using translation as a window to peer back in time, Woese and Fox discovered that life on earth has arisen from three primary lineages (Woese and Fox 1977) as shown in Fig. 2.

Fig. 2

The canonical tree of life with three primary lineages: bacteria, archaea, and eukarya. The tree of life is the inheritance pathway of the translation system, based on 16S rRNA sequences. The Last Universal Common Ancestor (LUCA) is indicated

Woese and Fox succeeded in redrawing the long-standing tree of life that had been used by biologists for decades because the translation system has recorded and retained interpretable information on the ancient past. Numerous studies have now confirmed that molecular structures and chemical processes that directed the broad course of life on earth are contained in or imprinted on the translation system.

Translation

Translation is catalyzed by the ribosome. During translation, transfer RNAs (tRNAs) bridge the two ribosomal subunits. In the decoding center of the small ribosomal subunit (SSU), tRNA anticodons interact with mRNA codons. Around 70 Å away, in the peptidyl transferase center (PTC) of the large ribosomal subunit (LSU), a nascent peptide is transferred from the CCA 3′-tail of one tRNA to a cognate amino acid on another. The cognate amino acid is defined by the genetic code, as established by aminoacyl-tRNA synthetases, and sensed within the SSU. The nascent polypeptide passes through a long exit tunnel before exiting the LSU. In this way, the ribosome translates genetic information to protein. Both subunits are ribonucleoprotein complexes.

Outsourcing of Specificity

What is special about translation? Unlike replication and transcription, translation must rely upon indirect templating to transfer information. The specificity required to execute the genetic code is outsourced (discussed below), taking place by processes that are spatially remote from peptide bond formation, following a subtle and non-obvious logic.

Unlike replication and transcription, translation transduces information between dissimilar types of molecules (nucleotides and amino acids). This distinction helps explain why translation is far more complex than the other information transduction systems. Nucleotides pair directly with nucleotides during replication and transcription. However, neither nucleotides nor nucleotide triplets can pair with amino acids; the monomers and small oligomers of these dissimilar molecules are incapable of direct inter-species molecular recognition. No general molecular recognition code has been found in extant biology between triplet nucleotides and amino acids.
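
The contrast between direct and indirect templating can be sketched in a few lines of Python. This is an illustrative toy, not a model of the translation machinery: the function names `transcribe` and `translate`, the example template, and the six-codon excerpt of the standard genetic code are assumptions chosen for brevity. Transcription is computed from a base-pairing rule alone, whereas translation needs an external lookup table, which here stands in for the charged tRNAs.

```python
# Toy contrast between direct templating (base pairing) and indirect
# templating (an adaptor lookup). Function names, the example template,
# and the codon subset are illustrative assumptions.

# Direct templating: each DNA template base specifies its RNA partner.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Return the mRNA (5'->3') paired against a DNA template read 3'->5'."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3to5)

# Indirect templating: no pairing rule links codons to amino acids, so the
# mapping must be stored in an adaptor table (the role of charged tRNAs).
# This is a small excerpt of the standard genetic code.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "GAA": "Glu", "AAA": "Lys", "UAA": "Stop",
}

def translate(mrna: str) -> list:
    """Decode an mRNA codon by codon until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

mrna = transcribe("TACAAACCGCTTTTTATT")
print(mrna)
print(translate(mrna))
```

Removing `DNA_TO_RNA_PAIR` would still leave a rule that could be rederived from base-pairing chemistry. Removing `CODON_TABLE` leaves nothing from which the codon-to-amino-acid assignments could be recovered, which is the sense in which specificity is outsourced.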

The Centrality of Translation in the Life of a Cell

Translation consumes vast resources. In rapidly growing S. cerevisiae, 60 % of transcription is devoted to rRNA production (Warner 1999). Fifty per cent of RNA pol II transcription and 90 % of mRNA splicing are devoted to rProtein production. Twenty of the thirty most abundant mRNAs in S. cerevisiae encode rProteins (Velculescu et al. 1997). Each nuclear pore of S. cerevisiae exports a ribosome every 2 s (Warner 1999). Protein synthesis consumes around 25 % of total energy in mature Bos taurus (Caton et al. 2000). In E. coli, translation is regulated by molecular interaction networks that dwarf the networks of transcription, replication or metabolism, in size, integration, and evolutionary conservation (Bu et al. 2003; Butland et al. 2005).

The translation system impacts, either directly or indirectly, essentially all cellular functions and processes (Fig. 3). Extensive dependencies built on this integration help explain why the size of the ribosome is an accurate proxy for organismal complexity (Fig. 4) (Petrov et al. 2014b). Bacteria and Archaea are relatively simple organisms compared to eukaryotes, and so have relatively small ribosomes. At the other end of the spectrum, mammals are arguably the most complex organisms on earth, and are characterized by the largest ribosomes. The ribosomes in Mammalia are regulated in more complex ways, participate in more functions, and interact with more partners than ribosomes of simpler systems.

Fig. 3

a A standard Genome–Phenotype map showing a degenerate relationship between genome and fitness. This figure is adapted from Stadler and Stephens 2003. b The relationship of genome to phenotype is mediated by the translational system (triangle). For illustrative purposes, this panel shows a small sample relationship between genome and phenotype. A mutation can be silent (solid lines) if it is synonymous, and therefore does not change protein sequence, or can be non-synonymous (dashed line) and therefore does change the protein sequence and, potentially, the phenotype. c This panel shows the relationship between genome and phenotype. Some mutations can result in altered translation systems (circle), in the most extreme cases resulting in a non-canonical genetic code. These genomes would give rise to extreme changes in phenotype because all protein sequences would be altered. In these cases fitness would fall to zero. By this mechanism, major changes to the translation system are precluded

Fig. 4

Ribosomal size, but not genome size, is a proxy for complexity. This phylogenetic tree illustrates the explosion of ribosomal size in complex organisms and the lack of correlation of complexity with genome size. Circle radii are proportional to ribosome size (total length of LSU rRNA). The sizes of archaeal and bacterial LSU rRNAs are highly restrained, so they are represented by just one species each. The phylogram was computed using sTOL (Gough et al. 2001) and visualized with iTOL (Letunic and Bork 2011). This figure is adapted from Petrov et al. 2014b

Operating Systems: Computer and Biology

By analogy to modern electronic computers, translation can be considered the operating system (OS) of life. (i) An OS is an essential part of any functioning computer (Stallings 2005). All living systems on earth have functioning translation systems. (ii) A computer OS mediates information flow between users/programs and computer hardware. The biological OS mediates information flow from RNA to protein, arbitrating the expression of genome to phenotype (Fig. 3). (iii) A computer OS is immense and complex, created in pieces with well-defined modules to receive inputs, provide outputs, and execute functions. Translation is performed by massive molecular assemblies with distributed and compartmentalized functions such as tRNA charging, peptidyl transfer and decoding. The translation system interacts directly or indirectly with all cellular processes, and is regulated by integrated molecular interaction networks. (iv) OS bugs and faults (errors) can cause widespread failure of many computer functions because of the dependency of essentially all computer functions on the OS. All protein production is dependent on proper function of the translation system. Drugs that cause even mild perturbation of ribosome function are lethal. A mutation that causes a significant change in the translation system, such as an alteration in the genetic code, would alter all protein in a cell and would profoundly impact all cellular structures and functions (Fig. 3). (v) A working OS can be highly resistant to change. It is not possible to change from one OS to another while a computer is running. It can be difficult to make market-wide changes to a broadly used OS because huge numbers of peripherals and programs can be rendered obsolete. Similarly, the dependency of all biological functions on the translation system imposes severe constraints on allowable changes.

The translation system is the most conserved element of biological systems. Altering the core structure and function of the translation system would cause death because all biological systems depend on translation. As stated by Francis Crick, “the code determines … the amino acid sequences of so many highly evolved protein molecules that any change to these would be highly disadvantageous…” (Crick 1968).
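
The argument that a change to the code is far more damaging than a change to a gene can be illustrated with a toy calculation. The sketch below rests on invented inputs: the three short "genes", the helper names, and the hypothetical reassignment of GAA from Glu to Lys are assumptions made for the example, and only the six codon assignments shown are taken from the standard genetic code. It counts how many proteins change after (a) a synonymous point mutation in one gene and (b) a single reassignment in the shared codon table.

```python
# Toy illustration: a gene-level mutation affects at most one protein,
# whereas a change to the shared code affects every protein that uses
# the reassigned codon. Gene sequences and the reassignment are hypothetical.

STANDARD_CODE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "GGC": "Gly", "GAA": "Glu", "AAA": "Lys",
}

def translate(mrna, code):
    """Decode an mRNA, three bases at a time, with the given codon table."""
    return [code[mrna[i:i + 3]] for i in range(0, len(mrna), 3)]

def proteins_changed(genes_before, genes_after, code_before, code_after):
    """Count genes whose encoded protein differs between two conditions."""
    return sum(
        translate(genes_before[name], code_before)
        != translate(genes_after[name], code_after)
        for name in genes_before
    )

genes = {
    "geneA": "AUGUUUGAA",
    "geneB": "AUGGAAGGC",
    "geneC": "AUGAAAGAA",
}

# (a) Synonymous point mutation in geneA: UUU -> UUC (both encode Phe).
mutated = dict(genes, geneA="AUGUUCGAA")
silent = proteins_changed(genes, mutated, STANDARD_CODE, STANDARD_CODE)

# (b) Hypothetical change to the code itself: GAA is now read as Lys.
altered_code = dict(STANDARD_CODE, GAA="Lys")
global_change = proteins_changed(genes, genes, STANDARD_CODE, altered_code)

print(silent, global_change)
```

In this toy genome the point mutation changes no protein, while the single code reassignment changes all three, because every gene happens to use the reassigned codon. A real genome has vastly more genes, so the asymmetry would be correspondingly larger.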

Universality Versus Novel Amendments

The complex, elaborate, and spatially distributed universal biology of translation has been revealed in increasing detail by high-resolution ribosome structures accumulating from all three domains of life (Amunts et al. 2014; Anger et al. 2013; Armache et al. 2010; Ban et al. 2000; Ben-Shem et al. 2010; Berk et al. 2006; Cate et al. 1999; Greber et al. 2014; Harms et al. 2001; Hashem et al. 2013; Jenner et al. 2005, 2012; Klein et al. 2001, 2004; Melnikov et al. 2012; Nissen et al. 2000; Rabl et al. 2011; Selmer et al. 2006; Sharma et al. 2003, 2009; Voss et al. 2006; Wimberly et al. 2000), structures of aminoacyl-tRNA synthetases (Guo and Schimmel 2012), and a massive and ever-expanding sequence database (Quast et al. 2013). Today, we know that translation is a unique province of unrivaled conservation among all branches of life. The translation system retains an interpretable molecular record of biology from before the last universal common ancestor (LUCA, Fig. 2) (Roberts et al. 2008; Woese 2001), and is an excellent guide to the world of primordial macromolecules (Bokov and Steinberg 2009; Hsiao et al. 2009, 2013b; Petrov et al. 2014b).

Rare exceptions to universality are observed in niche systems by minor variations of the canonical genetic code. Codon ‘capture’ and reassignment have been reported in mitochondria and other obligate bacterial symbionts (Knight et al. 2001b; McCutcheon et al. 2009), while mischarging of various tRNAs as methionyl-tRNA is found in representatives from each branch of the tree of life (Jones et al. 2011; Wiltrout et al. 2012). These tRNA transformations (mischarging, post-transcriptional modification, etc.) are protein mediated and do not involve codon–anticodon remodeling or changes in core ribosome structure or function (Knight et al. 2001a). They represent amendments to an otherwise universal genetic code that are entrenched in advanced, compensatory protein-based evolution, reflecting recent adaptations in the DNA/RNA/protein world. Universal biology must be recognized in the context of rare amendments that constitute minor perturbations on the arc of biology over billions of years.

Evolution of the Ribosome

The ribosome was fully functional at LUCA, forming a “common core” (Anger et al. 2013; Hsiao et al. 2009; Koonin 2014; Michot and Bachellerie 1987; Petrov et al. 2014b) that has been handed down to all living organisms. The common core rRNA, reasonably approximated by the rRNA of E. coli, is conserved over the entire phylogenetic tree (Hassouna et al. 1984; Hsiao et al. 2009; Mears et al. 2002; Melnikov et al. 2012; Michot et al. 1990), in sequence, and especially in secondary structure and three-dimensional structure.

The Ribosome

  • Synthesizes all coded protein (Steitz 2008; Trappl and Polacek 2011),

  • Uses a nearly universal code (Khorana 1965; Lu and Freeland 2006),

  • Contains universally conserved molecular structures (Gerbi 1996; Hassouna et al. 1984; Michot et al. 1990), assemblies (Hsiao et al. 2009), biopolymer sequences (Fournier et al. 2010; Wolf and Koonin 2007) and even magnesium ions (Hsiao and Williams 2009),

  • Catalyzes dehydration condensation, the ancient and universally conserved chemical process by which all biopolymers are synthesized (Rodnina et al. 2007; Schmeing and Ramakrishnan 2009; Simonovic and Steitz 2009),

  • Is the most ancient assembly in biology (Fox 2010; Woese 2000),

  • Has increased in size over billions of years by an accretion process that preserves the ancient core (Bokov and Steinberg 2009; Petrov et al. 2014b),

  • Is resistant to horizontal gene transfer and evolutionary change (Olsen and Woese 1993),

  • Is our most accurate proxy of biological complexity (Petrov et al. 2014b) (Fig. 4).

A series of models of LSU evolution are essentially in agreement despite different assumptions and types of input data. Harvey and Gutell compared sequences and secondary structures across multiple species, identifying the RNA components of the “minimal ribosome” (Mears et al. 2002). Fox analyzed the density of molecular interactions and interconnectivities (Fox 2010). Smith and Hartman analyzed the taxonomy of ribosomal proteins, along with their RNA interactions (Smith et al. 2008). Bokov and Steinberg analyzed A-minor interactions (Bokov and Steinberg 2009). Williams and coworkers treated the LSU as a growing onion (Hsiao et al. 2009), then used structural “insertion fingerprints” to infer a fine-grain stepwise building up of the common core (Petrov et al. 2014b). An ancestral PTC was proposed by Yonath and coworkers (Belousoff et al. 2010; Krupkin et al. 2011) based on symmetry considerations. There is a consensus from these models about many aspects of ribosomal origins and evolution.

Accretion of RNA Structure and Function

A consensus of models of LSU evolution suggests that in the early history of life on earth, small rRNAs began a process of growth by accretion (Bokov and Steinberg 2009; Hsiao et al. 2009, 2013b; Petrov et al. 2014b). Peptides began to co-assemble with the rRNA. Building on a primitive PTC, successive RNA expansion elements joined pre-existing rRNA, enlarging it without perturbing the underlying rRNA structure. By LUCA, the accretion process (Fig. 5) had (i) buttressed and elongated the peptide exit tunnel, (ii) added the E site, (iii) added the SSU-LSU interface (Petrov et al. 2014b), (iv) conferred mRNA translocation capability, and (v) added rRNA components that facilitate targeting and translocation of nascent proteins through membranes (Zimmermann et al. 2011).

Fig. 5

The evolution of Helix 25/ES 7 of the LSU rRNA shows serial accretion of rRNA onto a frozen core. This image illustrates at the atomic level how Helix 25 of the LSU rRNA grew from a small stem loop in the common core into a large rRNA domain in metazoans. Each accretion step adds to the previous rRNA core but leaves the core unaltered. Common ancestors are indicated. Pairs of structures are superimposed to illustrate the differences, and to demonstrate how new rRNA accretes with preservation of the ancestral core rRNA. Each structure is experimentally determined by X-ray diffraction or Cryo-EM. This figure is reproduced from Petrov et al. 2014b

The accretion process left an extensive trail of molecular fossils. By way of common ancestry, the human LSU rRNA appears to contain a buried fruit fly rRNA, which in turn contains a buried yeast rRNA, which in turn contains a buried bacterial rRNA, which in turn contains a series of ever more ancient buried pre-LUCA rRNAs. At the core is the PTC, frozen in time for billions of years, with structure and function that are invariant throughout time and throughout the tree of life. The PTC is inherited by all living systems on earth from an ancient biology that was inaugurated before the introduction of coded protein and the development of the genetic code. The PTC and the exit tunnel were relatively mature when the subunit interface was acquired (Bokov and Steinberg 2009; Fox 2010; Petrov et al. 2014b).
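
The nesting described above can be pictured as a chain of supersets. The sketch below is a toy data-structure illustration only: the element labels (such as "ES7a") and the stage names are hypothetical placeholders, not a reconstruction of real rRNA helices. Each accretion step adds new elements and is not allowed to touch existing ones, so every earlier stage remains embedded in every later one.

```python
# Toy model of growth by accretion: each stage adds rRNA elements to the
# previous stage and never alters what is already there.
# All element labels and stage names are hypothetical placeholders.

def accrete(ancestor: frozenset, new_elements: set) -> frozenset:
    """Return a descendant containing the ancestor plus new elements."""
    if ancestor & new_elements:
        raise ValueError("accretion adds new elements; it does not remodel old ones")
    return ancestor | new_elements

core = frozenset({"PTC"})
stage1 = accrete(core, {"exit_tunnel"})
stage2 = accrete(stage1, {"subunit_interface"})
stage3 = accrete(stage2, {"ES7a"})
stage4 = accrete(stage3, {"ES7b", "ES7c"})

stages = [core, stage1, stage2, stage3, stage4]

# Every earlier stage is a subset of the stage that follows it.
nested = all(a <= b for a, b in zip(stages, stages[1:]))
print(nested, sorted(stage4))
```

The stages here stand for successive ancestral states along one lineage. Extant rRNAs descend from common ancestors rather than from each other, so the subset relation applies to the shared ancestral core, not to one living species' rRNA evolving into another's.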

Accretion in the Modern Era

Accretion has been ongoing for more than 3.5 billion years, and even now, within the eukaryotic domain, the ribosome continues to grow by accretion (Figs. 5, 6) (Petrov et al. 2014b). In eukaryotic systems, accretion has added rRNA ‘expansion elements’ (Anger et al. 2013; Ben-Shem et al. 2010; Gerbi 1996; Hashem et al. 2013; Hassouna et al. 1984) that appear to recruit complex eukaryotic initiation, elongation and termination factors. rRNA expansions facilitate protein processing and modification, chaperone-assisted folding, delivery to the endomembrane system and biogenesis. In Mammalia, ribosomes contain immense rRNA polymers of nearly unimaginable structural complexity, with total atomic masses of well over 4,000,000 Daltons (Figs. 5, 6).

Fig. 6

LSU rRNA secondary structures (Petrov et al. 2013, 2014a) from four species of varying complexity. rRNA domains are indicated by color. Secondary and three-dimensional structures are more highly conserved than sequence. By way of common ancestry and the accretion process of rRNA growth, at the level of secondary and three-dimensional structure, there is, roughly, an E. coli rRNA within S. cerevisiae rRNA, a S. cerevisiae rRNA within D. melanogaster rRNA, and a D. melanogaster rRNA within H. sapiens rRNA. These are extant molecules that have evolved from common ancestors, not from each other. These images are available at: http://apollo.chemistry.gatech.edu/RibosomeGallery/

The Evolution of Enzymology

As proposed by Pauling, enzymes (which here include protein enzymes and RNA enzymes) increase reaction rates by stabilizing transition states (Pauling 1946). The free energy of stabilization of a transition state is provided by the folding energy of the macromolecule and the free energy of substrate binding, which in combination organize molecular interactions to complement those of the transition state. A transition state has a fleeting existence of less than a femtosecond and cannot be captured or directly observed by standard chemical means. Highly tuned molecular recognition of transition states, commonly under allosteric control, is a crowning achievement of biological evolution.
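
Pauling's proposal can be stated quantitatively with transition state theory. The relation below is the standard Eyring expression, added here as a hedged illustration. The symbols are not used in the original text: $k$ is a rate constant, $\Delta G^{\ddagger}$ is the activation free energy, the subscripts cat and uncat denote the catalyzed and uncatalyzed reactions, and the transmission coefficient is assumed to be one.

```latex
k = \frac{k_B T}{h}\, e^{-\Delta G^{\ddagger}/RT}
\qquad\Longrightarrow\qquad
\frac{k_{\mathrm{cat}}}{k_{\mathrm{uncat}}}
  = \exp\!\left(
      \frac{\Delta G^{\ddagger}_{\mathrm{uncat}} - \Delta G^{\ddagger}_{\mathrm{cat}}}{RT}
    \right)
```

At 298 K, $RT \ln 10 \approx 5.7$ kJ/mol (about 1.4 kcal/mol), so under these assumptions each 5.7 kJ/mol by which an enzyme lowers the activation free energy corresponds to roughly a tenfold increase in rate.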

Water In—Water Out

Like all biological polymers, proteins are synthesized by condensation–dehydration reactions (Fig. 7), an ancient type of transformation that predates biology. Although condensation–dehydration occurs in the PTC, specificity and regulation are achieved by a unique and elaborate system that is spatially distributed and distinct from the PTC. Specificity and regulation are distributed among tRNAs, mRNAs, aminoacyl-tRNA synthetases (AARSs), translation factors, and the SSU. The PTC is a conspicuously docile player in the execution of coded translation. The delegation of regulation and specificity to spatially remote components of the translation system reveals significant information about the evolutionary history of the LSU. We believe the distributed nature of regulation and specificity is a hallmark of the primitive origins of the PTC.

Fig. 7

Biopolymer synthesis. Biological macromolecules are built by condensation–dehydration reactions. Net reactions in the synthesis of a polypeptide, b polynucleotide, and c polysaccharide

To illustrate how translation distributes specificity in space, we compare and contrast ribosome-catalyzed amino acid polymerization with enzyme-catalyzed polypeptide depolymerization (i.e., peptide bond hydrolysis). The comparison is useful because an enzyme catalyzes the forward and reverse processes equally. The translation machinery and a serine protease stabilize the same transition states, and so on a fundamental chemical level the translation machinery is a protease acting in reverse. However, the contrasts between the ancient ribosome and the (relatively) modern serine proteases, both in the mechanisms of transition state stabilization and in regulation and control, are striking.

Breaking Peptide Bonds

Proteases are highly sophisticated enzymes that use tuned molecular interactions to specifically stabilize transition states. Proteases contain ‘cryptate-like’ networks of molecular interactions (Robertus et al. 1972; Warshel et al. 1989) that are pre-organized to electrostatically complement transition states. These enzymes demonstrate the power of Darwinian evolution in organizing molecular interactions of a folded macromolecule to complement those of a transition state and to manipulate energy landscapes of chemical transformations.

The specificity of a serine protease is not outsourced as in the translation machinery, but is localized in three-dimensional space, on a single polypeptide chain. Interactions that control specificity and catalytic efficiency are found within the catalytic cleft. A serine protease begins the process of cleaving a peptide bond by binding non-covalently to a peptide substrate. A protease can interact sequence specifically with its substrate, selecting some peptide bonds but not others for cleavage, based to a large extent on the “primary specificity pocket”. The protease uses the hydroxyl group of Serine 195 as a nucleophile, to attack an electron deficient C′ atom of the substrate (Fig. 8). The hydroxyl group is activated as a nucleophile by facile transfer of its proton to nearby Histidine 57, which is stabilized in the protonated state by Aspartate 102. The geometrically poised serine, histidine, and aspartic acid are known as the catalytic triad, a combination so powerful and useful that it has arisen repeatedly by convergent evolution (Ekici et al. 2008). The nucleophilic attack generates a tetrahedral intermediate then an acyl-enzyme intermediate. The oxyanion of the tetrahedral intermediate is stabilized by an ‘oxyanion hole’ on the enzyme. The oxyanion hole, like the catalytic triad, is a generally useful construct for stabilizing transition states.

Fig. 8

Sophisticated catalysis, including specific transition state stabilization by a serine protease. Serine 195 attacks the C′ at the scissile bond and simultaneously transfers a proton to Histidine 57. Aspartate 102 stabilizes the cationic form of Histidine 57, while two backbone NH groups stabilize the substrate oxyanion. These interactions decrease the activation energy, increasing the reaction rate. This figure was constructed in collaboration with Dr. James C. Powers

An important energetic feature distinguishes the PTC from a protease. The forward reaction for peptide bond formation (the ribosome reaction) uses activated amino acids because the net reaction is uphill. The reverse direction (the protease reaction) is downhill and does not use activated substrates.

Making Peptide Bonds

In contrast to a protease, the catalytic core of the ribosome lacks the characteristics of a highly specific modern enzyme. Within the PTC, a nucleophilic amino group of one substrate attacks an electron deficient C′ of another, ultimately linking them by a peptide bond (Leung et al. 2011). The PTC appears not to specifically stabilize the transition state as in a protease active site (Carrasco et al. 2011) and is instead a simple entropy trap (Schroeder and Wolfenden 2007; Sievers et al. 2004). It brings two substrates into close proximity. The ribosome accelerates the transferase reaction by reducing translational and rotational degrees of freedom of substrates and by modulating solvation entropy. In the PTC, there are no structural elements with analogy to the catalytic triad or the oxyanion hole. The PTC does not form an acyl-enzyme intermediate. The PTC lacks pre-organized cryptate-like networks of molecular interactions.
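
The entropy-trap description can be made concrete by splitting the activation free energy into enthalpic and entropic parts, ΔG‡ = ΔH‡ − TΔS‡. The sketch below uses that standard relation together with the transition-state-theory rate ratio. The activation parameters are purely hypothetical values chosen for illustration; they are not measurements from the cited studies, and the function name `rate_enhancement` is likewise an assumption. The point is only that a catalyst which leaves ΔH‡ unchanged can still accelerate a reaction by making ΔS‡ less unfavorable.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def rate_enhancement(dH_uncat, dS_uncat, dH_cat, dS_cat, T=298.15):
    """Return k_cat/k_uncat from activation parameters.

    Enthalpies are in J/mol and entropies in J/(mol*K). Uses
    dG = dH - T*dS and the ratio exp((dG_uncat - dG_cat) / (R*T)).
    """
    dG_uncat = dH_uncat - T * dS_uncat
    dG_cat = dH_cat - T * dS_cat
    return math.exp((dG_uncat - dG_cat) / (R * T))

# Hypothetical values chosen only for illustration.
# Entropy trap: same activation enthalpy, less unfavorable activation entropy.
entropic = rate_enhancement(dH_uncat=70e3, dS_uncat=-150.0,
                            dH_cat=70e3, dS_cat=-50.0)

# Enthalpic catalyst for comparison: lower activation enthalpy, same entropy.
enthalpic = rate_enhancement(dH_uncat=70e3, dS_uncat=-150.0,
                             dH_cat=40e3, dS_cat=-150.0)

print(f"entropic: {entropic:.2e}, enthalpic: {enthalpic:.2e}")
```

With these made-up numbers the two routes give rate enhancements of similar magnitude. A large acceleration therefore does not by itself reveal whether a catalyst works through specific transition state stabilization or through an entropic effect; distinguishing the two requires measuring the activation parameters separately.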

The PTC is a low-specificity enzyme that has maintained the ability to produce a wide variety of condensation products including peptides, esters, and thioesters (Fahnestock et al. 1970; Fahnestock and Rich 1971; Hartman et al. 2007; Kang and Suga 2008; Ohta et al. 2008; Subtelny et al. 2008; Tan et al. 2004; Victorova et al. 1976). Sidney Hecht reported ribosomal reaction products with altered connectivity, resulting from nucleophilic attack at other than the usual C′ atom (Roesser et al. 1986). He originated the hypothesis that the PTC speeds up reactions but does not perform standard catalytic functions (i.e., it does not specifically stabilize the transition state). Such properties may be emblematic of the greatest feat of ribosome evolution. The ribosome efficiently catalyzes peptide synthesis, while achieving a lack of specificity that allows tRNAs charged with amino acids of dramatically different sizes and polarities to participate in protein synthesis on an equal footing (Ledoux and Uhlenbeck 2008), and to produce peptides that range in chemical properties from positively charged, to negatively charged, to hydrophobic.

Accretion of Specificity and Regulation

With such a functionally docile catalytic region, how does translation achieve the levels of specificity and regulation required to carry out the genetic code, which overall are extremely stringent? Regulation and specificity are spatially distributed, accomplished by factors that are well separated in three-dimensional space, and in some cases in time, from the catalytic processes within the PTC. The specificity and regulation that characterize translation are achieved by aminoacyl-tRNA synthetases (Schimmel 2008), the decoding center of the SSU (Demeshkina et al. 2013), elongation factors (Dale and Uhlenbeck 2005), localized folding propensities of mRNAs (Dvir et al. 2013), folding of the nascent protein (Kaufman 2004), and many other factors and phenomena. Aminoacyl-tRNA synthetases are enzymes that enforce the genetic code by covalently attaching amino acids to their cognate tRNAs (tRNAs with the appropriate triplet anticodons). Elongation factors proofread the synthetase reactions. The small subunit uses tRNAs to interpret the mRNA and direct the appropriate tRNA-charged amino acid to the PTC.

It may appear paradoxical that one of the most important enzymes and most complex, highly integrated, and regulated assemblies in all of biology is, at its catalytic core, an unsophisticated enzyme. One could argue that this lack of specificity in the PTC, unique among extant enzymes, is sophisticated. The lack of specificity of the PTC itself, coupled with a broad spatial distribution of translational specificity functions, reflects both evolutionary history and requirements of modern translation. It seems that the PTC, the core of the ribosome, formed and froze before the biological invention of sophisticated enzymes. The ribosome grew in size, function, and complexity via accretion processes. The PTC retained the ability to perform non-specific catalysis and to link any of the canonical amino acids required for modern biology with near equal efficiency.

Molecular Widgets

The nature of PTC catalysis suggests a primitive origin, resulting from chemical evolution, with elaboration by an accretion process and increased specificity coupled to cooperation with other molecular entities (e.g., AARSs or their predecessors) that did not permit evolutionary remodeling of aboriginal structural elements. We propose that the ancestral PTC, lacking a support system for regulation and specificity, was a molecular ‘widget’ maker (Fox 2010; Hsiao et al. 2009; Petrov et al. 2014b), producing heterogeneous non-coded oligomers at differential rates determined by the availability of substrates and environmental factors. The heterogeneous oligomers produced by this non-specific entropy trap would have been racemates of peptides, esters (Rich 1962), thioesters, and other condensates. A subpopulation of these oligomers, possibly a very small subpopulation, bound to the PTC and conferred advantage by stabilizing the assembly. The exit tunnel was an early development, which was continuously improved and extended over early LSU evolution (Fox et al. 2012; Petrov et al. 2014b) to allow Brownian synthesis of increasingly longer oligomers.

The accretion process of ribosomal evolution has involved not only RNA, but also protein. The conformations and relative populations of ribosomal protein components near the LSU core have been frozen and preserved by accretion, and can be interpreted as molecular fossils of the oligomers that were selected from a pool of non-coded heterogeneous oligomers whose short length and chemical composition proscribed secondary structure (Hsiao et al. 2009), but which could assume structures that allowed them to bind to the early LSU.

The heterogeneous oligomers ‘fossilized’ over time into coded protein. Here by fossilize we mean a process by which the original heterogeneous oligomers were replaced incrementally by ever more homogeneous products of the PTC. The final culmination of this process was replacement by coded polypeptide containing the 20 extant homochiral amino acids. In this model, products of the PTC were incrementally less diverse over time. They were continuously selected by their abilities to stabilize the primitive ribosome. In this scenario, selection may have been on the level of protection from degradation. Proto-peptide that optimally associated with the PTC conferred greater stability and chemical productivity to the PTC, and selectivity for synthesis of peptides. By this process, as the LSU grew in size and sophistication, the original heterogeneous oligomers that bound to and stabilized the PTC were gradually converted into the non-canonical tails of ribosomal proteins that penetrate deep into the extant LSU core.

Regulatory and specificity factors, as they were acquired, accumulated at the periphery of the growing ribosome via the accretion process, which precluded remodeling of the ribosome but not fossilizing of associated polypeptides or their predecessors.

The Ribosome Challenge to the RNA World

Here, we discuss the RNA World Hypothesis in the context of the origins, evolution, mechanisms, and functions of the translation system. Conventional arguments against the RNA World Hypothesis have centered on (i) difficulties in formulating abiotic routes to RNA precursors, (ii) low frequency of catalysis in RNA sequence space, (iii) chemical instability of RNA polymers, and (iv) limited catalytic repertoire of RNA in vivo. These criticisms have been discussed (Bernhardt 2012) and will not be elaborated further here.

Our current understanding of the translation system presents challenges to multiple aspects of the RNA World Hypothesis. We do not argue that an RNA World is beyond the realm of possibility and should be discarded as a viable model. We do argue that some aspects of the RNA World Hypothesis are inconsistent with available data and are not parsimonious.

Making Exceptions

In the RNA World Hypothesis, most ribozymes were rendered redundant and went extinct. The entire metabolic system based on ribozymes was extinguished. All traces of ribozyme RNA polymerases, the defining catalytic systems of the RNA World, have been erased from the phylogenetic record. Yet the ribosome remains, permanent and universal.

If the RNA World Hypothesis is correct, then the Polymer Transition discriminated wildly in selecting some ribozymes for extinction while bypassing others. It has been said that the ribosome and RNase P (Mondragon 2013) are the only multiple turnover ribozymes that escaped extinction. If the RNA World Hypothesis is correct, then the ribosome was minted, then immediately and permanently immunized against extinction and evolutionary remodeling, while other catalytic ribozymes were incrementally phased out by superior protein-based analogs and ultimately erased from the phylogenetic record.

The determination that the ribosome is a ribozyme is commonly taken as support for the RNA World. In fact, that interpretation is subject to debate. The driver of the hypothetical Polymer Transition is the catalytic superiority of protein enzymes over RNA enzymes. Inexplicably, the ribosome was immune to the Polymer Transition. An alternative to the RNA World is that the PTC was first. There simply were not any sophisticated enzymes (ribozymes or other catalytic polymers) predating the emergence of the ribosome. In this scenario, the ancestral ribosome arose when building blocks for RNA, or proto-RNA, were provided by abiotic processes.

Temporal Disorder

A timeline for the RNA World is problematic when the ribosome is incorporated. In the RNA World Hypothesis, the PTC arose in a sea of sophisticated ribozymes, including ribozyme polymerases. However, the mechanism and structure of the PTC are primitive outliers in the universe of biological enzymes, signaling origins via chemical rather than biological evolution. The PTC is simply an entropy trap, falling in the primeval margins of the definition of an enzyme. The primitive nature of the ribosomal core is indicated by the comparison of the catalytic mechanism and regulation of the ribosome with those of modern protein enzymes.

The PTC appears to predate catalytic/allosteric biology. The PTC does not appear representative of the enzymatic power and sophistication required for ribozymes to maintain an energy transducing and self-replicating system of RNA polymers capable of Darwinian evolution. The nature of the primitive core of the ribosome, as we now know its structure and function, cannot be reconciled with its origins in the context of a functioning enzymatic milieu.

Changing the Toolset, Rebooting the System

During the putative Polymer Transition, which would have occurred in the context of Darwinian evolution, translation took root, gained utility, and assumed its current status of centrality, preeminence, and universality (Fig. 1). An entirely new type of biopolymer (coded polypeptide) was invented and took catalytic control. New biochemistries, including biosynthesis of amino acids, charging of tRNAs and synthesis of coded polypeptide, were introduced and became fundamental and essential. Biology’s information transduction and metabolic systems were entirely replaced.

However, we know that evolution does not work that way. Biopolymer backbones in extant biology are simply not subject to change by Darwinian evolutionary processes. Evolution improvises by altering sequences of pre-existing biopolymers. The diverse morphology of the eukaryotic kingdom, ranging from protists to whales, and the diverse metabolism of the microbial world, ranging from methanogens to sulfur oxidizers, are united by common biopolymer backbones, with differing sequences.

The Polymer Transition rebooted biochemistry, causing a profound and fundamental conversion of biology: the introduction of a new biopolymer backbone. To use our previous analogy to computers, during the Polymer Transition, a midstream change in the operating system of life was accomplished. Biology was reinitialized.

As stated by Francois Jacob, evolution is a “tinkerer… that does not produce novelties from scratch” (Jacob 1977). Evolution, as we know it, has not made radical transformations or performed fundamental rewiring as described for the Polymer Transition. We do not suggest that the Polymer Transition was impossible, just that it is not characterized by high parsimony.

Testing the Poole Hypothesis

Extant biology thus far does not provide strong empirical support for the Polymer Transition. Poole proposed that the Polymer Transition was an incremental process in which ribozymes were replaced by ribonucleoprotein enzymes, which were then replaced by protein-based enzymes (Jeffares et al. 1998; Poole et al. 1998). The continuity principle would require incremental changes from RNA to protein, while continuously preserving functionality.

One can look to extant biology to seek examples of Poole transitions. The mitochondrion, an α-proteobacterial endosymbiont within the eukaryotic cell, provides a promising candidate environment because it has been subject to intense evolutionary pressures leading to numerous gain and loss events. The mitochondrial translation system shows far greater diversity in AARSs, tRNAs, rRNAs, and genetic code than observed for nuclear-encoded translation systems (Watanabe 2010) or for non-endosymbiont microbes. In many organisms, mitochondrial tRNAs and rRNAs have been substantially whittled down over 1.5 billion years of mitochondrial evolution. In some organelles (mitosomes), the whittling reached a final conclusion: ribosomes have been lost altogether (Gray 2012), rendered obsolete in a compensating cytoplasmic environment. Omitted RNA elements in mitochondria provide model systems for studying the Polymer Transition. Deleted rRNA is indeed compensated for by protein additions on both structural and functional levels.

However, contrary to the predictions of the Poole proposal, the catalytic portions of RNAs are cleanly excluded from replacement processes (Amunts et al. 2014; Greber et al. 2014; Sharma et al. 2009). None of the rRNA in or immediately surrounding the PTC or decoding center has been replaced by protein in mitochondrial ribosomes. RNase P, a universally distributed ribozyme originating in LUCA, shows a pattern similar to mitochondrial ribosomes. The RNA-based catalytic domain of RNase P is universally conserved, while peripheral RNA has been replaced by protein in some organisms (Mondragon 2013). Fully proteinaceous RNase P has been found in mitochondria. Thus far, data suggest that these proteins represent a full replacement of one enzyme (a ribozyme) by a patchwork of pre-existing protein enzymes, rather than via an incremental change within a given enzymatic system (Holzmann et al. 2008). Nevertheless, RNase P appears to be an excellent candidate system for testing the Poole Hypothesis.

Thus far, to our knowledge, there are no reported explicit examples in which nature has incrementally converted a catalytic site from an inferior to a superior catalytic polymer. It will be useful to seek out other systems, and to look for verifiable examples of catalytic systems that have made the Poole transition.

Evolution Before the Darwinian Threshold

Some descriptions of the RNA World Hypothesis and other models for the earliest stages of life, e.g., Eigen’s influential Quasispecies model (Eigen 1993), include enzymes and processes drawn from extant biology (e.g., polymerases, genetics, Darwinian evolution and information) as part of attempts to bridge the gap between a ‘prebiotic soup’ of small molecules and a polymeric system capable of evolution. The early inclusion of enzymes similar to those found in life today must rely upon, in our opinion, improbable sequences of events. For example, a central element of the RNA World Hypothesis is a ribozyme that is able to copy itself by acting as a processive polymerase. It is important to keep in mind that it is merely a hypothesis that a self-replicating ribozyme was an essential feature of early life. In vitro evolution experiments designed to produce such a ribozyme have shown that its selection is extremely difficult (Attwater et al. 2013; Robertson and Joyce 2014; Shechner and Bartel 2011; Vaidya et al. 2012; Wochner et al. 2011), and the sequence required would likely be much longer than any RNA polymers that could have formed spontaneously on the prebiotic earth.

The notion that an RNA polymerase was among the earliest enzymes appears to be an extrapolation from extant life back to what is considered by some to be the minimal entity capable of Darwinian evolution. Dawkins used this extrapolation in his popular book The Selfish Gene (Dawkins 2006) in which he stated that there must have been an original ‘replicator.’ An alternative to evolution via a single catalytic polymer is the possibility that the first polymers of life (e.g., proto-RNA, proto-peptides) were selected by their intrinsic propensity to self-assemble (Cafferty et al. 2013; Hud et al. 2013). Chemical evolution could have been driven by non-enzymatic template-directed replication and functional selection by geophysical cycles (e.g., day–night, wet–dry, hot–cold, freeze–thaw). Although such a process has not yet been demonstrated experimentally, alternative monomers and reactions are being found that support the possibility that proto-RNA and proto-polypeptides could have formed in simple drying–heating reactions (Cafferty and Hud 2014; Chen et al. 2014). Additionally, theoretical studies have indicated the potential for functional evolution to take place within a pool of random informational polymers if there is a repeating cycle in which one phase promotes sequence-independent polymer replication, with an alternating phase of limited polymer hydrolysis and monomer recycling, all in an environment of low diffusivity (e.g., a viscous solvent) (Walker et al. 2012). These simulations show that it is at least possible for polymers with a favorable function, such as increasing the concentration of a nucleotide in short supply, to become established in a population of non-functional sequences that all have the same propensity for replication. Additionally, simulations of this model show that polymers with different functions can work together synergistically, over distance and time, to improve the overall fitness of the pool of polymers (Walker et al. 2012).
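The two-phase cycle described above can be illustrated with a toy simulation. The sketch below is not the model of Walker et al. (2012); it is a minimal illustration, under assumed parameters, of a pool of polymers that alternates between a sequence-independent copying phase and a limited hydrolysis phase in which monomers are recycled. The function name `run_cycles`, the two-letter alphabet, and all rate values are hypothetical choices made for illustration, and the sketch omits polymer function and diffusion entirely.

```python
import random
from collections import Counter


def run_cycles(n_cycles=50, seed=0, n_polymers=20, length=6,
               free_monomers=200, p_copy=0.5, p_hydrolyze=0.3, p_error=0.05):
    """Toy cycling of a polymer pool (illustrative only; all values are assumptions)."""
    rng = random.Random(seed)
    alphabet = "AB"  # hypothetical two-monomer alphabet

    # Start from random sequences and an even supply of free monomers.
    pool = ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(n_polymers)]
    free = Counter({"A": free_monomers // 2, "B": free_monomers // 2})
    history = []

    for _ in range(n_cycles):
        # Phase 1: template-directed copying. The chance of being copied
        # does not depend on the sequence, only on monomer availability.
        copies = []
        for template in pool:
            if rng.random() >= p_copy:
                continue
            copy = "".join(
                c if rng.random() >= p_error else rng.choice(alphabet)
                for c in template
            )
            need = Counter(copy)
            if all(free[m] >= k for m, k in need.items()):
                free.subtract(need)
                copies.append(copy)
        pool.extend(copies)

        # Phase 2: limited hydrolysis. Degraded polymers return their
        # monomers to the free pool, so total monomer count is conserved.
        survivors = []
        for polymer in pool:
            if rng.random() < p_hydrolyze:
                free.update(polymer)
            else:
                survivors.append(polymer)
        pool = survivors

        history.append((len(pool), sum(free.values())))

    return pool, free, history


if __name__ == "__main__":
    pool, free, history = run_cycles()
    print("polymers in final pool:", len(pool))
    print("free monomers:", dict(free))
```

Because every sequence is copied with the same probability, any change in the composition of the pool in this sketch reflects drift and monomer supply alone. Adding selection would require assigning a function to some sequences, for example letting certain polymers replenish a scarce monomer, which is the kind of effect the simulations cited above explore.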

This possibility of early cooperation between polymers of different constitutions and with different functions could have allowed for the simultaneous emergence of polymers with the activities necessary to start life. Based on evolutionary data, Woese (Woese 2002) concluded that cellular life was preceded by a time of ‘supramolecular aggregates’, which represented a time when functional polymers worked together to solve many of the chemical problems associated with the emergence of life. Woese called the transition between this time in chemical evolution and the start of cellular life The Darwinian Threshold. Woese was also convinced that translation (including the ribosome) is older than RNA transcription, which is likewise older than DNA replication. We see Woese’s chronology of early evolution as being consistent with the proposal that geophysical processes were responsible for biopolymer replication until the time that protein-based polymerases were synthesized by the ribosome.

Biopolymer Mutualism and Coevolution

What we know about the ribosome is most consistent with biopolymer mutualism. In this model, the ancestors of polypeptide fostered the chemical evolution of ancestors of polynucleotide and ancestors of polynucleotide fostered the chemical evolution of ancestors of polypeptide. RNA and protein coevolved, via chemical processes, from more primitive ancestors. In this model, RNA has always synthesized protein and protein has always synthesized RNA. The co-dependence of RNA and protein was built-in from the ground up, not ex post facto. Fundamentally new biopolymers were not introduced once the transition from chemical to biological evolution was complete. Although mutualism models do not invoke an RNA-based RNA polymerase, they retain the premise that polynucleotide was a central polymer of primitive biology and simply add polypeptide as an equal partner.

In contrast to the RNA World Hypothesis, in mutualism models there is no fundamental rewiring of biology. There are no wholesale extinctions of information transducing and metabolic systems. The empirical rules of evolution are not changed. Consistent with the principle of Ockham’s razor, the operating system of life was not rewired.

In mutualism models, the ribosome originated by chemical evolution and began catalytic function in a chemical rather than biological environment. Formation of the ancestral PTC was a seminal event in the origin of life. The evolution of life on earth is linear and monophasic (Fig. 9), not branched and biphasic as in the RNA World (Fig. 1). Arguments for a similar RNA/protein mutualism have been made by others (Carter et al. 2014; Li et al. 2013) based on the catalytic competence of peptide models of primitive AARSs.

Fig. 9

Timeline of RNA–protein Mutualism. In this model, the history of life on earth is monophasic. Chemical evolution merged smoothly and continuously with biological evolution. Chemical evolution produced the primitive ribosome (a proto-RNA ribozyme), which via chemical evolution produced the first crude RNA polymerase (a proto-protein enzyme). There are no abrupt departures, reinitializations, radical changes of course or wholesale extinctions

There is ample precedent for the concept of mutualism. On the organismal level, mutualism occurs when multiple species benefit from and depend on their association with each other. Figs (Ficus spp., Moraceae) and pollinating wasps (Agaonidae, Chalcidoidea) form an integrated pollination mutualism (Machado et al. 2005). Aphids (Uroleucon) and their obligate bacterial symbionts (Buchnera) form highly intimate mutualisms (Clark et al. 2000).

Mutualism is associated with coevolution. Organisms in mutual relationships change over time in coordinated and mutually beneficial ways and are often vitally interdependent. The formalisms of mutualism and coevolution are familiar and applicable at the molecular level as well. In the simplest example, pairs of nucleotides in rRNAs and tRNAs are seen to co-vary over phylogeny (Noller et al. 1981; Woese et al. 1980). The correct function of one base depends on another. The bases depend on each other for complementary pairing.

The Chicken and the Egg Reprise

A frequently cited dilemma in origin of life discussions is the chicken and the egg. A biological system that depends on a single polymer for both genetics and catalysis seemingly avoids what would be the impossible task of simultaneous whole cloth invention of two functional biopolymers, one encoding the other.

However, the chicken and egg problem in the context of RNA and protein is solvable in the origin of life, in just the way it was solved by the actual chicken and the actual egg. The incremental transition from proto-chicken to chicken paralleled the incremental transition from proto-egg to egg. The chicken and the egg came together, by microscopic incremental steps. Neither would be possible without the other. The question of which came first has no biological significance; neither the chicken nor the egg arrived alone, or suddenly, or from whole cloth.

We suggest that the macromolecules of life, both polynucleotide and polypeptide polymers, are the ultimate products of mutualism and chemical co-evolution. A prebiotic world containing proto-RNA precursors most assuredly contained an assortment of amino acids, peptides, oxyacids, esters, sugars, polysaccharides, and lipids (Callahan et al. 2011; Schmitt-Kopplin et al. 2010). Diverse molecules could associate and influence chemical evolution as a cooperative. By contrast, it seems unlikely, as noted by Cech, that the earliest beginnings of ancestral biochemistry had the wherewithal, or a driving force, to actively exclude all but RNA (Cech 2009), as in Gilbert’s original proposal for the RNA World (Gilbert 1986).

The translation system contains the most ancient macromolecular structures available to us for study. The structure, function, and evolution of the translation system are consistent with a monophasic model for the origin of life. The ribosome suggests that, just as current translation is the operating system of extant biology, the ancestor of the translation system was the operating system of ancestral biology. RNA and protein arrived together, by incremental processes of chemical evolution, just like the actual chicken and the actual egg arrived together by incremental processes of biological evolution.

The Evolution of Evolution: Chemistry to Biology

The first enzymes on the path to life, whatever their compositions and functions, were, by definition, produced by non-enzymatic, non-biological processes. We call the processes that produced the first enzymes ‘chemical evolution’. Chemical evolution initiated and proceeded in the absence of polymerases, heredity, genetic information, and Darwinian evolution. Polymerases, heredity, and genetics are the products of chemical evolution. Biological evolution is a product of chemical evolution.

The creative potential of chemical evolution cannot rival that of biological evolution. However, we argue that the initial steps of essentially any reasonable model of the origin of life are dependent to some degree on the creative potential of chemical evolution. In our view, it seems likely that chemical evolution converted smoothly and continuously to biological evolution, yielding in the process much of the molecular toolbox upon which extant biology is built.

We lack good enzyme-free, experimental models for chemical evolution, although simulations indicate that evolution is possible if abiotic reactions can be found that promote template-directed synthesis (Walker et al. 2012). Clearly, central questions related to ancient biology and the origin of life center on polymerases, replication, and genetics. Relevant questions are:

  • What is ‘evolution’ in the absence of replicative enzymes, heredity, and genetic information? What drives chemical evolution, what are the mechanisms, and what is the creative potential?

  • What chemical evolutionary process drove emergence of the ancestral PTC?

  • What evolutionary forces drove PTC-mediated production of catalytically competent peptide/protein enzymes capable of replicating proto-RNA?

  • What are the roles of ancestral replication, transcription and translation processes during the transition from chemical evolution to biological evolution?

In many variants of the RNA World Hypothesis, chemical evolution produced the initial replicase (the putative RNA-based RNA polymerase). We simply refocus that feature of the RNA World Hypothesis and suggest that chemical evolution produced the ancestor of the PTC and then, during a gradual transition from chemical to biological evolution, assisted in producing the first replicative enzyme, which we propose is a protein-based ancestor of nucleic acid polymerases. In this model RNA (or proto-RNA) in the ribosome produced protein (or proto-protein), which began producing RNA (or proto-RNA). We are not arguing against the importance of RNA in ancient biochemistry, but we are suggesting that other polymers were critical partners.

Summary

We do perceive certain inconsistencies between current RNA World models and our best information on and models of the origins, evolution, and function of the ribosome. We believe our community is accumulating, and should communicate, information that will allow refinement of broadly accepted models. This document is an attempt to initiate this process. It is clear that the translational system will increasingly provide a platform for hypothesis testing of predictions of origin of life models.